RexXiong commented on PR #3125:
URL: https://github.com/apache/celeborn/pull/3125#issuecomment-2693369740

   > will the spark app fallback to reading replicate shuffle data
   
   We cannot fall back to the replica because some sub-reducer tasks may have 
already successfully read data from the primary copy. If a task that encounters 
an error falls back to the replica, it may read duplicate data, because the 
primary and the replica can hold the data in a different order. In this 
scenario, triggering a stage rerun would be better.
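
   A minimal sketch of the problem, assuming sub-reducers read a split 
partition by position ranges. The batch IDs and the `read_range` helper are 
hypothetical and only illustrate the ordering issue; they are not Celeborn APIs.

```python
from collections import Counter

# Hypothetical batch IDs for one partition: both copies hold the same
# batches, but in a different order.
primary = ["b0", "b1", "b2", "b3"]
replica = ["b2", "b0", "b3", "b1"]


def read_range(copy, start, end):
    """Return the batches a sub-reducer sees for positions [start, end)."""
    return copy[start:end]


# Sub-reducer A has already finished reading its range from the primary.
task_a = read_range(primary, 0, 2)
# Sub-reducer B hits an error and falls back to the replica for its range.
task_b = read_range(replica, 2, 4)

counts = Counter(task_a + task_b)
duplicated = sorted(b for b, n in counts.items() if n > 1)
missing = sorted(set(primary) - set(counts))

print("duplicated:", duplicated)
print("missing:", missing)
```

   In this toy layout the mixed read sees one batch twice and never sees 
another, whereas rerunning the stage lets every sub-reducer read from a single 
consistent copy.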
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
