waitinfuture commented on PR #2373:
URL: https://github.com/apache/celeborn/pull/2373#issuecomment-2041062688

   > Ah, I see what you mean ... `PartitionLocation` would change between 
retries. Yeah, this is a problem then - it will cause data loss. This would be 
a variant of SPARK-23207
   > 
   > I will need to relook at the PR, and how it interacts with Celeborn - but 
if scenarios directly described in SPARK-23207 (or variants of it) are 
applicable (and we can't mitigate it), we should not proceed down this path 
given the correctness implications unfortunately.
   
   Maybe we can keep both this optimization and stage rerun, but allow only 
one of them to take effect at a time by checking configs for now. The 
performance issue this PR solves does happen in production.


