mridulm commented on PR #3569:
URL: https://github.com/apache/celeborn/pull/3569#issuecomment-3699790866

   @CodingCat while I am sympathetic to the intent behind the change, this is 
not the right way to address it. Apache Spark has a reasonably robust 
ability to recompute lost data, but that exists primarily for fault tolerance, 
and it is being misused here.
   The rationale used in this PR applies to vanilla shuffle in Spark as well, 
and the analysis would be the same: it is unsound and violates how Spark 
currently expects shuffle to behave, which is why Spark relies on GC to clean 
up shuffle. Diverging nontrivially from Spark in Apache Celeborn will cause 
maintenance issues and hurt the ability to evolve both projects.
   
   Having said that, I understand the pain point - I am open to proposals to 
evolve this in Apache Spark (which can then be leveraged in Celeborn).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
