CodingCat commented on PR #3569:
URL: https://github.com/apache/celeborn/pull/3569#issuecomment-3699970392

   > @CodingCat , I have not thought through what needs to be done to address 
it in Apache Spark - if there is a concrete proposal, I can help review and 
evolve it. My suggestion would be to address it at the right layer.
   > 
   > This appears to be a recurring issue in Spark, and has come up in the past 
as well.
   > 
   > Having said that, while I was trying to be constructive in making progress 
here, I have already given my comments and can't keep revisiting them - as 
currently formulated, I am not in favor of the (fairly nontrivial) change.
   > 
   > If there are additional details, use cases, and/or refinements that help, 
I am happy to take a look and revisit my position.
   
   I think that's the key difference of opinion here: I don't see Spark as 
the right layer to address this issue.
   
   One of the major reasons is that a Spark-layer approach cannot be extended 
to a more advanced version of this PR, partition-level early deletion, given 
the vanilla Spark shuffle storage format (well, you can still work it out, but 
that will touch every piece of shuffle-related code).
   
   To get a more cost-efficient solution for shuffle storage via early shuffle 
deletion, no matter which layer you build it on, you always need to trade off 
happy-path storage cost against bad-path compute cost. And with the storage 
layout that remote shuffle systems provide, you can significantly improve the 
happy-path gains.
   
   In summary, building on the Spark layer brings the same cost, if not a 
higher one. We already have a solution and can extend it to an even better one 
in remote shuffle systems, so why not?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

