wankunde commented on PR #37922:
URL: https://github.com/apache/spark/pull/37922#issuecomment-1250635526

   
   > We should decouple current implementation details when making protocol 
changes, and make it extensible for future evolution.
   > 
   > In this case though, it is much more straightforward - there is an 
existing use case which requires the shuffle merge id. When retrying an 
indeterminate stage, we should clean up merged shuffle data for the previous 
stage attempt (in `submitMissingTasks`, before 
`unregisterAllMapAndMergeOutput`) - and given the potential race conditions 
there, we don't want `RemoveShuffleMerge` to clean up for the next attempt 
(when we add support for this).
   > 
   > This specific change can be done in a follow-up PR though - I want to get 
the basic mechanics working in this PR, and ensure the cleanup use case is 
handled - before looking at further enhancements.
   
   Since the push-based shuffle service automatically cleans up old shuffle 
merge data, we don't need to send a RemoveShuffleMerge RPC for a new 
ShuffleMerge, do we?
   The only scenario I can think of right now where a cleanup RPC is needed is 
when the Spark job completes. Can we think of other scenarios?
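
   To make the race discussed above concrete, here is a minimal sketch of a 
removal that is keyed on the shuffle merge id. It is an illustration only: the 
class `MergedShuffleRegistry` and its methods are hypothetical names, not 
Spark's actual shuffle service code, and the merged data is reduced to a single 
"latest merge id" entry per shuffle. It assumes that a newer merge id 
supersedes the older one on arrival, and that a removal request only takes 
effect when its merge id matches the one currently held.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the Spark code base.
final class MergedShuffleRegistry {
    // shuffleId -> latest shuffleMergeId for which merged data is held.
    private final Map<Integer, Integer> latestMergeId = new ConcurrentHashMap<>();

    // Records a merge for (shuffleId, shuffleMergeId). A newer merge id
    // supersedes an older one, which models the automatic cleanup of old data.
    // Returns false when the given merge id is stale (older than the one held).
    boolean registerMerge(int shuffleId, int shuffleMergeId) {
        int latest = latestMergeId.merge(shuffleId, shuffleMergeId, Integer::max);
        return latest == shuffleMergeId;
    }

    // Removes merged data only if the request targets the merge id currently
    // held. remove(key, value) is atomic, so a late request for a previous
    // attempt cannot delete data that belongs to the next attempt.
    boolean removeShuffleMerge(int shuffleId, int shuffleMergeId) {
        return latestMergeId.remove(shuffleId, shuffleMergeId);
    }

    // True while merged data is held for the shuffle.
    boolean isHeld(int shuffleId) {
        return latestMergeId.containsKey(shuffleId);
    }

    public static void main(String[] args) {
        MergedShuffleRegistry registry = new MergedShuffleRegistry();
        registry.registerMerge(7, 0);   // first stage attempt
        registry.registerMerge(7, 1);   // retry of the indeterminate stage
        // A delayed removal for the first attempt arrives after the retry started.
        System.out.println("stale removal applied: " + registry.removeShuffleMerge(7, 0));
        System.out.println("data still held: " + registry.isHeld(7));
    }
}
```

   Under these assumptions, a removal request without a merge id could not 
tell the two attempts apart, which is the case the quoted comment is concerned 
about.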


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


