hiboyang edited a comment on pull request #31715: URL: https://github.com/apache/spark/pull/31715#issuecomment-790013265
> I do not think this flag (indicating whether to unregister all the shuffle blocks created by the failed executor) must be set at the application level (the way this PR does).
>
> I think this must be a property kept for the replicated shuffle block, so somewhere close to `MapStatus`.
>
> Imagine a mixed solution where the disaggregated storage is just a fallback and locally stored blocks can still be used.
> Then it makes sense to unregister the shuffle blocks for the failed executors and leave the disaggregated storage access for the blocks intact.

Yeah, if a user runs a mixed solution where the disaggregated storage is just a fallback, it is better to have `MapStatus` track the availability of shuffle blocks. This PR does not block that scenario.

There is also the non-mixed solution, where people rely entirely on a remote shuffle service (like Facebook's or Uber's). In that case, people only need to set this flag at the application level. This PR helps that scenario.
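To make the two scenarios concrete, here is a minimal sketch of the decision taken when an executor is lost. It is not Spark code: `MapOutput`, `on_remote_storage`, `outputs_to_unregister`, and `app_relies_on_remote_shuffle` are hypothetical names invented for illustration, and the real flag and `MapStatus` fields may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapOutput:
    """Hypothetical stand-in for a shuffle map output record (not Spark's MapStatus)."""
    map_id: int
    executor_id: str
    on_remote_storage: bool  # assumption: per-output availability marker


def outputs_to_unregister(outputs, failed_executor, app_relies_on_remote_shuffle=False):
    """Return the map outputs that should be unregistered when an executor is lost."""
    if app_relies_on_remote_shuffle:
        # Non-mixed case: every block lives in the remote shuffle service,
        # so losing the executor loses no shuffle data.
        return []
    # Mixed case: unregister only outputs that existed solely on the failed
    # executor; outputs also held in disaggregated storage stay registered.
    return [
        o for o in outputs
        if o.executor_id == failed_executor and not o.on_remote_storage
    ]
```

With the application-level switch on, nothing is unregistered. With it off, the per-output marker decides, which is roughly the behavior the quoted comment asks to keep close to `MapStatus`.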
