squito commented on issue #23647: [SPARK-26712]Support multi directories for executor shuffle info recovery in yarn shuffle serivce
URL: https://github.com/apache/spark/pull/23647#issuecomment-458655480

>> If you have a bad disk, you're definitely losing some shuffle data. Furthermore, any other shuffleMapStages would need to know to not write their output to the bad disk also.

> This blacklist is introduced in another PR #23614, it will solve the shuffle write issues.

OK I see, I've reviewed that PR now. But at best, that still doesn't completely handle the problem, as any existing shuffle data written to the bad disks is gone (and as I noted on that PR, it's somewhat complicated to make sure that the ExternalShuffleService and the executor keep a consistent view of good dirs).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
