tgravescs commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle 
files should be able to handle disk failures
URL: https://github.com/apache/spark/pull/25962#issuecomment-547664664
 
 
   Yes, it depends on how often your executors are created and destroyed. If 
you are using dynamic allocation and have a lot of long tail, it could be 
cycling executors fairly often and the YARN disk checker should help; if not, 
it won't. For lots of jobs it won't help by itself.
   
   I mostly wonder how much this helps, because it might help with the temp 
shuffle file but then fail later when creating the real shuffle file. Is this 
ok? Maybe, but it's potentially changing from failing fast to failing later. 
If there is a long time between those two points, then you are potentially 
taking longer to hit the failure. There are a lot of mights in that paragraph 
and I don't have any concrete idea how much it will help or hurt.
   Has this actually been run on real jobs, and have you seen a benefit?

