liupc commented on issue #23647: [SPARK-26712]Support multi directories for 
executor shuffle info recovery in yarn shuffle service
URL: https://github.com/apache/spark/pull/23647#issuecomment-458817205
 
 
   @vanzin 
   >  I'm hoping that the executor registration data can be somehow re-created
   
   My change does save the recovery data to a better directory (as explained 
in the note above) if a disk error happens, so Spark can recover from it.
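   
   To illustrate the general idea only (this is not the code in the PR): a 
minimal sketch of falling back to another directory when one fails a write 
probe. The class `RecoveryDirPicker` and the methods `pickRecoveryDir` and 
`isUsable` are hypothetical names, and the candidate list is assumed to come 
from whatever local directories the service is configured with.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Spark or YARN.
public class RecoveryDirPicker {

  // Returns the first candidate directory that passes a simple write probe,
  // or an empty Optional if none of them is usable.
  static Optional<Path> pickRecoveryDir(List<Path> candidates) {
    for (Path dir : candidates) {
      if (isUsable(dir)) {
        return Optional.of(dir);
      }
    }
    return Optional.empty();
  }

  // Assumption: "usable" means the directory can be created (if missing)
  // and a small temporary file can be written to it and deleted again.
  static boolean isUsable(Path dir) {
    try {
      Files.createDirectories(dir);
      Path probe = Files.createTempFile(dir, "probe", ".tmp");
      Files.delete(probe);
      return true;
    } catch (IOException | SecurityException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    Path good = Files.createTempDirectory("recovery");
    // A regular file stands in for a failed disk: it cannot act as a directory.
    Path bad = Files.createTempFile("broken", ".disk");
    // The broken entry is skipped and the healthy directory is chosen.
    System.out.println(pickRecoveryDir(List.of(bad, good)));
  }
}
```

   A real implementation would also need to decide what to do with recovery 
data already written to the failed directory; the sketch leaves that out.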
   
   > The shuffle service will need at least the app secret to allow the 
executors to connect. I'm wondering if after a restart, YARN actually calls the 
initializeApplication callback which would allow that data to be re-created. 
That's the bare minimum;
   
   Secret recovery is done by YarnShuffleService itself, so maybe we 
should also change the secret-recovery code.
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
