vanzin commented on issue #23647: [SPARK-26712] Support multi directories for executor shuffle info recovery in yarn shuffle service
URL: https://github.com/apache/spark/pull/23647#issuecomment-458681837

> I think if this problem can be resolved by current implementation

I think the current implementation could be enhanced, but I'd prefer a simpler approach. If you just change the current implementation to not save recovery data, what data is lost, and how does Spark recover from it, if at all?

The shuffle service will need at least the app secret to allow the executors to connect. I'm wondering whether, after a restart, YARN actually calls the `initializeApplication` callback, which would allow that data to be re-created. That's the bare minimum; I'm hoping that the executor registration data can somehow be re-created too, but I haven't really looked into that.
