Github user juanrh commented on the issue:
https://github.com/apache/spark/pull/19267
Hi @tgravescs, thanks again for your feedback. Regarding concrete use
cases, this change could be used to extend the graceful decommission
mechanism that has been available in AWS EMR for a while. That mechanism was
originally designed for MapReduce, with a 1-to-1 correspondence between tasks
and YARN containers, so it has to be adapted to Spark, where an executor
running in a single YARN container can run many Spark tasks. This could be
useful for reacting to cluster resizes or Spot instance terminations that
happen in the middle of a Spark job execution. As described in the attached
document, this PR sets up the base framework for graceful decommission in
Spark and performs the first action in reaction to node decommissioning;
additional actions could be added in future PRs to react to other transitions,
for example unregistering shuffle blocks when a node transitions to the
DECOMMISSIONED state.
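To make the framework idea concrete, here is a minimal hypothetical sketch of how the driver side might dispatch one action per node state transition. All names here (`NodeState`, `DecommissionTracker`, the per-host sets) are illustrative assumptions, not the actual classes introduced by this PR:

```scala
// Illustrative only: NodeState and DecommissionTracker are hypothetical
// names, not the identifiers used in the PR.
object NodeState extends Enumeration {
  val Running, Decommissioning, Decommissioned = Value
}

class DecommissionTracker {
  // last known state per host
  private var states =
    Map.empty[String, NodeState.Value].withDefaultValue(NodeState.Running)
  // hosts no longer offered to the task scheduler (the "first action")
  var excluded = Set.empty[String]
  // hosts whose shuffle blocks were unregistered (a possible future action)
  var shuffleCleaned = Set.empty[String]

  def onTransition(host: String, next: NodeState.Value): Unit = {
    states += host -> next
    next match {
      case NodeState.Decommissioning =>
        // stop scheduling new tasks on the draining node
        excluded += host
      case NodeState.Decommissioned =>
        // drop its shuffle output so reducers trigger recomputation
        // instead of failing fetches against a dead node
        shuffleCleaned += host
      case _ => // no action for other states
    }
  }
}
```

The point of the sketch is that each transition maps to an independent action, so later PRs can add cases without touching the existing ones.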
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]