GitHub user juanrh opened a pull request:
https://github.com/apache/spark/pull/19267
[WIP][SPARK-20628][CORE] Blacklist nodes when they transition to
DECOMMISSIONING state in YARN
## What changes were proposed in this pull request?
Dynamic cluster configurations where cluster nodes are added and removed
frequently are common in public cloud environments, so this has become a
[problem for Spark
users](https://www.trulia.com/blog/tech/aws-emr-ad-hoc-spark-development-environme...
). To cope with this, we propose a mechanism along the lines of
YARN's support for [graceful node
decommission](https://issues.apache.org/jira/browse/YARN-914) or [Mesos
maintenance
primitives](https://mesos.apache.org/documentation/latest/maintenance/). These
changes allow cluster nodes to be transitioned to a "decommissioning"
state, at which point no more tasks will be scheduled on the executors running
on those nodes. After a configurable drain time, nodes in the
"decommissioning" state transition to a "decommissioned" state,
where their shuffle blocks are no longer available. Shuffle blocks stored on
nodes in the "decommissioning" state remain available to other executors. By
preventing more tasks from running on nodes in the "decommissioning" state we
avoid creating more shuffle blocks on those nodes, since those blocks won't be
available once the nodes eventually transition to the "decommissioned" state.
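As a rough illustration of the lifecycle described above, here is a minimal sketch (not code from this patch; the names `NodeState`, `canScheduleTasks` and `shuffleBlocksReadable` are made up for illustration):

```scala
// Hypothetical model of the node lifecycle described in this proposal.
sealed trait NodeState
case object Running         extends NodeState
case object Decommissioning extends NodeState // draining: no new tasks scheduled
case object Decommissioned  extends NodeState // drained: shuffle blocks are gone

object NodeLifecycle {
  // New tasks may only be scheduled on fully running nodes.
  def canScheduleTasks(state: NodeState): Boolean = state == Running

  // Shuffle blocks stay readable until the node is fully decommissioned.
  def shuffleBlocksReadable(state: NodeState): Boolean = state != Decommissioned

  def main(args: Array[String]): Unit = {
    Seq(Running, Decommissioning, Decommissioned).foreach { s =>
      println(s"$s: schedule=${canScheduleTasks(s)} shuffle=${shuffleBlocksReadable(s)}")
    }
  }
}
```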
We have implemented a first version of this proposal for YARN, using
Spark's [blacklisting mechanism for task
scheduling](https://issues.apache.org/jira/browse/SPARK-8425), available at
the node level since Spark 2.2.0, to ensure tasks are not scheduled on nodes
in the "decommissioning" state. With this solution it is the cluster
manager, not the Spark application, that tracks the status of the node and
handles the transition from "decommissioning" to "decommissioned". The
Spark driver simply reacts to the node state transitions.
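A hedged sketch of that driver-side reaction, reusing the `NodeState` values from the sketch above; `NodeStateUpdate` and `HostBlacklist` are illustrative stand-ins, not the actual classes touched by this PR:

```scala
import scala.collection.mutable

// Hypothetical message the cluster manager integration could send the driver.
case class NodeStateUpdate(host: String, state: NodeState)

// Illustrative tracker: blacklist hosts that are draining, free them if they recover.
class HostBlacklist {
  private val blacklisted = mutable.Set.empty[String]

  def onNodeStateUpdate(update: NodeStateUpdate): Unit = update.state match {
    case Decommissioning | Decommissioned => blacklisted += update.host
    case Running                          => blacklisted -= update.host
  }

  def isBlacklisted(host: String): Boolean = blacklisted.contains(host)
}
```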
## How was this patch tested?
All functionality has been tested with unit tests, with an integration test
based on BaseYarnClusterSuite, and with manual testing on a cluster.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/juanrh/spark SPARK-20628-yarn-decommissioning-blacklisting
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19267.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19267
----
commit 35285871a40fbf0903784dfb8dc16bd4e37062fe
Author: Juan Rodriguez Hortala <[email protected]>
Date: 2017-08-23T16:58:24Z
Send Host status update signals to YarnSchedulerBackend, on Yarn node state
changes
commit 42891dafcfe7d25026719b6025c1272e7fe2d947
Author: Juan Rodriguez Hortala <[email protected]>
Date: 2017-08-23T18:49:04Z
Add mechanism to Blacklist/Unblacklist nodes based on Node status changes
in Cluster Manager
commit 0c840c74a85012a4d91e349ae4830455ef3d680b
Author: Juan Rodriguez Hortala <[email protected]>
Date: 2017-08-23T19:56:31Z
Add configuration to independently enable/disable task execution
blacklisting and decommissioning blacklisting
commit f9fdfb01ac2486cc268b13568d129f926b3b8ab2
Author: Juan Rodriguez Hortala <[email protected]>
Date: 2017-08-23T20:29:49Z
Integration test for Yarn node decommissioning
----