Karthik Palaniappan created SPARK-19941:
-------------------------------------------
Summary: Spark should not schedule tasks on executors on
decommissioning YARN nodes
Key: SPARK-19941
URL: https://issues.apache.org/jira/browse/SPARK-19941
Project: Spark
Issue Type: Bug
Components: Scheduler, YARN
Affects Versions: 2.2.0
Environment: Hadoop 2.8.0-rc1
Reporter: Karthik Palaniappan
Hadoop 2.8 added a mechanism to gracefully decommission NodeManagers in YARN:
https://issues.apache.org/jira/browse/YARN-914
Essentially, you can mark nodes for decommissioning and let them a) finish
work in progress and b) finish serving shuffle data, while scheduling no new
work on them.
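For context, the new DECOMMISSIONING node state is visible through the YARN
client API, so the set of draining nodes is easy to discover. A minimal
standalone sketch (illustrative only, not Spark code; the class name is made
up):
{code:scala}
import scala.collection.JavaConverters._

import org.apache.hadoop.yarn.api.records.NodeState
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

// Lists the nodes the ResourceManager currently reports as
// DECOMMISSIONING (NodeState.DECOMMISSIONING is new in Hadoop 2.8).
object ListDecommissioningNodes {
  def main(args: Array[String]): Unit = {
    val client = YarnClient.createYarnClient()
    client.init(new YarnConfiguration())
    client.start()
    try {
      client.getNodeReports(NodeState.DECOMMISSIONING).asScala
        .foreach(r => println(s"${r.getNodeId.getHost}: ${r.getNodeState}"))
    } finally {
      client.stop()
    }
  }
}
{code}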
Spark should respect it when NodeManagers enter the DECOMMISSIONING state,
and likewise decommission the executors on those nodes by not scheduling any
further tasks on them.
It looks like YARN may eventually inform the app master when containers will
be killed: https://issues.apache.org/jira/browse/YARN-3784. However, I don't
think Spark should schedule based on a timeout. We should gracefully
decommission the executor as quickly as possible, which is the spirit of
YARN-914. The app master can query the RM for NM statuses (if it doesn't
already have them) and stop scheduling on executors on NMs that are
decommissioning.
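As a rough sketch of what this could look like in Spark's YARN allocator,
assuming we reuse the updated-node reports the RM already piggybacks on each
allocate() heartbeat (excludeHostsFromScheduling below is a hypothetical hook
into the task scheduler, e.g. via the existing blacklist mechanism):
{code:scala}
import scala.collection.JavaConverters._

import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse
import org.apache.hadoop.yarn.api.records.NodeState

object DecommissionAwareAllocation {
  // Hypothetical hook into the task scheduler: stop launching tasks on
  // executors on these hosts, but leave the executors (and their shuffle
  // data) alone so in-flight work can drain.
  def excludeHostsFromScheduling(hosts: Set[String]): Unit =
    hosts.foreach(h => println(s"Not scheduling new tasks on executors on $h"))

  // Called after each AM -> RM allocate() heartbeat. The RM reports node
  // state changes in the response, so no extra RM round trip is needed.
  def handleUpdatedNodes(response: AllocateResponse): Unit = {
    val decommissioning = response.getUpdatedNodes.asScala
      .filter(_.getNodeState == NodeState.DECOMMISSIONING)
      .map(_.getNodeId.getHost)
      .toSet
    if (decommissioning.nonEmpty) {
      excludeHostsFromScheduling(decommissioning)
    }
  }
}
{code}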
Stretch feature: the timeout may be useful for deciding whether running
further tasks on the executor is even worthwhile. Spark may be able to tell
that shuffle data on the node will not be consumed before the node is
decommissioned, in which case it is not worth computing, and the executor can
be killed immediately.
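A sketch of that decision, assuming we had a decommission deadline (e.g.
from a YARN-3784-style notice) and an estimate of when downstream stages
would finish fetching this executor's shuffle output (both inputs are
hypothetical):
{code:scala}
object DecommissionValue {
  // If the node will be gone before downstream stages can fetch this
  // executor's shuffle output, running more tasks on it (and keeping it
  // alive to serve shuffle data) is wasted work.
  def shouldKillExecutorNow(
      decommissionDeadlineMs: Long,   // hypothetical: from a YARN-3784-style kill notice
      estShuffleConsumedByMs: Long    // hypothetical: estimated from stage/DAG progress
  ): Boolean =
    estShuffleConsumedByMs > decommissionDeadlineMs
}
{code}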