Thomas Graves created SPARK-16630:
-------------------------------------
Summary: Blacklist a node if executors won't launch on it.
Key: SPARK-16630
URL: https://issues.apache.org/jira/browse/SPARK-16630
Project: Spark
Issue Type: Improvement
Components: YARN
Affects Versions: 1.6.2
Reporter: Thomas Graves
On YARN, it's possible that a node is messed up or misconfigured such that a
container won't launch on it, for instance if the Spark external shuffle
handler didn't get loaded on it, or maybe it's just some other transient error.
It would be nice if we could recognize this happening and stop trying to launch
executors on that node, since repeated launch failures there could cause us to
hit our max number of executor failures and then kill the job.
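
To make the idea concrete, here is a rough sketch (in Python, for brevity) of
per-node failure tracking. It is not Spark code: the class name, method names,
and the threshold of 2 are all hypothetical, chosen only to illustrate counting
launch failures per host and skipping hosts that exceed a limit.

```python
from collections import defaultdict


class NodeLaunchTracker:
    """Hypothetical helper: counts executor launch failures per host."""

    def __init__(self, max_failures_per_node=2):
        # Assumed, illustrative threshold; not an existing Spark setting.
        self.max_failures_per_node = max_failures_per_node
        self._failures = defaultdict(int)

    def record_launch_failure(self, host):
        # Called whenever a container fails to launch on `host`.
        self._failures[host] += 1

    def is_blacklisted(self, host):
        # A host is excluded once it reaches the failure threshold.
        return self._failures.get(host, 0) >= self.max_failures_per_node

    def usable_hosts(self, hosts):
        # Keep only the hosts we are still willing to request containers on.
        return [h for h in hosts if not self.is_blacklisted(h)]
```

With something like this, failures on a blacklisted node could also be left out
of the global executor failure count, so one bad node would not by itself kill
the job. Whether and when a node should be taken off the blacklist again (for
the transient error case) is left open here.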