[
https://issues.apache.org/jira/browse/SPARK-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Saisai Shao closed SPARK-8424.
------------------------------
Resolution: Duplicate
> Add blacklist mechanism for task scheduler and Yarn container allocation
> ------------------------------------------------------------------------
>
> Key: SPARK-8424
> URL: https://issues.apache.org/jira/browse/SPARK-8424
> Project: Spark
> Issue Type: New Feature
> Components: Scheduler, YARN
> Affects Versions: 1.4.0
> Reporter: Saisai Shao
>
> MapReduce previously provided a blacklist and a graylist to exclude
> TaskTrackers/nodes that fail constantly. This is important for a large
> cluster, where it alleviates the increasing chance of hardware and software
> failures. Unfortunately, the current version of Spark lacks a mechanism to
> blacklist constantly failing executors/nodes. The only blacklist mechanism
> in Spark avoids relaunching a task on the same executor if that task
> previously failed on that executor within a specified time. This issue
> therefore proposes a new blacklist mechanism for Spark, divided into two
> sub-tasks:
> 1. Add a heuristic blacklist algorithm that tracks the status of executors
> based on the status of finished tasks, and enable the blacklist mechanism in
> task scheduling.
> 2. Enable the blacklist mechanism in YARN container allocation (avoid
> allocating containers on blacklisted hosts).
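
The two sub-tasks above could be sketched roughly as follows. This is only an illustration in Python, not Spark's scheduler code: the class name ExecutorBlacklistTracker, its methods, and the max_failures and expiry_seconds parameters are all hypothetical, and the heuristic (a count of recent task failures per executor) is just one assumed choice, since the issue does not specify the algorithm.

```python
class ExecutorBlacklistTracker:
    """Hypothetical tracker: an executor is blacklisted once it has
    accumulated max_failures task failures within expiry_seconds."""

    def __init__(self, max_failures=3, expiry_seconds=60.0):
        self.max_failures = max_failures
        self.expiry_seconds = expiry_seconds
        self._failure_times = {}  # executor_id -> list of failure timestamps
        self._hosts = {}          # executor_id -> host name

    def record_task_end(self, executor_id, host, succeeded, now):
        # Sub-task 1: track executor status from the status of finished tasks.
        self._hosts[executor_id] = host
        if not succeeded:
            self._failure_times.setdefault(executor_id, []).append(now)

    def is_executor_blacklisted(self, executor_id, now):
        # Only failures newer than the expiry window count, so an executor
        # leaves the blacklist once its failures age out.
        cutoff = now - self.expiry_seconds
        recent = [t for t in self._failure_times.get(executor_id, []) if t > cutoff]
        return len(recent) >= self.max_failures

    def blacklisted_hosts(self, now):
        # Sub-task 2: hosts a container allocator could be asked to avoid.
        return sorted({
            self._hosts[e]
            for e in self._failure_times
            if self.is_executor_blacklisted(e, now)
        })


tracker = ExecutorBlacklistTracker(max_failures=2, expiry_seconds=60.0)
tracker.record_task_end("exec-1", "host-a", succeeded=False, now=0.0)
tracker.record_task_end("exec-1", "host-a", succeeded=False, now=10.0)
tracker.record_task_end("exec-2", "host-b", succeeded=True, now=10.0)

print(tracker.is_executor_blacklisted("exec-1", now=20.0))  # True
print(tracker.blacklisted_hosts(now=20.0))                  # ['host-a']
print(tracker.is_executor_blacklisted("exec-1", now=65.0))  # False
```

In this sketch a host is treated as blacklisted as soon as any executor on it is; a real implementation would probably want a separate per-host threshold.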