[ 
https://issues.apache.org/jira/browse/SPARK-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao closed SPARK-8424.
------------------------------
    Resolution: Duplicate

> Add blacklist mechanism for task scheduler and Yarn container allocation
> ------------------------------------------------------------------------
>
>                 Key: SPARK-8424
>                 URL: https://issues.apache.org/jira/browse/SPARK-8424
>             Project: Spark
>          Issue Type: New Feature
>          Components: Scheduler, YARN
>    Affects Versions: 1.4.0
>            Reporter: Saisai Shao
>
> MapReduce previously had a blacklist and a graylist to exclude TaskTrackers/nodes 
> that fail constantly. This is important in a large cluster, where the chance of 
> hardware and software failure increases. 
> Unfortunately, the current version of Spark lacks such a mechanism to blacklist 
> constantly failing executors/nodes. The only blacklist mechanism in Spark 
> avoids relaunching a task on the same executor if that task has 
> previously failed on that executor within a specified time. I therefore propose a 
> new feature that adds a blacklist mechanism to Spark. The proposal is divided 
> into two sub-tasks:
> 1. Add a heuristic blacklist algorithm that tracks the status of executors based on 
> the status of finished tasks, and enable the blacklist mechanism in task 
> scheduling.
> 2. Enable the blacklist mechanism in YARN container allocation (avoid allocating 
> containers on blacklisted hosts).
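
The two sub-tasks above could be sketched roughly as follows. This is only an illustration of one possible heuristic, not the design proposed in the issue and not Spark's API: the class `ExecutorBlacklist`, its method names, and the `max_failures` / `window_s` parameters are all hypothetical. It assumes an executor is blacklisted once it accumulates a threshold of task failures within a sliding time window, and that a host is excluded from container allocation whenever any executor on it is blacklisted (a deliberate simplification).

```python
import time
from collections import defaultdict, deque


class ExecutorBlacklist:
    """Hypothetical tracker: blacklist executors with too many recent task failures."""

    def __init__(self, max_failures=3, window_s=60.0, clock=time.monotonic):
        self.max_failures = max_failures  # assumed threshold, not a Spark setting
        self.window_s = window_s          # assumed sliding-window length in seconds
        self.clock = clock                # injectable clock, useful for testing
        self._failures = defaultdict(deque)  # executor id -> failure timestamps
        self._host_of = {}                   # executor id -> host name

    def record_task_end(self, executor_id, host, succeeded):
        """Update executor status from the outcome of a finished task."""
        self._host_of[executor_id] = host
        if not succeeded:
            self._failures[executor_id].append(self.clock())

    def _recent_failures(self, executor_id):
        """Count failures inside the window, dropping expired timestamps."""
        failures = self._failures.get(executor_id)
        if not failures:
            return 0
        cutoff = self.clock() - self.window_s
        while failures and failures[0] < cutoff:
            failures.popleft()
        return len(failures)

    def is_blacklisted(self, executor_id):
        """Sub-task 1: the task scheduler would skip blacklisted executors."""
        return self._recent_failures(executor_id) >= self.max_failures

    def blacklisted_hosts(self):
        """Sub-task 2: hosts to avoid when requesting new containers."""
        return {
            self._host_of[executor_id]
            for executor_id in list(self._failures)
            if self.is_blacklisted(executor_id)
        }
```

Because failure timestamps expire after `window_s`, an executor leaves the blacklist on its own once it stops failing, which loosely mirrors the time-limited behaviour the description attributes to the existing per-task mechanism.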



