[ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Chansler updated HADOOP-4305:
------------------------------------
Release Note: Improved TaskTracker blacklisting strategy to better exclude
faulty trackers from executing tasks. (was: Improves the blacklisting strategy
so that blacklisted TaskTrackers are not given tasks to run from other jobs,
subject to the following conditions (all must be met):
1) The TaskTracker has been blacklisted by at least 4 jobs (configurable via
mapred.max.tasktracker.blacklists).
2) The TaskTracker has been blacklisted 50% more times than the average.
3) Less than 50% of the trackers in the cluster are blacklisted.
Once every 24 hours, a TaskTracker blacklisted for all jobs is given a chance.
Restarting the TaskTracker removes it from the blacklist.)
Edit release note for publication.
Improves the blacklisting strategy so that blacklisted TaskTrackers are not
given tasks to run from other jobs, subject to the following conditions
(all must be met):
1) The TaskTracker has been blacklisted by at least 4 jobs (configurable via
mapred.max.tasktracker.blacklists).
2) The TaskTracker has been blacklisted 50% more times than the average.
3) Less than 50% of the trackers in the cluster are blacklisted.
Once every 24 hours, a TaskTracker blacklisted for all jobs is given a chance.
Restarting the TaskTracker removes it from the blacklist.
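
The three conditions in the release note can be read as a single predicate.
The following is only a rough Python sketch of that reading, not the code from
the attached patches. The function name, the data structures, and the constant
names are hypothetical. Two further assumptions are made: the average is taken
over all known trackers, and "50% more times than the average" means strictly
greater than 1.5 times the average. The 24-hour retry and the
restart behaviour are not modelled.

```python
# Hypothetical sketch of the three release-note conditions.
# Only the thresholds (4 jobs, 50% above average, 50% of the cluster)
# come from the release note; every name here is illustrative.

MAX_TASKTRACKER_BLACKLISTS = 4   # default of mapred.max.tasktracker.blacklists
AVERAGE_EXCESS_FACTOR = 1.5      # "50% more times than the average"
MAX_BLACKLISTED_FRACTION = 0.5   # "less than 50% of the trackers"


def should_blacklist_for_all_jobs(tracker, blacklist_counts, already_blacklisted):
    """Return True if `tracker` meets all three conditions.

    blacklist_counts: dict mapping every tracker name in the cluster to the
        number of jobs that have blacklisted it (assumed data structure).
    already_blacklisted: set of trackers currently blacklisted for all jobs.
    """
    total_trackers = len(blacklist_counts)
    if total_trackers == 0:
        return False

    count = blacklist_counts.get(tracker, 0)
    # Assumption: the average is computed over all trackers in the cluster.
    average = sum(blacklist_counts.values()) / total_trackers

    # 1) Blacklisted by at least the configured number of jobs.
    enough_jobs = count >= MAX_TASKTRACKER_BLACKLISTS
    # 2) Blacklisted more than 50% above the cluster average (strict, assumed).
    above_average = count > average * AVERAGE_EXCESS_FACTOR
    # 3) Less than half of the cluster is already blacklisted.
    cluster_ok = len(already_blacklisted) / total_trackers < MAX_BLACKLISTED_FRACTION

    return enough_jobs and above_average and cluster_ok
```

For example, with counts {"tt1": 6, "tt2": 1, "tt3": 1, "tt4": 0} the average
is 2, so tt1 (6 blacklistings) meets all three conditions while tt2 does not.
If two of the four trackers were already blacklisted, condition 3 would fail
and tt1 would keep receiving tasks.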
> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>
> Key: HADOOP-4305
> URL: https://issues.apache.org/jira/browse/HADOOP-4305
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Christian Kunz
> Assignee: Amareshwari Sriramadasu
> Fix For: 0.20.0
>
> Attachments: patch-4305-0.18.txt, patch-4305-1.txt, patch-4305-2.txt,
> patch-4305-3.txt, patch-4305-4.txt
>
>
> When running a batch of jobs, it often happens that the same tasktrackers are
> blacklisted again and again. This can slow job execution considerably,
> particularly when tasks fail because of timeouts.
> It would make sense to stop assigning tasks to such tasktrackers and to
> declare them dead.