[ 
https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642807#action_12642807
 ] 

Runping Qi commented on HADOOP-4305:
------------------------------------


I think the TaskTracker is in a better position to decide whether it can accept 
more tasks.
If it cannot, it can decide whether and when to shut itself down. If it can, it 
can decide what kinds of tasks to accept.

Tasks can fail on a node for many reasons.
The most common one is a resource limit.
A task tracker may be fine running one task but fail when running more tasks 
simultaneously.
A task tracker may be fine running 4 concurrent tasks of one job but fail when 
running 3 concurrent tasks of other jobs.
The task tracker should accumulate some stats and base its decisions on those 
stats.

A simple heuristic could work like this: when a task fails, the task tracker 
records the number of tasks that were running concurrently on the tracker at 
that time.
If multiple tasks fail at the same concurrency level, the TT should stop asking 
for new tasks until the concurrency level drops below it.
After running for a while without a task failure, it can raise the concurrency 
threshold again.
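
To make the heuristic concrete, here is a rough sketch in Python. It is not 
Hadoop code: the class name, method names and the two tunables (failure_limit, 
recovery_successes) are all made up for illustration, and "running for a while 
without task failure" is approximated by counting consecutive successful tasks.

```python
from collections import Counter


class ConcurrencyThrottle:
    """Hypothetical helper, not part of the real TaskTracker."""

    def __init__(self, max_slots, failure_limit=2, recovery_successes=5):
        self.max_slots = max_slots                    # configured slot count
        self.limit = max_slots                        # current concurrency threshold
        self.failure_limit = failure_limit            # assumed: failures at one level before backing off
        self.recovery_successes = recovery_successes  # assumed stand-in for "a while"
        self.failures_at = Counter()                  # concurrency level -> failure count
        self.successes_since_failure = 0

    def can_accept(self, running):
        """True if the tracker should ask for another task."""
        return running < self.limit

    def task_failed(self, running):
        """Record a failure; `running` counts the failed task itself."""
        self.successes_since_failure = 0
        self.failures_at[running] += 1
        repeated = self.failures_at[running] >= self.failure_limit
        if repeated and running <= self.limit:
            # Back off to just below the level that keeps failing.
            # The floor of 1 keeps the tracker alive; shutting down is a separate decision.
            self.limit = max(1, running - 1)

    def task_succeeded(self):
        """Record a success and raise the threshold after a clean streak."""
        self.successes_since_failure += 1
        clean_streak = self.successes_since_failure >= self.recovery_successes
        if clean_streak and self.limit < self.max_slots:
            self.limit += 1
            self.successes_since_failure = 0
            # Give the newly allowed level a fresh start.
            self.failures_at.pop(self.limit, None)
```

With max_slots=4 and failure_limit=2, two failures while 4 tasks are running 
would drop the threshold to 3, so the tracker stops asking for work until fewer 
than 3 tasks are running. A real version would probably also want to track the 
stats per job, since the failing level can differ between jobs.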



> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>
>
> When running a batch of jobs it often happens that the same tasktrackers are 
> blacklisted again and again. This can slow job execution considerably, in 
> particular, when tasks fail because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to 
> declare them dead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
