[ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644213#action_12644213 ]

Runping Qi commented on HADOOP-4305:
------------------------------------


I still think it is not enough to simply count the number of failed tasks.
We need to know under what conditions each task failed.
Consider two cases. In the first, a task of a job failed while it was the only
task running on a TT.
In the second, a task of a job failed while another 10 tasks were running
concurrently.
The first case provides strong evidence that this TT may not be appropriate
for the tasks of that job.
The evidence in the second case is much weaker: the task might have succeeded
if there had been fewer concurrent tasks.
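
To illustrate, a minimal sketch of condition-aware fault counting (a hypothetical
helper, not part of the Hadoop code base; the class name, method names, and the
threshold value are all assumptions): a failure observed while the task ran alone
contributes a full fault, while a failure observed under N concurrent tasks
contributes only 1/N of a fault.

public class WeightedTrackerFaults {

  // Running total of weighted failure evidence for one tasktracker.
  private double weightedFaults = 0.0;

  // Threshold at which the tracker would be considered for blacklisting;
  // the value here is illustrative, not a Hadoop default.
  private static final double FAULT_THRESHOLD = 3.0;

  /**
   * Record one task failure, weighted by how many tasks were running on
   * the tracker at the time. A failure with concurrentTasks == 1 adds a
   * full fault; with 10 concurrent tasks it adds only 0.1 of a fault.
   */
  public void recordFailure(int concurrentTasks) {
    if (concurrentTasks < 1) {
      concurrentTasks = 1;
    }
    weightedFaults += 1.0 / concurrentTasks;
  }

  /** True once the accumulated weighted evidence crosses the threshold. */
  public boolean shouldBlacklist() {
    return weightedFaults >= FAULT_THRESHOLD;
  }

  public static void main(String[] args) {
    WeightedTrackerFaults tracker = new WeightedTrackerFaults();
    tracker.recordFailure(1);   // failed while running alone: strong evidence
    tracker.recordFailure(10);  // failed alongside 10 others: weak evidence
    System.out.println("blacklist? " + tracker.shouldBlacklist()); // false so far
  }
}

Under such a scheme, a tracker that repeatedly fails lightly loaded tasks would
cross the threshold much sooner than one whose failures only occur under heavy
concurrency.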


> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>
>
> When running a batch of jobs it often happens that the same tasktrackers are 
> blacklisted again and again. This can slow job execution considerably, in 
> particular, when tasks fail because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to 
> declare them dead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
