[ 
https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652368#action_12652368
 ] 

Devaraj Das commented on HADOOP-4305:
-------------------------------------

Some comments:
1. Format the brackets of the if condition properly in the incrementFaults 
method.
2. You should be able to use the same data structure for both the 
potentiallyFaulty and the blacklisted trackers.
3. Add a comment for mapred.cluster.average.blacklist.threshold saying that it 
is there solely for tuning purposes, and that the config might be taken out 
once this feature has been tested in real clusters and an appropriate value for 
the threshold has been found.
4. Check whether you can remove the initialContact flag and use only the 
restarted flag in the heartbeat method. This is a more serious change, but it 
might be worthwhile for simplifying the state machine.
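
To illustrate comment 2, here is a rough sketch of what a single structure 
could look like. It is not taken from the patch: the TrackerFaults and 
FaultInfo names, the fixed integer threshold, and the method signatures are all 
assumptions made for illustration, and the real code would have to fit into the 
JobTracker's existing bookkeeping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; names and threshold handling are assumptions,
// not the code from the HADOOP-4305 patch.
class TrackerFaults {
    // One record per tracker host covers both states, so separate
    // "potentially faulty" and "blacklisted" collections are not needed.
    private static final class FaultInfo {
        int numFaults;
        boolean blacklisted;
    }

    private final Map<String, FaultInfo> faults = new HashMap<>();
    private final int threshold; // assumed: a fixed fault count

    TrackerFaults(int threshold) {
        this.threshold = threshold;
    }

    // Record one more fault and blacklist the host once the threshold is hit.
    void incrementFaults(String hostName) {
        FaultInfo info = faults.computeIfAbsent(hostName, h -> new FaultInfo());
        info.numFaults++;
        if (info.numFaults >= threshold) {
            info.blacklisted = true;
        }
    }

    // A host with recorded faults that has not yet been blacklisted.
    boolean isPotentiallyFaulty(String hostName) {
        FaultInfo info = faults.get(hostName);
        return info != null && !info.blacklisted;
    }

    boolean isBlacklisted(String hostName) {
        FaultInfo info = faults.get(hostName);
        return info != null && info.blacklisted;
    }
}
```

With one map, moving a tracker from potentially faulty to blacklisted is just a 
flag change on its existing entry, so the two views cannot drift apart.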

> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>
>         Attachments: patch-4305-0.18.txt, patch-4305-1.txt, patch-4305-2.txt
>
>
> When running a batch of jobs it often happens that the same tasktrackers are 
> blacklisted again and again. This can slow job execution considerably, in 
> particular, when tasks fail because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to 
> declare them dead.
