the job tracker re-runs failed tasks on the same node
-----------------------------------------------------

                 Key: HADOOP-400
                 URL: http://issues.apache.org/jira/browse/HADOOP-400
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.4.0
            Reporter: Owen O'Malley
         Assigned To: Owen O'Malley


The job tracker tries not to re-run a task on a node where it has previously 
failed, but it doesn't strictly prevent it.

I propose to change the rule so that when pollForNewTask is called by a 
TaskTracker, the JobTracker will assign it a task that has already failed on 
that TaskTracker only if the task has already failed on every TaskTracker in 
the cluster. Thus, for "normal" clusters with more than 4 TaskTrackers, a task 
that keeps failing is guaranteed to be tried on 4 different TaskTrackers. For 
small clusters, it will be tried on every TaskTracker in the cluster at least 
once.
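
To make the proposed rule concrete, here is a rough sketch of the check in 
isolation. It is not the actual JobTracker code: the class name, the mayAssign 
method, and the use of plain string sets for tracker names are all assumptions 
made for illustration only.

```java
import java.util.Set;

// Hypothetical sketch of the proposed assignment rule; none of these names
// come from the Hadoop source.
final class RetryPlacementSketch {

    /**
     * Decides whether a task may be handed to the polling TaskTracker.
     *
     * @param failedOn        trackers on which this task has already failed
     * @param clusterTrackers all trackers currently in the cluster
     * @param tracker         the tracker that is polling for a new task
     */
    static boolean mayAssign(Set<String> failedOn,
                             Set<String> clusterTrackers,
                             String tracker) {
        // The task has never failed on this tracker, so it is always eligible.
        if (!failedOn.contains(tracker)) {
            return true;
        }
        // The task has failed here before: allow a repeat only once it has
        // failed on every tracker in the cluster.
        return failedOn.containsAll(clusterTrackers);
    }

    public static void main(String[] args) {
        Set<String> cluster = Set.of("tt1", "tt2", "tt3");
        Set<String> failedOn = Set.of("tt1");

        System.out.println(mayAssign(failedOn, cluster, "tt1")); // false
        System.out.println(mayAssign(failedOn, cluster, "tt2")); // true
        System.out.println(mayAssign(cluster, cluster, "tt1"));  // true
    }
}
```

In this sketch a tracker is refused a task it has already failed until every 
other tracker has also failed it, which is what yields the "different 
TaskTrackers" behavior described above.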

Does that sound reasonable to everyone?

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
