the job tracker re-runs failed tasks on the same node
-----------------------------------------------------
Key: HADOOP-400
URL: http://issues.apache.org/jira/browse/HADOOP-400
Project: Hadoop
Issue Type: Bug
Components: mapred
Affects Versions: 0.4.0
Reporter: Owen O'Malley
Assigned To: Owen O'Malley
The job tracker tries to avoid re-running a task on a node where it has
previously failed, but it does not strictly prevent it.
I propose to change the rule so that when a TaskTracker calls pollForNewTask,
the JobTracker will assign it a task that has already failed on that
TaskTracker only if the task has already failed on every TaskTracker in the
cluster. Thus, for "normal" clusters with more than 4 TaskTrackers, a task
that keeps failing is guaranteed to run on 4 different TaskTrackers. For small
clusters, it will run on every TaskTracker in the cluster at least once.
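A minimal sketch of the proposed rule, to make the condition concrete. This is
not the actual JobTracker code: the class RetryPlacementSketch, the method
mayAssign, and the use of plain string sets for tracker names are all
hypothetical, and the real check would live wherever pollForNewTask picks a
task for the calling TaskTracker.

```java
import java.util.Set;

// Hypothetical illustration of the proposed rule; names are not from Hadoop.
final class RetryPlacementSketch {
    private RetryPlacementSketch() {}

    /**
     * Decides whether a task may be handed to the polling tracker.
     *
     * @param tracker         name of the TaskTracker asking for work
     * @param failedOn        trackers on which this task has already failed
     * @param clusterTrackers every tracker currently in the cluster
     */
    static boolean mayAssign(String tracker,
                             Set<String> failedOn,
                             Set<String> clusterTrackers) {
        // The task has never failed on this tracker: always allowed.
        if (!failedOn.contains(tracker)) {
            return true;
        }
        // The task has failed here before: allow it again only once it
        // has failed on every tracker in the cluster.
        return failedOn.containsAll(clusterTrackers);
    }
}
```

For example, in a cluster of tt1, tt2 and tt3, a task that has failed only on
tt1 would be withheld from tt1 but could still go to tt2 or tt3; once it has
failed on all three, any of them may receive it again.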
Does that sound reasonable to everyone?