getMapOutput() keeps failing too many times before the tasktracker fails
------------------------------------------------------------------------
Key: HADOOP-3321
URL: https://issues.apache.org/jira/browse/HADOOP-3321
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Affects Versions: 0.16.1
Reporter: Yiping Han
Priority: Critical
We are running a large job on our cluster with about 400 reducers. Around
361 reducers finished successfully, while the last batch of 39 reducers all
failed at roughly the same time. Examining the log files, we found the
following warning 858 times on a single tasktracker:
2008-04-21 02:42:45,368 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
2008-04-21 02:42:49,468 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
2008-04-21 02:43:03,717 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
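
For anyone wanting to reproduce a count like this from their own tasktracker
log, here is a minimal sketch. It assumes the warning keeps the
"getMapOutput(<task id>,<number>) failed" shape shown above, and it treats the
second value as the reduce partition, which is an assumption on my part. The
function name count_failures is made up for illustration.

```python
import re
from collections import Counter

# Matches the wrapped continuation line from the report, e.g.
# "getMapOutput(task_200804101742_0001_m_032077_2,396) failed :"
PATTERN = re.compile(r"getMapOutput\((task_\w+),(\d+)\) failed")


def count_failures(log_text):
    """Count failures per (map task id, second argument) pair.

    The second argument is assumed to be the reduce partition.
    """
    counts = Counter()
    for match in PATTERN.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# The three warnings quoted in this report, used as sample input.
sample = """\
2008-04-21 02:42:45,368 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
2008-04-21 02:42:49,468 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
2008-04-21 02:43:03,717 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(task_200804101742_0001_m_032077_2,396) failed :
"""

for key, n in count_failures(sample).most_common():
    print(key, n)
```

Grouping by the pair rather than by task id alone shows whether one map
output is being requested over and over, as appears to be the case here.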
Shouldn't the tasktracker fail early instead of retrying so many times?
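
To make the question concrete, here is a rough sketch of what "failing early"
could look like: count failures per requested map output and signal once a
limit is reached. This is not Hadoop code. The class name
MapOutputFailureTracker, its methods, and the max_failures default are all
hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class MapOutputFailureTracker:
    """Hypothetical helper, not part of Hadoop.

    Tracks failed getMapOutput attempts per (map task, reduce partition)
    so the caller can stop retrying once a limit is reached.
    """

    def __init__(self, max_failures=10):
        # max_failures is an assumed, illustrative threshold.
        self.max_failures = max_failures
        self._failures = defaultdict(int)

    def record_failure(self, map_task_id, reduce_partition):
        """Record one failure; return True once the limit is reached."""
        key = (map_task_id, reduce_partition)
        self._failures[key] += 1
        return self._failures[key] >= self.max_failures

    def record_success(self, map_task_id, reduce_partition):
        """Forget earlier failures after a successful fetch."""
        self._failures.pop((map_task_id, reduce_partition), None)


tracker = MapOutputFailureTracker(max_failures=3)
for attempt in range(1, 4):
    give_up = tracker.record_failure("task_200804101742_0001_m_032077_2", 396)
    print(attempt, give_up)
```

With a limit like this, the 858 repeated warnings would presumably be cut off
after a handful of attempts, and the map output could be reported as lost
instead.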