[ https://issues.apache.org/jira/browse/HADOOP-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated HADOOP-1158:
----------------------------------

    Status: Open  (was: Patch Available)

> JobTracker should collect statistics of failed map output fetches, and take
> decisions to reexecute map tasks and/or restart the (possibly faulty) Jetty
> server on the TaskTracker
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-1158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1158
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.12.2
>            Reporter: Devaraj Das
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>         Attachments: HADOOP-1158_20070702_1.patch,
>                      HADOOP-1158_2_20070808.patch,
>                      HADOOP-1158_3_20070809.patch,
>                      HADOOP-1158_4_20070817.patch
>
>
> The JobTracker should keep track (with feedback from Reducers) of how many
> times a fetch for a particular map output has failed. If this count exceeds
> a certain threshold, that map should be declared lost and reexecuted
> elsewhere. Based on the number of such complaints from Reducers, the
> JobTracker can also blacklist the TaskTracker. This will make the framework
> more reliable: it will take care of (faulty) TaskTrackers that
> intermittently or persistently fail to serve map outputs, including cases
> where exceptions are not properly raised or handled (for example, when the
> problem occurs inside the Jetty server).
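
The bookkeeping described in the issue could look roughly like the sketch below. It is not taken from any of the attached patches: the class name `FetchFailureTracker`, its methods, and both threshold parameters are hypothetical, and the rule that a count must strictly exceed its threshold is an assumption based on the wording of the description.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch of per-map and per-TaskTracker fetch-failure
 * accounting. Not the HADOOP-1158 patch; all names are illustrative.
 */
class FetchFailureTracker {
    private final int mapThreshold;      // assumed: max tolerated failures per map output
    private final int trackerThreshold;  // assumed: max tolerated complaints per TaskTracker

    private final Map<String, Integer> failuresPerMap = new HashMap<>();
    private final Map<String, Integer> complaintsPerTracker = new HashMap<>();
    private final Set<String> lostMaps = new HashSet<>();
    private final Set<String> blacklistedTrackers = new HashSet<>();

    FetchFailureTracker(int mapThreshold, int trackerThreshold) {
        this.mapThreshold = mapThreshold;
        this.trackerThreshold = trackerThreshold;
    }

    /**
     * Records one failed fetch reported by a reducer.
     *
     * @return true only when this report causes the map to be declared
     *         lost, i.e. the caller should schedule it for reexecution
     */
    boolean reportFetchFailure(String mapId, String trackerName) {
        // Every complaint counts against the TaskTracker that served the output.
        int trackerCount = complaintsPerTracker.merge(trackerName, 1, Integer::sum);
        if (trackerCount > trackerThreshold) {
            blacklistedTrackers.add(trackerName);
        }

        // The map is declared lost once its own failure count exceeds the threshold.
        int mapCount = failuresPerMap.merge(mapId, 1, Integer::sum);
        if (mapCount > mapThreshold) {
            return lostMaps.add(mapId); // false if it was already declared lost
        }
        return false;
    }

    boolean isMapLost(String mapId) {
        return lostMaps.contains(mapId);
    }

    boolean isBlacklisted(String trackerName) {
        return blacklistedTrackers.contains(trackerName);
    }
}
```

With `new FetchFailureTracker(2, 4)`, the third failure reported for the same map marks it lost, and the fifth complaint against the same TaskTracker blacklists it. A real implementation would also need to reset or age the counters and handle concurrent reports, which this sketch leaves out.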