[ https://issues.apache.org/jira/browse/HADOOP-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492656 ]
Arun C Murthy commented on HADOOP-1144:
---------------------------------------

> It would be nice to have a similar knob for InputFormat, so that it tolerates
> data that causes RecordReader to throw an exception without failing the TIP -
> i.e. to treat such errors as a regular end of input data.

I'd rather have the user implement a simple sub-class of the RecordReader in
question to ignore the exception and return 'false' from next(key, value) -
that should be very easy, no?

> Hadoop should allow a configurable percentage of failed map tasks before
> declaring a job failed.
> ------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1144
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1144
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.12.0
>            Reporter: Christian Kunz
>         Assigned To: Arun C Murthy
>             Fix For: 0.13.0
>
>
> In our environment it can occur that some map tasks will fail repeatedly
> because of corrupt input data, which sometimes is non-critical as long as the
> amount is limited. In this case it is annoying that the whole Hadoop job
> fails and cannot be restarted till the corrupt data are identified and
> eliminated from the input. It would be extremely helpful if the job
> configuration would allow to indicate how many map tasks are allowed to fail.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
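The subclass-and-return-false approach suggested in the comment could be sketched roughly as below. This is only an illustration: `SimpleRecordReader` is a simplified, hypothetical stand-in modeling just the `next(key, value)` method of the old `org.apache.hadoop.mapred.RecordReader` contract, and the class names `FaultTolerantReader` and `Demo` are invented for the example; a real implementation would wrap or extend the concrete RecordReader in use.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical, simplified stand-in for the next(key, value) part of the
// old org.apache.hadoop.mapred.RecordReader contract: returns false at
// end of input.
interface SimpleRecordReader {
    boolean next(StringBuilder key, StringBuilder value) throws IOException;
}

// Sketch of the suggested wrapper: any exception from the underlying
// reader is swallowed and reported as a regular end of input, so the
// task finishes with the records read so far instead of failing.
class FaultTolerantReader implements SimpleRecordReader {
    private final SimpleRecordReader inner;

    FaultTolerantReader(SimpleRecordReader inner) {
        this.inner = inner;
    }

    @Override
    public boolean next(StringBuilder key, StringBuilder value) {
        try {
            return inner.next(key, value);
        } catch (Exception e) {
            // Corrupt record: treat it as "no more input".
            return false;
        }
    }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        // Hypothetical flaky reader that throws on the record "corrupt".
        Iterator<String> records =
                Arrays.asList("a", "b", "corrupt", "c").iterator();
        SimpleRecordReader flaky = (key, value) -> {
            if (!records.hasNext()) return false;
            String r = records.next();
            if (r.equals("corrupt")) throw new IOException("bad record");
            key.setLength(0);
            key.append(r);
            return true;
        };

        SimpleRecordReader reader = new FaultTolerantReader(flaky);
        StringBuilder key = new StringBuilder();
        StringBuilder value = new StringBuilder();
        int count = 0;
        while (reader.next(key, value)) {
            count++;
        }
        // Reading stops cleanly at the corrupt record.
        System.out.println("records read: " + count);  // prints "records read: 2"
    }
}
```

Note the trade-off this sketch inherits from the suggestion itself: everything after the first bad record is silently skipped, which is exactly why the issue asks for a job-level knob (a tolerated percentage of failed map tasks) instead.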