Thank you, that's very useful. I have also changed the way the tasks work: they now store their data in HBase, since it is better suited to handling small files. I'm not 100% sure yet whether the problems have been resolved (I'm still testing extensively), but I think I may have gotten rid of them. I'll add the 'skipping records' option in case I do get a failure.
Mathias

On Mon, Aug 10, 2009 at 5:46 PM, Koji Noguchi <[email protected]> wrote:

> > but I didn't find a config option
> > that allows ignoring tasks that fail.
>
> If 0.18,
> http://hadoop.apache.org/common/docs/r0.18.3/api/org/apache/hadoop/mapred/JobConf.html#setMaxMapTaskFailuresPercent(int)
> (mapred.max.map.failures.percent)
>
> http://hadoop.apache.org/common/docs/r0.18.3/api/org/apache/hadoop/mapred/JobConf.html#setMaxReduceTaskFailuresPercent(int)
> (mapred.max.reduce.failures.percent)
>
> If 0.19 or later, you can also try skipping records.
>
> Koji
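
For reference, here is a rough sketch of how the two settings Koji mentions are meant to behave. It does not use Hadoop itself: the class `FailureTolerance` and its helper methods are hypothetical names made up for illustration, `java.util.Properties` stands in for the job configuration, and the threshold rule (the job survives as long as failed * 100 <= percent * total, with a default of 0) is an assumption about the framework's behaviour rather than something stated in this thread. Only the two property names and the two `JobConf` setters come from the quoted message.

```java
import java.util.Properties;

// Hypothetical, Hadoop-free illustration of the failure-percentage settings.
class FailureTolerance {
    // Property names as given in the quoted message.
    static final String MAP_KEY = "mapred.max.map.failures.percent";
    static final String REDUCE_KEY = "mapred.max.reduce.failures.percent";

    // Reads a percentage from the configuration; a default of 0
    // (no failed tasks tolerated) is assumed here.
    static int percentFor(Properties conf, String key) {
        return Integer.parseInt(conf.getProperty(key, "0"));
    }

    // Assumed rule: the job keeps going while the share of failed tasks
    // does not exceed the configured percentage.
    static boolean jobSurvives(int totalTasks, int failedTasks, int maxFailuresPercent) {
        return (long) failedTasks * 100 <= (long) maxFailuresPercent * totalTasks;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // With a real job this would be done on the JobConf instead:
        //   jobConf.setMaxMapTaskFailuresPercent(5);
        //   jobConf.setMaxReduceTaskFailuresPercent(0);
        conf.setProperty(MAP_KEY, "5");

        int mapPercent = percentFor(conf, MAP_KEY);
        int reducePercent = percentFor(conf, REDUCE_KEY);

        System.out.println("10 of 200 maps failed, job survives: "
                + jobSurvives(200, 10, mapPercent));
        System.out.println("11 of 200 maps failed, job survives: "
                + jobSurvives(200, 11, mapPercent));
        System.out.println("1 of 20 reduces failed, job survives: "
                + jobSurvives(20, 1, reducePercent));
    }
}
```

On a 0.18 cluster the real equivalent would be the two `JobConf` calls shown in the comment; the percentages above are arbitrary example values.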
