Hi, I have a map task that works most of the time but fails on some data. I keep getting these exceptions:
Task attempt_200811031947_0003_m_000095_0 failed to report status for 600 seconds. Killing!

I noticed that the tasks that fail have a lot of these at the end of the syslogs:

2008-11-03 21:05:52,745 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:52,746 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,016 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,147 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,329 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:53,525 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 7866139 bytes
2008-11-03 21:05:53,848 INFO org.apache.hadoop.mapred.MapTask: Index: (2465254733, 7866121, 7866121)
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 41 sorted segments
2008-11-03 21:05:53,900 INFO org.apache.hadoop.mapred.Merger: Merging 5 intermediate segments out of a total of 41
2008-11-03 21:05:53,963 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 37
2008-11-03 21:05:53,976 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 28
2008-11-03 21:05:53,996 INFO org.apache.hadoop.mapred.Merger: Merging 10 intermediate segments out of a total of 19
2008-11-03 21:05:54,013 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 4290611 bytes
...

Sure, the ones that succeed have them too, but the number of segments is always significantly lower:

2008-11-03 20:42:38,214 INFO org.apache.hadoop.mapred.MapTask: Index: (125745724, 351203, 351203)
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,221 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 345895 bytes
2008-11-03 20:42:38,226 INFO org.apache.hadoop.mapred.MapTask: Index: (126096927, 345893, 345893)
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,232 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 364718 bytes
2008-11-03 20:42:38,237 INFO org.apache.hadoop.mapred.MapTask: Index: (126442820, 364716, 364716)
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Merging 2 sorted segments
2008-11-03 20:42:38,241 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 440435 bytes
2008-11-03 20:42:38,247 INFO org.apache.hadoop.mapred.MapTask: Index: (126807536, 440433, 440433)

I don't get any exceptions besides the timeouts, which happen because the tasks don't report their status.

So, my questions are:

- What exactly is the Merger? Why is it only merging at the end of the tasks? Why does it seem to merge the same data several times?
- Can it really be causing the problem, or should I look somewhere else (there's no exception, after all)?

The problem is most probably in my code, but since I don't see any exception it's hard to tell what's happening.

Thanks in advance,
Sebastien
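
P.S. To make the pattern in the failing logs easier to read: the pass sizes (5, then 10, 10, 10, then a last pass over 10 segments) look like a multi-pass merge that combines at most 10 segments at a time. Below is a rough sketch of the arithmetic I think I am seeing. It is not taken from the Hadoop source. The class and method names are made up, the fan-in of 10 is an assumption (I believe it matches the default io.sort.factor), and the rule for the smaller first pass is a guess chosen only because it reproduces the numbers in my logs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reproduces the pass sizes seen in the syslog excerpt.
public class MergePassSketch {

    // Guessed rule: shrink the first pass so that all later passes can
    // merge exactly `factor` segments. Assumes factor >= 2.
    static int passFactor(int factor, int passNo, int numSegments) {
        if (passNo > 1 || numSegments <= factor) {
            return factor;
        }
        int mod = (numSegments - 1) % (factor - 1);
        return mod == 0 ? factor : mod + 1;
    }

    // Returns how many segments each pass merges, ending with the final pass.
    static List<Integer> passSizes(int numSegments, int factor) {
        if (factor < 2) {
            throw new IllegalArgumentException("factor must be at least 2");
        }
        List<Integer> sizes = new ArrayList<Integer>();
        int remaining = numSegments;
        int passNo = 1;
        while (remaining > factor) {
            int n = passFactor(factor, passNo, remaining);
            sizes.add(n);
            // n segments are replaced by one merged intermediate segment.
            remaining = remaining - n + 1;
            passNo++;
        }
        sizes.add(remaining); // "Down to the last merge-pass"
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(passSizes(41, 10)); // failing task: [5, 10, 10, 10, 10]
        System.out.println(passSizes(2, 10));  // succeeding task: [2]
    }
}
```

With 41 segments the totals go 41, 37, 28, 19 and then 10, which matches the "out of a total of" values in the failing task. With 2 segments there is only the last pass, as in the tasks that succeed. This only describes the shape of the log output; it does not tell me whether the merge itself is what takes too long.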
