I have a situation which may be related. I am running Hadoop 0.18.1 on a cluster with 5 machines, testing on a very small input of 10 lines. The mapper produces either one or zero outputs per line of input, yet somehow I get 18 lines of output from the reducer. For example, I have one input where the key is: fd349fc441ff5e726577aeb94cceb1e4
However, I added a print statement to the reducer to print keys right before calling output.collect, and I see 3 instances of this key being printed. I have turned speculative execution off and still get this. Does this sound related? A known bug? Something I'm missing? Fixed in 0.19.1?

- Malcolm

-----Original Message-----
From: Koji Noguchi [mailto:[email protected]]
Sent: Monday, March 02, 2009 15:59
To: [email protected]
Subject: RE: Potential race condition (Hadoop 18.3)

Ryan,

If you're using getOutputPath, try replacing it with getWorkOutputPath.

http://hadoop.apache.org/core/docs/r0.18.3/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)

Koji

-----Original Message-----
From: Ryan Shih [mailto:[email protected]]
Sent: Monday, March 02, 2009 11:01 AM
To: [email protected]
Subject: Potential race condition (Hadoop 18.3)

Hi - I'm not sure yet, but I think I might be hitting a race condition in Hadoop 18.3. What seems to happen is that in the reduce phase, some of my tasks perform speculative execution, and when the initial task completes successfully, it sends a kill to the newly started duplicate. After all is said and done, perhaps one in every five or ten jobs that kill their second task ends up with zero or truncated output. When I turn off speculative execution, the problem goes away. Are there known race conditions that I should be aware of in this area?

Thanks in advance,
Ryan
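Koji's suggestion targets exactly the failure mode Ryan describes: in the old mapred API, FileOutputFormat.getOutputPath returns the job's final output directory, which every task attempt shares, while getWorkOutputPath returns a per-attempt temporary directory whose contents are promoted only when that attempt commits. The sketch below is not Hadoop code; it is a minimal, self-contained simulation (plain java.nio.file, hypothetical file and directory names) of why a side file written straight to the shared output path can be left truncated when a speculative duplicate is killed mid-write, and why per-attempt work directories avoid this.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Simulation of two speculative reduce attempts writing the same side file.
// All names (part-00000, _temporary/attempt_N) are illustrative only.
public class SpeculativeWriteDemo {

    // Pattern 1: both attempts write directly to the shared job output
    // directory (analogous to getOutputPath). The attempt that is killed
    // mid-write clobbers the completed file, leaving truncated output.
    static String sharedOutputPath(Path outDir) throws IOException {
        Path part = outDir.resolve("part-00000");
        Files.writeString(part, "complete output\n"); // attempt_0 finishes
        Files.writeString(part, "trunc");             // attempt_1 killed mid-write
        return Files.readString(part);
    }

    // Pattern 2: each attempt writes under its own work directory
    // (analogous to getWorkOutputPath); only the committed attempt's
    // file is moved into the final output directory.
    static String perAttemptWorkPath(Path outDir) throws IOException {
        Path a0 = Files.createDirectories(outDir.resolve("_temporary/attempt_0"));
        Path a1 = Files.createDirectories(outDir.resolve("_temporary/attempt_1"));
        Files.writeString(a0.resolve("part-00000"), "complete output\n"); // finishes
        Files.writeString(a1.resolve("part-00000"), "trunc"); // killed; abandoned
        // The framework commits only the successful attempt's work dir.
        Files.move(a0.resolve("part-00000"), outDir.resolve("part-00000"));
        return Files.readString(outDir.resolve("part-00000"));
    }

    public static void main(String[] args) throws IOException {
        Path shared = Files.createTempDirectory("shared");
        Path scoped = Files.createTempDirectory("scoped");
        System.out.println("shared path result:      " + sharedOutputPath(shared));
        System.out.print("per-attempt path result: " + perAttemptWorkPath(scoped));
    }
}
```

In real 0.18.x task code, the change would be swapping FileOutputFormat.getOutputPath(conf) for FileOutputFormat.getWorkOutputPath(conf) wherever the task opens side files; the framework then promotes only the committed attempt's files, so a killed speculative attempt cannot clobber committed output.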
