Koji - That looks like it did the trick - we're smooth sailing now. Thanks a
lot!

On Mon, Mar 2, 2009 at 2:02 PM, Ryan Shih <[email protected]> wrote:

> Koji - That makes a lot of sense. The two tasks are probably stepping on
> each other. I'll give it a try and let you know how it goes.
>
> Malcolm - if you turned off speculative execution and are still getting the
> problem, it doesn't sound the same. Do you want to do a cut&paste of your
> reduce code and I'll see if I can spot anything suspicious?
>
>
> On Mon, Mar 2, 2009 at 1:15 PM, Malcolm Matalka <
> [email protected]> wrote:
>
>> I have a situation which may be related.  I am running Hadoop 0.18.1 on
>> a cluster of 5 machines, testing on a very small input of 10 lines.  The
>> mapper produces either 0 or 1 outputs per line of input, yet somehow I
>> get 18 lines of output from the reducer.  For example, I have one input
>> where the key is:
>> fd349fc441ff5e726577aeb94cceb1e4
>>
>> However, I added a print to the reducer to print keys right before
>> calling output.collect and I have 3 instances of this key being printed.
>>
>> I have turned speculative execution off and still get this.
>>
>> Does this sound related?  A known bug?  Something I'm missing?  Fixed in
>> 0.19.1?
>>
>> - Malcolm
>>
>>
>> -----Original Message-----
>> From: Koji Noguchi [mailto:[email protected]]
>> Sent: Monday, March 02, 2009 15:59
>> To: [email protected]
>> Subject: RE: Potential race condition (Hadoop 18.3)
>>
>> Ryan,
>>
>> If you're using getOutputPath, try replacing it with getWorkOutputPath.
>>
>> http://hadoop.apache.org/core/docs/r0.18.3/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)
>>
>> Koji
>>
>> -----Original Message-----
>> From: Ryan Shih [mailto:[email protected]]
>> Sent: Monday, March 02, 2009 11:01 AM
>> To: [email protected]
>> Subject: Potential race condition (Hadoop 18.3)
>>
>> Hi - I'm not sure yet, but I think I might be hitting a race condition in
>> Hadoop 0.18.3. What seems to happen is that in the reduce phase, some of
>> my tasks run speculatively, but when the initial task completes
>> successfully, it sends a kill to the newly started task. After all is
>> said and done, perhaps one in every five or ten tasks that kill their
>> second attempt ends up with zero or truncated output.  When I turn off
>> speculative execution in code, the problem goes away. Are there known
>> race conditions in this area that I should be aware of?
>>
>> Thanks in advance,
>> Ryan
>>
>
>
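
For readers following the thread, the idea behind Koji's suggestion can be
illustrated without Hadoop. The sketch below is a plain-Java analogy, not
Hadoop code: each attempt writes into its own private directory, and only
the attempt that succeeds has its file moved into the shared output
directory. The class WorkDirSketch, its method names, the attempt ids and
the _temporary/_<attemptId> layout are all assumptions made for
illustration. In a real 0.18 job the private directory would come from
FileOutputFormat.getWorkOutputPath(conf) rather than from this helper.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical, Hadoop-free illustration of "write to a per-attempt work
// directory, promote on success". None of these names are Hadoop APIs.
final class WorkDirSketch {

    // Assumed layout: <outputDir>/_temporary/_<attemptId>
    static Path workDir(Path outputDir, String attemptId) {
        return outputDir.resolve("_temporary").resolve("_" + attemptId);
    }

    // An attempt writes only inside its own work directory, so two
    // attempts of the same task never touch the same file.
    static Path writeAttempt(Path outputDir, String attemptId,
                             String fileName, String contents) {
        try {
            Path dir = Files.createDirectories(workDir(outputDir, attemptId));
            Path file = dir.resolve(fileName);
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The successful attempt's file is moved into the final output directory.
    static void commit(Path outputDir, String attemptId, String fileName) {
        try {
            Files.move(workDir(outputDir, attemptId).resolve(fileName),
                       outputDir.resolve(fileName),
                       StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A killed attempt's partial file is discarded without touching
    // the final output directory.
    static void abort(Path outputDir, String attemptId, String fileName) {
        try {
            Files.deleteIfExists(workDir(outputDir, attemptId).resolve(fileName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates one reduce task with an original and a speculative attempt.
    static String demo() {
        try {
            Path out = Files.createTempDirectory("job-output");
            String part = "part-00000";
            writeAttempt(out, "attempt_r_000000_0", part, "complete output\n");
            // The speculative attempt is killed mid-write: partial data only.
            writeAttempt(out, "attempt_r_000000_1", part, "trunc");
            commit(out, "attempt_r_000000_0", part);
            abort(out, "attempt_r_000000_1", part);
            return new String(Files.readAllBytes(out.resolve(part)),
                              StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

If both attempts had written straight to the final part-00000 path
instead, the killed attempt's partial write could have replaced or
truncated the finished file, which matches the symptom Ryan described.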
