On Thu, 05 Jul 2007 11:10:31 PDT, John Heidemann wrote: 
>
>I'm running hadoop streaming from svn (version 552930, reasonably
>recent).  My map/reduce job maps ~1M records, but then a few reduces
>succeed and many fail, eventually terminating the job unsuccessfully.
>I'm looking for some debugging hints.
>
>
>The failures are all "broken pipe":
>
>java.io.IOException: R/W/S=1/0/0 in:0=1/342 [rec/s] out:0=0/342 [rec/s]
>minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
>HOST=null
>USER=hadoop
>HADOOP_USER=null
>last Hadoop input: |null|
>last tool output: |null|
>Date: Thu Jul 05 10:46:28 PDT 2007
>Broken pipe
>       at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:91)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:323)
>       at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1763)
>
>...
>
>
>The other strange thing is I don't get 100% reduce failures, but maybe
>490/503 fail.
>


I found the problem... yes, "broken pipe" is the error message you get
when your reduce task fails to start (in my case, due to a missing
library).

A bug in my code is clearly my fault.

But I might suggest it would be nice for hadoop streaming to check the
exit code of the map and reduce commands so it can report a more
informative error message.  (I'll see if I can fix this.)
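
Roughly the kind of check I mean, assuming the streaming task keeps a
handle to the child Process; the method name and message below are
illustrative only, not the actual PipeReducer code:

    // Sketch only: after the child command exits, surface its exit code
    // instead of letting the write side die with a bare "Broken pipe".
    // The method name and wording are hypothetical, not Hadoop internals.
    private void checkChildExit(Process child) throws IOException {
      try {
        int exitCode = child.waitFor();  // wait for the streaming command to finish
        if (exitCode != 0) {
          throw new IOException(
              "streaming command failed with exit code " + exitCode
              + " (e.g. the command could not start, such as a missing library)");
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted while waiting for streaming command");
      }
    }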

The other odd thing is that the ~10 reduces that succeed apparently do
so because they are reducers with no input, which hadoop streaming
silently turns into successful tasks with no output.

   -John
