Hi All,

I'm running a Hadoop streaming job over 100 GB of data on a 50-node cluster. The job succeeds on small amounts of data, but when I run it over the full 100 GB I get a "MemoryError" and a "Broken pipe" error, even though each node has plenty of memory. Is there a way to increase the memory available to the Python streaming tasks?
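For reference, this is roughly how I launch the job. The jar path, script names, input/output paths, and memory values below are placeholders, and I'm only guessing that these memory-related properties are the right knobs for streaming subprocesses, so please correct me if they don't apply:

    # Placeholder command; property names are assumed from the mapred-default
    # documentation and may not be the right ones for streaming tasks.
    hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
        -D mapred.child.java.opts=-Xmx1024m \
        -D mapred.child.ulimit=3145728 \
        -D mapred.job.map.memory.mb=2048 \
        -input /user/root/input \
        -output /user/root/output \
        -mapper mapper.py \
        -reducer reducer.py \
        -file mapper.py \
        -file reducer.py

In particular, I'm not sure whether mapred.child.java.opts (which I understand only sets the child JVM heap) has any effect on the Python subprocess, or whether mapred.child.ulimit is what actually limits its memory.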
Below is a sample from the error logs:

    cause: java.io.IOException: subprocess still running
    R/W/S=32771708/10/0 in:34752=32771708/943 [rec/s] out:0=10/943 [rec/s]
    minRecWrittenToEnableSkip_=9223372036854775807
    LOGNAME=null HOST=null USER=root HADOOP_USER=null
    last Hadoop input: |null|
    Broken pipe

Any help appreciated.

Thanks,
Srinivas
