Hello,

I'm currently developing a map/reduce program that emits a fair number of map 
output records per input record (around 50 - 100), and I'm getting OutOfMemory 
errors:

2008-09-06 15:28:08,993 ERROR org.apache.hadoop.mapred.pipes.BinaryProtocol: 
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$BlockingBuffer.reset(MapTask.java:564)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:440)
        at org.apache.hadoop.mapred.pipes.OutputHandler.output(OutputHandler.java:55)
        at org.apache.hadoop.mapred.pipes.BinaryProtocol$UplinkReaderThread.run(BinaryProtocol.java:117)


The error is reproducible and always occurs at the same percentage of progress; 
when I emit fewer map outputs per input record, the problem goes away.
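
For context, this is a Pipes job, and the mapper is roughly of the shape below. 
This is a stripped-down sketch, not my actual code; the class names and the 
fixed emit count of 75 are just placeholders for the real 50 - 100 emits per 
record.

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

#include <string>

// Placeholder mapper: for every input record it emits on the order of
// 50-100 key/value pairs through context.emit().
class FanOutMapper : public HadoopPipes::Mapper {
public:
  FanOutMapper(HadoopPipes::TaskContext& context) {}

  void map(HadoopPipes::MapContext& context) {
    const std::string& record = context.getInputValue();
    for (int i = 0; i < 75; ++i) {  // placeholder for the 50-100 emits
      context.emit(HadoopUtils::toString(i), record);
    }
  }
};

// Trivial pass-through reducer, only here so the sketch is a complete task.
class PassThroughReducer : public HadoopPipes::Reducer {
public:
  PassThroughReducer(HadoopPipes::TaskContext& context) {}

  void reduce(HadoopPipes::ReduceContext& context) {
    while (context.nextValue()) {
      context.emit(context.getInputKey(), context.getInputValue());
    }
  }
};

int main(int argc, char** argv) {
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<FanOutMapper, PassThroughReducer>());
}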

Now, I have tried editing conf/hadoop-env.sh, increasing HADOOP_HEAPSIZE to 
2000 MB and setting HADOOP_TASKTRACKER_OPTS as shown below, but the problem 
persists at exactly the same point.
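
For completeness, these are the relevant lines in conf/hadoop-env.sh after my 
changes:

# conf/hadoop-env.sh
export HADOOP_HEAPSIZE=2000                          # heap size, in MB
export HADOOP_TASKTRACKER_OPTS="-Xms32m -Xmx2048m"   # extra JVM options for the tasktracker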

My use case doesn't seem particularly unusual, so is this a common problem, 
and if so, what are the usual ways to work around it?

Thanks in advance for a response!

Regards,

Leon Mergen
