java.lang.OutOfMemoryError occurred while running the high ram streaming job.
-----------------------------------------------------------------------------
Key: MAPREDUCE-2211
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2211
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: contrib/streaming
Reporter: Vinay Kumar Thota
I generated 3 GB of input data using the random text writer, then submitted a high-RAM streaming job from the command line. One of the reducer task attempts failed with an out-of-memory error.
To reproduce the issue, follow the steps below.
1. Run the following command to generate the input data.
${HADOOP_HOME}/bin/hadoop jar \
${HADOOP_HOME}/hadoop-mapred-examples-0.22.0-SNAPSHOT.jar randomtextwriter \
-D mapreduce.randomtextwriter.totalbytes=3221225472 \
-D mapreduce.randomtextwriter.bytespermap=$((3221225472 / 10)) \
-D mapreduce.randomtextwriter.minwordskey=1 \
-D mapreduce.randomtextwriter.maxwordskey=10 \
-D mapreduce.randomtextwriter.minwordsvalue=0 \
-D mapreduce.randomtextwriter.maxwordsvalue=50 \
-D mapred.output.compress=false \
-D mapreduce.jobtracker.maxmapmemory.mb=1024 \
-D mapreduce.jobtracker.maxreducememory.mb=1024 \
-D mapreduce.cluster.mapmemory.mb=800 \
-D mapreduce.cluster.reducememory.mb=800 \
-D mapreduce.map.memory.mb=2048 \
-D mapreduce.reduce.memory.mb=2048 \
-outFormat org.apache.hadoop.mapreduce.lib.output.TextOutputFormat \
highramjob_unsort_input
2. Run the following command to submit the streaming job.
$HADOOP_HOME/bin/hadoop jar \
${HADOOP_HOME}/contrib/streaming/hadoop-0.22.0-SNAPSHOT-streaming.jar \
-D mapreduce.jobtracker.maxmapmemory.mb=1024 \
-D mapreduce.jobtracker.maxreducememory.mb=1024 \
-D mapreduce.cluster.mapmemory.mb=800 \
-D mapreduce.cluster.reducememory.mb=800 \
-D mapreduce.map.memory.mb=2048 \
-D mapreduce.reduce.memory.mb=2048 \
-D mapreduce.job.name="StreamingWordCount" \
-input highramjob_unsort_input \
-output highramjob_output1 \
-mapper cat \
-reducer wc
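Note that the commands above only raise the slot request (`mapreduce.reduce.memory.mb=2048`); they do not change the reducer JVM heap, which stays at its default `-Xmx`. As a hedged workaround sketch (the property names below are assumptions based on the 0.21+ renamed configuration keys, and the values are illustrative, not tuned), the child heap and the shuffle's in-memory buffer fraction could be adjusted by adding:

```
-D mapreduce.reduce.java.opts=-Xmx1024m \
-D mapreduce.reduce.shuffle.input.buffer.percent=0.50 \
```

If these keys are not honored on the 0.22 trunk build, the older `mapred.child.java.opts` / `mapred.job.shuffle.input.buffer.percent` names may apply instead.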
I am using a 10-node secure cluster running the 0.22 trunk branch.
Error details:
==========
2010-12-07 06:32:39,963 WARN org.apache.hadoop.mapred.Child: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#3
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:217)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:58)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:45)
    at org.apache.hadoop.mapreduce.task.reduce.MapOutput.<init>(MapOutput.java:104)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManager.unconditionalReserve(MergeManager.java:267)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManager.reserve(MergeManager.java:257)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:305)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:251)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
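The failing allocation happens when MergeManager.reserve decides a fetched map output fits in memory and unconditionalReserve backs it with a new BoundedByteArrayOutputStream. A minimal sketch of that sizing/placement decision is below; the threshold names, the 0.70 and 0.25 defaults, and the decision order are assumptions based on 0.22-era code, not the exact implementation:

```java
public class ShuffleMemorySketch {
    // Where a fetched map output ends up.
    enum Placement { MEMORY, DISK, WAIT }

    // In-memory shuffle budget: a fraction of the reducer's max heap
    // (assumed default 0.70, cf. *.shuffle.input.buffer.percent).
    static long memoryLimit(long maxHeapBytes, double inputBufferPercent) {
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    // Largest single map output allowed in memory: a fraction of the
    // budget (assumed default 0.25, cf. *.shuffle.memory.limit.percent).
    static long maxSingleShuffleLimit(long memLimit, double limitPercent) {
        return (long) (memLimit * limitPercent);
    }

    // Sketch of reserve(): oversized segments go to disk, a full budget
    // makes the fetcher wait, otherwise a byte[] of requestedSize is
    // allocated immediately (the allocation seen in the stack trace).
    static Placement reserve(long requestedSize, long usedMemory,
                             long memLimit, long singleLimit) {
        if (requestedSize > singleLimit) return Placement.DISK;
        if (usedMemory > memLimit) return Placement.WAIT;
        return Placement.MEMORY;
    }

    public static void main(String[] args) {
        long heap = 200L * 1024 * 1024; // e.g. a 200 MB child heap
        long limit = memoryLimit(heap, 0.70);
        long single = maxSingleShuffleLimit(limit, 0.25);
        System.out.println("budget=" + limit + " singleLimit=" + single);
        // Several concurrent fetchers can each pass the usedMemory check
        // while usage is just under the budget, so their combined
        // allocations can exceed what the heap can actually hold.
        System.out.println(reserve(single, limit - 1, limit, single));
    }
}
```

If that accounting is what admits more in-flight segments than the heap can hold, raising the reducer heap or lowering the buffer fraction would mask the symptom, but the reserve-side bookkeeping would be the root cause to inspect.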