This may be related to MAPREDUCE-5168
<https://issues.apache.org/jira/browse/MAPREDUCE-5168>. There's a memory
leak of sorts in the shuffle if many map outputs end up being merged
from disk.
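For a rough sense of how much heap the in-memory shuffle is allowed to claim, here is a minimal sketch of the budget arithmetic. It assumes the limits are derived from mapreduce.reduce.shuffle.input.buffer.percent and mapreduce.reduce.shuffle.memory.limit.percent, with defaults of 0.70 and 0.25; please check both against the mapred-default.xml of your release. The class and method names (ShuffleBudget, memoryLimit, maxSingleShuffleLimit) are made up for illustration and are not Hadoop APIs.

```java
public class ShuffleBudget {
    // Assumed defaults; verify against mapred-default.xml for your release.
    static final double INPUT_BUFFER_PERCENT = 0.70; // mapreduce.reduce.shuffle.input.buffer.percent
    static final double MEMORY_LIMIT_PERCENT = 0.25; // mapreduce.reduce.shuffle.memory.limit.percent

    // Total bytes the shuffle may hold in memory (hypothetical helper).
    // The heap is capped at Integer.MAX_VALUE on the assumption that the
    // in-memory buffers are plain byte arrays.
    static long memoryLimit(long maxHeapBytes, double inputBufferPercent) {
        return (long) (Math.min(maxHeapBytes, Integer.MAX_VALUE) * inputBufferPercent);
    }

    // Largest single map output that is still fetched into memory;
    // anything bigger is assumed to go straight to disk (hypothetical helper).
    static long maxSingleShuffleLimit(long memoryLimitBytes, double memoryLimitPercent) {
        return (long) (memoryLimitBytes * memoryLimitPercent);
    }

    public static void main(String[] args) {
        // Run with the same -Xmx as the reduce task to get comparable numbers.
        long heap = Runtime.getRuntime().maxMemory();
        long limit = memoryLimit(heap, INPUT_BUFFER_PERCENT);
        long single = maxSingleShuffleLimit(limit, MEMORY_LIMIT_PERCENT);
        System.out.printf("max heap:                %d bytes%n", heap);
        System.out.printf("in-memory shuffle limit: %d bytes%n", limit);
        System.out.printf("max single map output:   %d bytes%n", single);
    }
}
```

This only estimates the configured budget. It says nothing about the leak mentioned above, and it does not involve the 200 MB sort buffer, which as far as I know is a map-side setting.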
Jason
On 05/04/2013 06:40 PM, Radim Kolar wrote:
After the upgrade I am running out of heap space during shuffle. I am
using compressed mapper outputs and 200 MB sort buffers. Did something
important change? For example, is it now allocating 200 MB * number of
fetchers?
2013-05-04 04:02:10,209 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#4
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
    at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
    at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.<init>(InMemoryMapOutput.java:63)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.unconditionalReserve(MergeManagerImpl.java:297)
    at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.reserve(MergeManagerImpl.java:287)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:360)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:295)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:154)