[ 
https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573169#action_12573169
 ] 

eric baldeschwieler commented on HADOOP-2095:
---------------------------------------------

What is the RAM overhead of an active decompressor?  We can probably compute a 
reasonable merge size from that.  If we are seeing lots of very small files, 
maybe we should consider just working with the uncompressed data for small 
inputs?  That may use less total RAM and it should be easy to determine that on 
the fly.  

You could cap the number of compressed files in RAM and when you add a new file 
you could choose to uncompress the smallest file you have to allow you to stay 
under the limit.  You could also merge and recompress your small files, this 
obviously uses a lot more CPU.

> Reducer failed due to Out ofMemory
> ----------------------------------
>
>                 Key: HADOOP-2095
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2095
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Runping Qi
>            Assignee: Arun C Murthy
>         Attachments: HADOOP-2095_CompressedBytesWithCodecPool.patch, 
> HADOOP-2095_debug.patch
>
>
> One of the reducers of my job failed with the following exceptions.
> The failure caused the whole job fail eventually.
> Java heapsize was 768MB and sort.io.mb was 140.
> 2007-10-23 19:24:06,100 WARN org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Intermediate Merge of the inmemory files 
> threw an exception: java.lang.OutOfMemoryError: Java heap space
>       at 
> org.apache.hadoop.io.compress.DecompressorStream.(DecompressorStream.java:43)
>       at 
> org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
>       at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1345)
>       at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1231)
>       at org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1154)
>       at 
> org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2726)
>       at 
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2543)
>       at 
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2297)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1311)
> 2007-10-23 19:24:06,102 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 done copying 
> task_200710231912_0001_m_001428_0 output .
> 2007-10-23 19:24:06,185 INFO org.apache.hadoop.fs.FileSystem: Initialized 
> InMemoryFileSystem: 
> ramfs://mapoutput31952838/task_200710231912_0001_r_000020_2/map_1423.out-0 of 
> size (in bytes): 209715200
> 2007-10-23 19:24:06,193 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,193 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001215_0 
> output from xxx
> 2007-10-23 19:24:06,188 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001211_0 
> output from xxx
> 2007-10-23 19:24:06,185 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryOutputStream.close(InMemoryFileSystem.java:161)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
>       at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:312)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
>       at 
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:253)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:713)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,199 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001247_0 
> output from .
> 2007-10-23 19:24:06,200 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,204 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001422_0 
> output from .
> 2007-10-23 19:24:06,207 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,209 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001278_0 
> output from .
> 2007-10-23 19:24:06,198 WARN org.apache.hadoop.mapred.TaskTracker: Error 
> running child
> java.io.IOException: task_200710231912_0001_r_000020_2The reduce copier failed
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
>       at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
> 2007-10-23 19:24:06,198 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,231 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001531_0 
> output from .
> 2007-10-23 19:24:06,197 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,237 INFO org.apache.hadoop.mapred.ReduceTask: 
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001227_0 
> output from .
> 2007-10-23 19:24:06,196 ERROR org.apache.hadoop.mapred.ReduceTask: Map output 
> copy failure: java.lang.NullPointerException
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.(InMemoryFileSystem.java:378)
>       at 
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
>       at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>       at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
>       at 
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to