[ https://issues.apache.org/jira/browse/HADOOP-2095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592530#action_12592530 ]
Arun C Murthy commented on HADOOP-2095:
---------------------------------------
+1 for Eric's proposal.
A minor refinement to discuss:
3) If RAM is not full
c) initiate the merge with the in-memory compressed files that fit the quota;
continue downloading the compressed splits and store each compressed split as-is
in the InMemoryFileSystem (in RAM).
The next pass of the merge will pick them up and run with them. This has the
advantage of saving RAM and keeping the merge code simple: each split is either
compressed (compression enabled) or not (compression disabled).
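For concreteness, a minimal standalone sketch of that policy. All class and field names, the 75% trigger, and the quota accounting below are illustrative assumptions, not the actual ReduceTask/InMemoryFileSystem code:
{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/**
 * Sketch only: keep fetched map outputs compressed in the ramfs, and when an
 * in-memory merge is triggered, merge only the compressed segments that fit
 * the quota. Segments that do not fit (or that arrive while a merge runs)
 * stay in the ramfs as-is and are picked up by the next merge pass.
 */
public class CompressedInMemMergeSketch {

  /** A fetched map output held compressed in RAM. */
  static final class CompressedSegment {
    final String mapTaskId;
    final long compressedBytes;
    CompressedSegment(String mapTaskId, long compressedBytes) {
      this.mapTaskId = mapTaskId;
      this.compressedBytes = compressedBytes;
    }
  }

  private final long ramQuotaBytes;                       // ramfs quota (illustrative)
  private final Deque<CompressedSegment> ramfs = new ArrayDeque<>();
  private long ramUsedBytes = 0;

  CompressedInMemMergeSketch(long ramQuotaBytes) {
    this.ramQuotaBytes = ramQuotaBytes;
  }

  /** Copier side: store the compressed split as-is in the ramfs. */
  synchronized void onMapOutputFetched(CompressedSegment seg) {
    ramfs.addLast(seg);
    ramUsedBytes += seg.compressedBytes;
    if (ramUsedBytes > ramQuotaBytes * 0.75) {   // illustrative trigger threshold
      mergeWhatFits();
    }
  }

  /**
   * Merge side: take only the compressed segments that fit the quota; anything
   * left behind is handled by the next pass.
   */
  synchronized void mergeWhatFits() {
    List<CompressedSegment> thisPass = new ArrayList<>();
    long budget = ramQuotaBytes;
    while (!ramfs.isEmpty() && ramfs.peekFirst().compressedBytes <= budget) {
      CompressedSegment seg = ramfs.removeFirst();
      budget -= seg.compressedBytes;
      ramUsedBytes -= seg.compressedBytes;
      thisPass.add(seg);
    }
    // In the real merge each segment would be decompressed only while it is
    // streamed; here we just report what this pass would do.
    System.out.println("merging " + thisPass.size() + " compressed segments; "
        + ramfs.size() + " left for the next pass");
  }

  public static void main(String[] args) {
    // 200 MB ramfs, ten ~40 MB compressed splits arriving one by one.
    CompressedInMemMergeSketch sketch = new CompressedInMemMergeSketch(200L << 20);
    for (int i = 0; i < 10; i++) {
      sketch.onMapOutputFetched(new CompressedSegment("map_" + i, 40L << 20));
    }
  }
}
{code}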
Thoughts?
> Reducer failed due to Out of Memory
> -----------------------------------
>
> Key: HADOOP-2095
> URL: https://issues.apache.org/jira/browse/HADOOP-2095
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Runping Qi
> Assignee: Arun C Murthy
> Attachments: HADOOP-2095_CompressedBytesWithCodecPool.patch,
> HADOOP-2095_debug.patch
>
>
> One of the reducers of my job failed with the following exceptions.
> The failure caused the whole job fail eventually.
> Java heap size was 768MB and io.sort.mb was 140.
> 2007-10-23 19:24:06,100 WARN org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Intermediate Merge of the inmemory files
> threw an exception: java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.hadoop.io.compress.DecompressorStream.<init>(DecompressorStream.java:43)
> at
> org.apache.hadoop.io.compress.DefaultCodec.createInputStream(DefaultCodec.java:71)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1345)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1231)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1154)
> at
> org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2726)
> at
> org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2543)
> at
> org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2297)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$InMemFSMergeThread.run(ReduceTask.java:1311)
> 2007-10-23 19:24:06,102 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 done copying
> task_200710231912_0001_m_001428_0 output .
> 2007-10-23 19:24:06,185 INFO org.apache.hadoop.fs.FileSystem: Initialized
> InMemoryFileSystem:
> ramfs://mapoutput31952838/task_200710231912_0001_r_000020_2/map_1423.out-0 of
> size (in bytes): 209715200
> 2007-10-23 19:24:06,193 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,193 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001215_0
> output from xxx
> 2007-10-23 19:24:06,188 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001211_0
> output from xxx
> 2007-10-23 19:24:06,185 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryOutputStream.close(InMemoryFileSystem.java:161)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
> at
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
> at
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.close(ChecksumFileSystem.java:312)
> at
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:49)
> at
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:64)
> at
> org.apache.hadoop.mapred.MapOutputLocation.getFile(MapOutputLocation.java:253)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:713)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,199 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001247_0
> output from .
> 2007-10-23 19:24:06,200 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,204 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001422_0
> output from .
> 2007-10-23 19:24:06,207 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,209 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001278_0
> output from .
> 2007-10-23 19:24:06,198 WARN org.apache.hadoop.mapred.TaskTracker: Error
> running child
> java.io.IOException: task_200710231912_0001_r_000020_2 The reduce copier failed
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:253)
> at
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
> 2007-10-23 19:24:06,198 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,231 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001531_0
> output from .
> 2007-10-23 19:24:06,197 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)
> 2007-10-23 19:24:06,237 INFO org.apache.hadoop.mapred.ReduceTask:
> task_200710231912_0001_r_000020_2 Copying task_200710231912_0001_m_001227_0
> output from .
> 2007-10-23 19:24:06,196 ERROR org.apache.hadoop.mapred.ReduceTask: Map output
> copy failure: java.lang.NullPointerException
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$FileAttributes.access$300(InMemoryFileSystem.java:366)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem$InMemoryFileStatus.<init>(InMemoryFileSystem.java:378)
> at
> org.apache.hadoop.fs.InMemoryFileSystem$RawInMemoryFileSystem.getFileStatus(InMemoryFileSystem.java:283)
> at
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
> at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:449)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:738)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:665)