[ https://issues.apache.org/jira/browse/HADOOP-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-3737:
----------------------------------
Fix Version/s: 0.18.0
Assignee: Grant Glouser
Hadoop Flags: [Reviewed]
+1
This looks like a reasonable fix to CompressedWritable, given its current
semantics. SequenceFile pools the compressors/decompressors used by its readers
and writers, so the problem shouldn't scale as badly there.
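To illustrate why pooling limits the damage, here is a minimal sketch of the general idea using only java.util.zip. It is not Hadoop's pooling code: the class DeflaterPoolSketch and its methods borrow, giveBack, shutdown, and idleCount are hypothetical names chosen for this example. The point is that a pooled Deflater is reset and reused, so the number of native zlib allocations follows the number of concurrent users rather than the number of records written.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.zip.Deflater;

// Hypothetical illustration only; not the pooling code used by SequenceFile.
class DeflaterPoolSketch {
    // Deflaters that have been returned and are ready for reuse.
    private final Deque<Deflater> idle = new ArrayDeque<>();

    // Hand out an idle Deflater if one exists, otherwise allocate a new one.
    synchronized Deflater borrow() {
        Deflater d = idle.pollFirst();
        return (d != null) ? d : new Deflater(Deflater.BEST_SPEED);
    }

    // reset() discards pending input/output state but keeps the native
    // buffers allocated, so the next borrower does not allocate again.
    synchronized void giveBack(Deflater d) {
        d.reset();
        idle.addFirst(d);
    }

    // Release the native memory of every idle Deflater.
    synchronized void shutdown() {
        Deflater d;
        while ((d = idle.pollFirst()) != null) {
            d.end();
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

A pool like this still has to call end() eventually (here in shutdown()), otherwise the idle instances hold their native memory until they are finalized.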
> CompressedWritable throws OutOfMemoryError
> ------------------------------------------
>
> Key: HADOOP-3737
> URL: https://issues.apache.org/jira/browse/HADOOP-3737
> Project: Hadoop Core
> Issue Type: Bug
> Components: io
> Affects Versions: 0.17.0
> Reporter: Grant Glouser
> Assignee: Grant Glouser
> Fix For: 0.18.0
>
> Attachments: HADOOP-3737.patch
>
>
> We were seeing OutOfMemoryErrors with stack traces like the following (Hadoop
> 0.17.0):
> {noformat}
> java.lang.OutOfMemoryError
> at java.util.zip.Deflater.init(Native Method)
> at java.util.zip.Deflater.<init>(Deflater.java:123)
> at java.util.zip.Deflater.<init>(Deflater.java:132)
> at org.apache.hadoop.io.CompressedWritable.write(CompressedWritable.java:71)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
> at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
> at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1016)
> [...]
> {noformat}
> A Google search found the following long-standing issue in Java in which use
> of java.util.zip.Deflater causes an OutOfMemoryError:
> [http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4797189]
> CompressedWritable instantiates a Deflater, but does not call
> {{deflater.end()}}. It should do that in order to release the Deflater's
> resources immediately, instead of waiting for the object to be finalized.
> We applied this change locally, and our app's memory usage became much more
> stable.
> This may also affect the SequenceFile compression types, because
> org.apache.hadoop.io.compress.zlib.BuiltInZlib{Deflater,Inflater} extend
> java.util.zip.{Deflater,Inflater}. org.apache.hadoop.io.compress.Compressor
> defines an end() method, but I do not see this method being called anywhere.
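The fix described in the report can be sketched as follows. This is an illustration of the general pattern, not the contents of HADOOP-3737.patch: the class DeflaterEndSketch and its compress and decompress helpers are hypothetical names, and only java.util.zip classes are used. Note that DeflaterOutputStream does not call end() on a Deflater that was passed in by the caller, so the caller has to release it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.Inflater;

// Hypothetical illustration only; not the actual CompressedWritable code.
class DeflaterEndSketch {

    static byte[] compress(byte[] input) throws IOException {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            DeflaterOutputStream dout = new DeflaterOutputStream(out, deflater);
            dout.write(input);
            dout.finish(); // flush all remaining compressed bytes into 'out'
            return out.toByteArray();
        } finally {
            // Free the native zlib memory now instead of waiting for finalization.
            deflater.end();
        }
    }

    static byte[] decompress(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and more input (or a dictionary) required:
                // the data is truncated or not something this sketch handles.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported input");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            // The same rule applies on the read side.
            inflater.end();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "aaaaaaaaaabbbbbbbbbb".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(data));
        System.out.println(Arrays.equals(data, restored));
    }
}
```

The try/finally form ensures end() runs even when the write throws, which matters in a long-running process that serializes many records.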