[
https://issues.apache.org/jira/browse/HADOOP-14376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005110#comment-16005110
]
Eli Acherkan commented on HADOOP-14376:
---------------------------------------
I see what you mean, Jason. Thanks for your comments!
*BZip2CompressionOutputStream:*
Putting back output.close() brings us to the following:
{code:java}
@Override
public void close() throws IOException {
  try {
    super.close();
  } finally {
    output.close();
  }
}
{code}
*CompressorStream:*
I was attempting to change the current implementation as little as possible.
Switching the order of closed = true and super.close() may affect subclasses,
especially user-supplied ones (e.g. if they rely on the state of the closed
flag in their finish() method). So what would be the best course of action
here? Switch the order to simplify the method? Move the closed check logic into
the parent (which also affects subclasses)? If so, should a separate "finished"
flag be added to keep track of whether finish() was completed successfully?
Similarly, should the closed check logic of DecompressorStream be moved to
_its_ parent? Also, in DecompressorStream the closed flag is set to true only
if super.close() doesn't throw - which I also haven't changed so far.
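
To make the options concrete, here is a minimal sketch of one possible ordering. It is not the actual Hadoop code: the class name SketchCompressorStream and the "finished" field are hypothetical, and the class wraps a plain OutputStream rather than a real Compressor. It assumes we want finish() to still observe closed == false, the underlying stream to be closed even if finish() throws, and the closed flag to be set regardless of the outcome.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for a compression stream; NOT the Hadoop CompressorStream.
class SketchCompressorStream extends OutputStream {
  protected final OutputStream out;
  protected boolean closed = false;
  // Assumed separate flag: true only once finish() has completed successfully.
  protected boolean finished = false;

  SketchCompressorStream(OutputStream out) {
    this.out = out;
  }

  @Override
  public void write(int b) throws IOException {
    if (closed) {
      throw new IOException("write beyond end of stream");
    }
    out.write(b);
  }

  // Subclasses may override this and inspect the closed flag.
  public void finish() throws IOException {
    if (!finished) {
      out.flush();
      finished = true;
    }
  }

  @Override
  public void close() throws IOException {
    if (closed) {
      return; // a second close() is a no-op
    }
    try {
      finish(); // runs while closed is still false
    } finally {
      try {
        out.close(); // always release the underlying stream
      } finally {
        closed = true; // set even if finish() or out.close() threw
      }
    }
  }
}
```

With this ordering, a subclass whose finish() checks the closed flag keeps seeing the value it sees today, and a failure in finish() leaves finished == false while the stream is still marked closed. Whether marking the stream closed after a failed close is the right behaviour is exactly the open question above.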
> Memory leak when reading a compressed file using the native library
> -------------------------------------------------------------------
>
> Key: HADOOP-14376
> URL: https://issues.apache.org/jira/browse/HADOOP-14376
> Project: Hadoop Common
> Issue Type: Bug
> Components: common, io
> Affects Versions: 2.7.0
> Reporter: Eli Acherkan
> Assignee: Eli Acherkan
> Attachments: Bzip2MemoryTester.java, HADOOP-14376.001.patch,
> HADOOP-14376.002.patch, HADOOP-14376.003.patch, log4j.properties
>
>
> Opening and closing a large number of bzip2-compressed input streams causes
> the process to be killed on OutOfMemory when using the native bzip2 library.
> Our initial analysis suggests that this can be caused by
> {{DecompressorStream}} overriding the {{close()}} method, and therefore
> skipping the line from its parent:
> {{CodecPool.returnDecompressor(trackedDecompressor)}}. When the decompressor
> object is a {{Bzip2Decompressor}}, its native {{end()}} method is never
> called, and the allocated memory isn't freed.
> If this analysis is correct, the simplest way to fix this bug would be to
> replace {{in.close()}} with {{super.close()}} in {{DecompressorStream}}.