Eli Acherkan created HADOOP-14376:
-------------------------------------
Summary: Memory leak when reading a bzip2-compressed file using
the native library
Key: HADOOP-14376
URL: https://issues.apache.org/jira/browse/HADOOP-14376
Project: Hadoop Common
Issue Type: Bug
Components: common, io
Affects Versions: 2.7.0
Reporter: Eli Acherkan
When the native bzip2 library is in use, opening and closing a large number of
bzip2-compressed input streams causes the process to be killed for running out
of memory.
Our initial analysis suggests that this may be caused by {{DecompressorStream}}
overriding the {{close()}} method without calling the parent implementation,
and therefore skipping this line from its parent:
{{CodecPool.returnDecompressor(trackedDecompressor)}}. When the decompressor
object is a {{Bzip2Decompressor}}, its native {{end()}} method is then never
called, and the memory it allocated is never freed.
If this analysis is correct, the simplest way to fix this bug would be to
replace {{in.close()}} with {{super.close()}} in {{DecompressorStream}}.
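
The suspected pattern can be illustrated with a minimal, self-contained sketch.
It does not use the Hadoop classes: {{TrackedResource}}, {{ParentStream}},
{{LeakyStream}} and {{FixedStream}} are hypothetical stand-ins for the
decompressor, the parent stream, and {{DecompressorStream}} before and after the
proposed change. It assumes only what the analysis above describes, namely that
the parent's {{close()}} both closes the underlying stream and releases the
tracked resource.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseLeakSketch {

    /** Hypothetical stand-in for a decompressor that holds native memory. */
    static class TrackedResource {
        boolean released = false;

        void end() {
            released = true; // in the real case this would free native memory
        }
    }

    /** Hypothetical parent: close() closes the input AND releases the resource. */
    static class ParentStream extends InputStream {
        protected final InputStream in;
        protected TrackedResource tracked;

        ParentStream(InputStream in, TrackedResource tracked) {
            this.in = in;
            this.tracked = tracked;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public void close() throws IOException {
            in.close();
            if (tracked != null) {
                tracked.end();
                tracked = null;
            }
        }
    }

    /** Overrides close() and closes only the wrapped stream: the resource leaks. */
    static class LeakyStream extends ParentStream {
        private boolean closed = false;

        LeakyStream(InputStream in, TrackedResource tracked) {
            super(in, tracked);
        }

        @Override
        public void close() throws IOException {
            if (!closed) {
                in.close(); // parent's release step is skipped
                closed = true;
            }
        }
    }

    /** Same override, but delegating to the parent so the resource is released. */
    static class FixedStream extends ParentStream {
        private boolean closed = false;

        FixedStream(InputStream in, TrackedResource tracked) {
            super(in, tracked);
        }

        @Override
        public void close() throws IOException {
            if (!closed) {
                super.close(); // closes the input and releases the resource
                closed = true;
            }
        }
    }

    /** Closes one stream of the chosen kind and reports whether the resource was released. */
    static boolean releasedAfterClose(boolean fixed) throws IOException {
        TrackedResource resource = new TrackedResource();
        InputStream raw = new ByteArrayInputStream(new byte[] {1, 2, 3});
        ParentStream stream = fixed
                ? new FixedStream(raw, resource)
                : new LeakyStream(raw, resource);
        stream.close();
        return resource.released;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("leaky released: " + releasedAfterClose(false)); // false
        System.out.println("fixed released: " + releasedAfterClose(true));  // true
    }
}
```

In this sketch the leaky variant never reaches the release step, so each
open/close cycle would leave one resource allocated, which is consistent with
memory growing as more streams are opened and closed.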
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)