[ https://issues.apache.org/jira/browse/HADOOP-14376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996096#comment-15996096 ]

Eli Acherkan commented on HADOOP-14376:
---------------------------------------

Thanks [~jlowe]! Absolutely, I'll prepare a patch. I wasn't sure how to write a 
unit test that checks off-heap memory for a leak, but using 
{{CodecPool.getLeasedDecompressorsCount}} is much simpler.
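
For the record, a rough sketch of what such a test could look like (assuming 
{{CodecPool.getLeasedDecompressorsCount}} takes the codec as its argument and 
that {{BZip2Codec.createInputStream}} leases its decompressor from the pool; 
the class and method names here are made up, not from any actual patch):

{code:java}
import static org.junit.Assert.assertEquals;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.BZip2Codec;
import org.apache.hadoop.io.compress.CodecPool;
import org.junit.Test;

public class TestDecompressorStreamLeak {

  @Test
  public void testDecompressorReturnedOnClose() throws Exception {
    BZip2Codec codec = new BZip2Codec();
    codec.setConf(new Configuration());

    // Build a small bzip2 payload in memory to read back.
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    try (OutputStream out = codec.createOutputStream(compressed)) {
      out.write(new byte[4096]);
    }

    // Each createInputStream leases a decompressor from CodecPool;
    // closing the stream is supposed to return it.
    for (int i = 0; i < 100; i++) {
      InputStream in = codec.createInputStream(
          new ByteArrayInputStream(compressed.toByteArray()));
      IOUtils.copyBytes(in, new ByteArrayOutputStream(), 4096);
      in.close();
    }

    // With the close() bug, decompressors are never returned and this
    // count grows with every iteration of the loop above.
    assertEquals(0, CodecPool.getLeasedDecompressorsCount(codec));
  }
}
{code}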

> Memory leak when reading a compressed file using the native library
> -------------------------------------------------------------------
>
>                 Key: HADOOP-14376
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14376
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: common, io
>    Affects Versions: 2.7.0
>            Reporter: Eli Acherkan
>         Attachments: Bzip2MemoryTester.java, log4j.properties
>
>
> Opening and closing a large number of bzip2-compressed input streams causes 
> the process to be killed when it runs out of memory while using the native 
> bzip2 library.
> Our initial analysis suggests that the cause is {{DecompressorStream}} 
> overriding the {{close()}} method, thereby skipping the call in its parent 
> class: {{CodecPool.returnDecompressor(trackedDecompressor)}}. When the 
> decompressor object is a {{Bzip2Decompressor}}, its native {{end()}} method 
> is never called, and the allocated memory is never freed.
> If this analysis is correct, the simplest way to fix this bug would be to 
> replace {{in.close()}} with {{super.close()}} in {{DecompressorStream}}.
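
To make the analysis quoted above concrete, here is a simplified sketch of the 
suspected mechanism (stripped-down skeletons reconstructed from the 
description; the real classes live in {{org.apache.hadoop.io.compress}} and 
this is not the final patch):

{code:java}
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.Decompressor;

// Parent class: close() returns the pooled decompressor so it can be
// reused or cleaned up, instead of silently leaking native memory.
abstract class CompressionInputStream extends InputStream {
  protected InputStream in;
  Decompressor trackedDecompressor;

  @Override
  public void close() throws IOException {
    in.close();
    if (trackedDecompressor != null) {
      CodecPool.returnDecompressor(trackedDecompressor);
      trackedDecompressor = null;
    }
  }
}

// Child class as of 2.7.0: the override goes straight to the underlying
// stream, so trackedDecompressor is never returned to the pool and the
// native memory held by Bzip2Decompressor is never freed.
class DecompressorStream extends CompressionInputStream {
  protected boolean closed;

  @Override
  public void close() throws IOException {
    if (!closed) {
      in.close();        // leak: bypasses the parent's cleanup
      // super.close();  // proposed fix: delegate to the parent, which
      //                 // returns the decompressor to CodecPool
      closed = true;
    }
  }

  @Override
  public int read() { return -1; } // stub to keep the sketch compilable
}
{code}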


