[
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998537#comment-13998537
]
Colin Patrick McCabe commented on HADOOP-10591:
-----------------------------------------------
Thanks, Gopal. I agree that this is a pre-existing issue, definitely not
introduced by HADOOP-10047. And, in fact, that JIRA should improve the
situation in many cases by eliminating the need for the {{Decompressor}} to
allocate its own direct buffer.
Semi-related: one thing that I notice in the constructor for
{{ZlibDirectDecompressor}} is that it invokes the superclass constructor
({{ZlibDecompressor}}) with {{directBufferSize = 0}}, causing us to call
{{allocateDirect}} with a size of 0. I do wonder what this actually does... I
didn't manage to find any documentation for this case (maybe I missed it?).
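For what it's worth, the {{ByteBuffer#allocateDirect}} JavaDoc only calls out
an {{IllegalArgumentException}} for a *negative* capacity, so zero appears to
be legal and simply produces an empty direct buffer. A quick standalone check
(my own sketch, not code from the codec classes) seems to confirm this:
{code:java}
import java.nio.ByteBuffer;

public class AllocateDirectZero {
  public static void main(String[] args) {
    // allocateDirect only rejects negative capacities, so a size of 0
    // yields a valid but empty direct buffer that can hold no data.
    ByteBuffer buf = ByteBuffer.allocateDirect(0);
    System.out.println("direct=" + buf.isDirect()
        + " capacity=" + buf.capacity()
        + " remaining=" + buf.remaining());
    // Prints: direct=true capacity=0 remaining=0
  }
}
{code}
So the superclass ends up holding a harmless zero-capacity buffer, but it does
make the {{directBufferSize = 0}} call look like dead weight.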
> Compression codecs must use pooled direct buffers or deallocate direct
> buffers when stream is closed
> -----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10591
> URL: https://issues.apache.org/jira/browse/HADOOP-10591
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Hari Shreedharan
> Assignee: Colin Patrick McCabe
>
> Currently, direct buffers allocated by compression codecs like Gzip (which
> allocates 2 direct buffers per instance) are not deallocated when the stream
> is closed. Direct buffers live outside the Java heap, and their native memory
> is only released once the owning {{ByteBuffer}} object is garbage collected.
> For long-running processes that create a huge number of files, these buffers
> are therefore left hanging until a full GC runs, which may or may not happen
> in a reasonable amount of time, especially if the process does not use much
> heap.
> Either these buffers should be pooled or they should be deallocated when the
> stream is closed.
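To make the pooling alternative from the description concrete, it could look
roughly like the sketch below. This is a hypothetical illustration, not code
from the Hadoop tree; the class and method names ({{DirectBufferPoolSketch}},
{{getBuffer}}, {{returnBuffer}}) are invented for the example.
{code:java}
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

/**
 * Minimal sketch of a size-keyed pool for direct buffers. Holding the
 * pooled buffers through WeakReferences lets the GC still reclaim them
 * under memory pressure instead of pinning them forever.
 */
public class DirectBufferPoolSketch {
  private final ConcurrentMap<Integer, Queue<WeakReference<ByteBuffer>>> pool =
      new ConcurrentHashMap<Integer, Queue<WeakReference<ByteBuffer>>>();

  /** Hand out a pooled buffer of the requested size, or allocate a new one. */
  public ByteBuffer getBuffer(int size) {
    Queue<WeakReference<ByteBuffer>> q = pool.get(size);
    if (q != null) {
      WeakReference<ByteBuffer> ref;
      while ((ref = q.poll()) != null) {
        ByteBuffer buf = ref.get();
        if (buf != null) {   // the GC may already have collected it
          buf.clear();
          return buf;
        }
      }
    }
    return ByteBuffer.allocateDirect(size);
  }

  /** Called from the codec stream's close(): make the buffer reusable. */
  public void returnBuffer(ByteBuffer buf) {
    Integer size = buf.capacity();
    Queue<WeakReference<ByteBuffer>> q = pool.get(size);
    if (q == null) {
      q = new ConcurrentLinkedQueue<WeakReference<ByteBuffer>>();
      Queue<WeakReference<ByteBuffer>> prev = pool.putIfAbsent(size, q);
      if (prev != null) {
        q = prev;            // another thread installed the queue first
      }
    }
    q.add(new WeakReference<ByteBuffer>(buf));
  }
}
{code}
The weak references are the key design point: a plain queue of strong
references would just trade the leak for a permanently pinned pool.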
--
This message was sent by Atlassian JIRA
(v6.2#6252)