[ https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998537#comment-13998537 ]

Colin Patrick McCabe commented on HADOOP-10591:
-----------------------------------------------

Thanks, Gopal.  I agree that this is a pre-existing issue, definitely not 
introduced by HADOOP-10047.  And, in fact, that JIRA should improve the 
situation in many cases by eliminating the need for the {{Decompressor}} to 
allocate its own direct buffer.
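
For reference, the {{DirectDecompressor}} interface that change introduced 
lets the caller supply both buffers; a minimal sketch (the buffer sizes here 
are arbitrary):

{code}
import java.io.IOException;
import java.nio.ByteBuffer;

import org.apache.hadoop.io.compress.DirectDecompressor;
import org.apache.hadoop.io.compress.zlib.ZlibDecompressor.ZlibDirectDecompressor;

public class DirectDecompressExample {
  public static void main(String[] args) throws IOException {
    // The caller owns both direct buffers, so it can pool and reuse
    // them instead of each Decompressor allocating its own.
    ByteBuffer compressed = ByteBuffer.allocateDirect(64 * 1024);
    ByteBuffer uncompressed = ByteBuffer.allocateDirect(64 * 1024);

    // ... fill 'compressed' with raw zlib data, then compressed.flip() ...

    DirectDecompressor decompressor = new ZlibDirectDecompressor();
    decompressor.decompress(compressed, uncompressed);
    // 'uncompressed' now holds the decompressed bytes.
  }
}
{code}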

Semi-related: one thing I notice in the constructor for 
{{ZlibDirectDecompressor}} is that it invokes the superclass constructor 
({{ZlibDecompressor}}) with {{directBufferSize = 0}}, causing us to call 
{{allocateDirect}} with a size of 0.  I wonder what this actually does... I 
didn't manage to find any documentation for this case (maybe I missed it?).
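
For what it's worth, the {{ByteBuffer.allocateDirect}} javadoc only rejects 
negative capacities, so a size of 0 should just yield a valid zero-capacity 
direct buffer.  Easy to confirm with a quick snippet:

{code}
import java.nio.ByteBuffer;

public class ZeroCapacityDirect {
  public static void main(String[] args) {
    ByteBuffer b = ByteBuffer.allocateDirect(0);
    System.out.println(b.isDirect());   // true
    System.out.println(b.capacity());   // 0
    System.out.println(b.remaining());  // 0
    // Any relative get()/put() on it would throw
    // BufferUnderflowException / BufferOverflowException.
  }
}
{code}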

> Compression codecs must use pooled direct buffers or deallocate direct 
> buffers when stream is closed
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10591
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10591
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Hari Shreedharan
>            Assignee: Colin Patrick McCabe
>
> Currently, direct buffers allocated by compression codecs like Gzip (which 
> allocates two direct buffers per instance) are not deallocated when the 
> stream is closed. For long-running processes that create a huge number of 
> files, these direct buffers are left hanging until a full GC, which may or 
> may not happen in a reasonable amount of time, especially if the process 
> does not use much of its heap.
> Either these buffers should be pooled, or they should be deallocated when 
> the stream is closed.
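
A note on the deallocate-on-close option: on the JDK 6/7/8 runtimes in play 
here, a direct buffer's native memory can only be freed early through 
non-public internals. A hedged sketch of the usual cleaner workaround (not a 
supported API, and it assumes a Sun/Oracle JDK and a buffer that came 
straight from {{allocateDirect}}):

{code}
import java.nio.ByteBuffer;

public final class DirectBufferCleaner {
  /**
   * Frees a direct buffer's native memory immediately instead of
   * waiting for a full GC.  Relies on non-public JDK internals
   * (sun.nio.ch.DirectBuffer / sun.misc.Cleaner), so callers must
   * guard it in code that may run on other JVMs.
   */
  public static void free(ByteBuffer buf) {
    if (buf != null && buf.isDirect()) {
      ((sun.nio.ch.DirectBuffer) buf).cleaner().clean();
    }
  }
}
{code}

A codec stream's {{close()}} could call something like this on its input and 
output buffers (or return them to a pool) rather than leaving them for the GC.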



