[
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998319#comment-13998319
]
Gopal V commented on HADOOP-10591:
----------------------------------
[~andrew.wang]: HADOOP-10047 was a change that removed the need for
decompressors implementing the DirectDecompressor interface to allocate
their own direct buffers.
DirectDecompressor::decompress(ByteBuffer src, ByteBuffer dst)
was meant to avoid allocations under the decompressor object's control:
it decompresses from src into dst without an intermediate allocation or copy.
Before that, ORC could not own the buffer pools for src/dst.
The issue in this bug pre-dates HADOOP-10047.
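For illustration, a minimal sketch of how a caller drives that API with its own
direct buffers (this assumes the ZlibDirectDecompressor added by HADOOP-10047
and native zlib support being loaded; the buffer sizes are made up):
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

import org.apache.hadoop.io.compress.DirectDecompressor;
import org.apache.hadoop.io.compress.zlib.ZlibDecompressor;

public class DirectDecompressSketch {
  public static void main(String[] args) throws IOException {
    // The caller owns (and can pool/reuse) both direct buffers; the
    // decompressor does not allocate intermediate copies of its own.
    ByteBuffer src = ByteBuffer.allocateDirect(64 * 1024);   // compressed input
    ByteBuffer dst = ByteBuffer.allocateDirect(256 * 1024);  // uncompressed output

    // ... fill src with zlib-compressed bytes, then flip it for reading ...
    src.flip();

    // One concrete DirectDecompressor implementation from HADOOP-10047.
    DirectDecompressor d = new ZlibDecompressor.ZlibDirectDecompressor();

    // Decompresses straight from src into dst, no intermediate allocation or copy.
    d.decompress(src, dst);
    dst.flip();
    // dst now holds the decompressed bytes; both buffers can go back to a pool.
  }
}
{code}
This is what lets a caller such as ORC keep ownership of the src/dst buffer
pools once it targets the interface.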
> Compression codecs must use pooled direct buffers or deallocate direct
> buffers when stream is closed
> -----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10591
> URL: https://issues.apache.org/jira/browse/HADOOP-10591
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Hari Shreedharan
> Assignee: Colin Patrick McCabe
>
> Currently, direct buffers allocated by compression codecs like Gzip (which
> allocates 2 direct buffers per instance) are not deallocated when the stream
> is closed. For long-running processes that create a huge number of files,
> these direct buffers are left hanging until a full GC, which may or may not
> happen in a reasonable amount of time - especially if the process does not
> use much heap.
> Either these buffers should be pooled or they should be deallocated when the
> stream is closed.
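For the "pooled" option, a minimal sketch of what caller-side pooling already
looks like with Hadoop's CodecPool (the surrounding read loop is illustrative
only, not a proposed patch):
{code:java}
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class PooledGzipReadSketch {
  // Reads one gzip stream with a pooled Decompressor, so the pair of direct
  // buffers inside it is reused across streams instead of being re-allocated
  // per file and lingering until a full GC.
  static void readAll(InputStream rawIn, Configuration conf) throws IOException {
    CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
    Decompressor decompressor = CodecPool.getDecompressor(codec);
    try (InputStream in = codec.createInputStream(rawIn, decompressor)) {
      byte[] buf = new byte[8192];
      while (in.read(buf) != -1) {
        // consume decompressed bytes
      }
    } finally {
      // Hand the decompressor (and its direct buffers) back to the pool.
      CodecPool.returnDecompressor(decompressor);
    }
  }
}
{code}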