[
https://issues.apache.org/jira/browse/HADOOP-10591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996789#comment-13996789
]
Colin Patrick McCabe commented on HADOOP-10591:
-----------------------------------------------
We have two ways we could go on this one: implement a buffer-pooling scheme,
or manually free the direct buffers.
The buffer-pooling scheme might initially seem more attractive, but it's
problematic. We don't know that all the buffers we create will be the same
size, so we end up with the same kinds of sizing and fragmentation problems
you get when implementing {{malloc}}. It's also unclear how long we should
hang on to buffers that aren't in use.
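To make the {{malloc}} comparison concrete, here is a rough sketch of a
size-keyed pool (the class and its names are hypothetical, not a proposed
patch): one free list per buffer capacity avoids best-fit logic, and weak
references let the GC decide how long idle buffers survive rather than us
picking an eviction policy. Hadoop's own {{DirectBufferPool}} utility takes
a similar approach.
{code:java}
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: one free list per buffer size sidesteps
// malloc-style fit problems, and WeakReferences let the GC reclaim
// idle buffers instead of the pool choosing a retention period.
public class SimpleDirectBufferPool {
  private final ConcurrentMap<Integer, Queue<WeakReference<ByteBuffer>>>
      buffersBySize =
      new ConcurrentHashMap<Integer, Queue<WeakReference<ByteBuffer>>>();

  public ByteBuffer getBuffer(int size) {
    Queue<WeakReference<ByteBuffer>> list = buffersBySize.get(size);
    if (list != null) {
      WeakReference<ByteBuffer> ref;
      while ((ref = list.poll()) != null) {
        ByteBuffer buf = ref.get();
        if (buf != null) {
          return buf;                       // reuse an idle buffer
        }                                   // else the GC already took it
      }
    }
    return ByteBuffer.allocateDirect(size); // pool miss: allocate fresh
  }

  public void returnBuffer(ByteBuffer buf) {
    buf.clear();
    Queue<WeakReference<ByteBuffer>> list =
        buffersBySize.get(buf.capacity());
    if (list == null) {
      list = new ConcurrentLinkedQueue<WeakReference<ByteBuffer>>();
      Queue<WeakReference<ByteBuffer>> prev =
          buffersBySize.putIfAbsent(buf.capacity(), list);
      if (prev != null) {
        list = prev;
      }
    }
    list.add(new WeakReference<ByteBuffer>(buf));
  }
}
{code}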
Manually freeing the buffers is possible through a Sun-specific API. We
already do this in a few other cases, for example to {{munmap}} a memory
segment. This is probably the simpler route to take.
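For reference, a minimal sketch of that manual-free path, assuming a
pre-Java-9 Oracle/OpenJDK runtime where the internal
{{sun.nio.ch.DirectBuffer}} and {{sun.misc.Cleaner}} classes are available:
{code:java}
import java.nio.ByteBuffer;

import sun.misc.Cleaner;
import sun.nio.ch.DirectBuffer;

// Sketch only: DirectBuffer and Cleaner are Sun-internal classes, so
// this compiles against Oracle/OpenJDK but is not portable Java.
public class DirectBufferDeallocator {
  public static void free(ByteBuffer buffer) {
    if (buffer == null || !buffer.isDirect()) {
      return;                    // heap buffers are the GC's problem
    }
    Cleaner cleaner = ((DirectBuffer) buffer).cleaner();
    if (cleaner != null) {       // slices/duplicates have no cleaner
      cleaner.clean();           // releases the native memory now,
    }                            // instead of after a full GC
  }
}
{code}
A codec's {{close()}} could call this on its direct buffers, releasing the
native memory immediately instead of waiting for finalization.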
> Compression codecs must use pooled direct buffers or deallocate direct
> buffers when stream is closed
> -----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10591
> URL: https://issues.apache.org/jira/browse/HADOOP-10591
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Hari Shreedharan
>
> Currently, direct buffers allocated by compression codecs like Gzip (which
> allocates 2 direct buffers per instance) are not deallocated when the stream
> is closed. For long-running processes that create a huge number of files,
> these direct buffers are left hanging until a full GC, which may or may not
> happen in a reasonable amount of time, especially if the process does not
> use much heap.
> Either these buffers should be pooled or they should be deallocated when the
> stream is closed.
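On the "pooled" half of that suggestion: callers can already cap this kind
of churn by recycling compressor instances through Hadoop's {{CodecPool}}.
A rough sketch of that caller-side pattern (the paths and payload below are
made up for illustration):
{code:java}
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Compressor;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

// Sketch: reuse pooled Compressor instances (and their direct
// buffers) across many streams instead of allocating a pair per file.
public class PooledCompressorExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    CompressionCodec codec =
        ReflectionUtils.newInstance(GzipCodec.class, conf);
    byte[] data = new byte[4096];             // placeholder payload
    for (int i = 0; i < 1000; i++) {
      Compressor compressor = CodecPool.getCompressor(codec);
      try {
        OutputStream out = codec.createOutputStream(
            new FileOutputStream("/tmp/part-" + i + ".gz"), compressor);
        out.write(data);
        out.close();
      } finally {
        CodecPool.returnCompressor(compressor); // recycle the buffers
      }
    }
  }
}
{code}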
--
This message was sent by Atlassian JIRA
(v6.2#6252)