[
https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16778595#comment-16778595
]
Joe McDonnell commented on HDFS-14308:
--------------------------------------
[~knanasi] Yes, good point, DFSStripedInputStream does implement unbuffer(). I
think the issue revolves around curStripeBuf. It is allocated from the
BUFFER_POOL in resetCurStripeBuffer(true), but it doesn't get returned to the
BUFFER_POOL until close(). closeCurrentBlockReaders() calls
resetCurStripeBuffer(false), which clears curStripeBuf but does not return it
to the BUFFER_POOL.
I will update the description / title.
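
To make the lifecycle above concrete, here is a minimal sketch of the pattern. It is not the Hadoop source: SketchBufferPool, SketchStripedStream, and unbufferReturningBuffer() are hypothetical names chosen for illustration, and only resetCurStripeBuffer(), closeCurrentBlockReaders(), and curStripeBuf mirror names from the comment. It assumes one possible fix is for unbuffer() to hand the buffer back to the pool.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Demo entry point. Every class here is a hypothetical stand-in, not Hadoop code.
class StripeBufferDemo {
    public static void main(String[] args) {
        SketchBufferPool pool = new SketchBufferPool();
        SketchStripedStream stream = new SketchStripedStream(pool, 1024);

        stream.resetCurStripeBuffer(true);   // first read takes a buffer from the pool
        stream.closeCurrentBlockReaders();   // clears the buffer but keeps holding it
        System.out.println("held after closeCurrentBlockReaders: " + pool.outstanding());

        stream.unbufferReturningBuffer();    // assumed fix: return the buffer to the pool
        System.out.println("held after unbufferReturningBuffer: " + pool.outstanding());
    }
}

// Hypothetical pool that tracks how many buffers are currently checked out.
class SketchBufferPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private int checkedOut;

    ByteBuffer getBuffer(int size) {
        checkedOut++;
        ByteBuffer cached = free.poll();
        if (cached != null && cached.capacity() >= size) {
            cached.clear();
            return cached;
        }
        return ByteBuffer.allocateDirect(size);
    }

    void putBuffer(ByteBuffer buf) {
        checkedOut--;
        free.push(buf);
    }

    int outstanding() {
        return checkedOut;
    }
}

// Hypothetical stream modelling only the buffer handling described in the comment.
class SketchStripedStream {
    private final SketchBufferPool pool;
    private final int stripeBytes;
    private ByteBuffer curStripeBuf;

    SketchStripedStream(SketchBufferPool pool, int stripeBytes) {
        this.pool = pool;
        this.stripeBytes = stripeBytes;
    }

    // Allocates on demand when asked to; otherwise only resets position and limit.
    void resetCurStripeBuffer(boolean shouldAllocateBuf) {
        if (shouldAllocateBuf && curStripeBuf == null) {
            curStripeBuf = pool.getBuffer(stripeBytes);
        }
        if (curStripeBuf != null) {
            curStripeBuf.clear();
        }
    }

    // Mirrors the described behaviour: the buffer is cleared, not returned.
    void closeCurrentBlockReaders() {
        resetCurStripeBuffer(false);
    }

    // Assumed fix, not the committed patch: also give the buffer back to the pool.
    void unbufferReturningBuffer() {
        closeCurrentBlockReaders();
        if (curStripeBuf != null) {
            pool.putBuffer(curStripeBuf);
            curStripeBuf = null;
        }
    }
}
```

In this sketch the pool still reports one buffer checked out after closeCurrentBlockReaders(), which is the retained memory a cached but idle handle would carry; the count only drops once the buffer is explicitly returned.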
> DFSStripedInputStream should implement unbuffer()
> -------------------------------------------------
>
> Key: HDFS-14308
> URL: https://issues.apache.org/jira/browse/HDFS-14308
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Joe McDonnell
> Priority: Major
> Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file
> handles by default. Recent tests on erasure coded files show that the open
> file handles can consume a large amount of memory when not in use.
> For example, here is the output from Impala's JMX endpoint when 608 file
> handles are cached:
> {noformat}
> {
> "name": "java.nio:type=BufferPool,name=direct",
> "modelerType": "sun.management.ManagementFactoryHelper$1",
> "Name": "direct",
> "TotalCapacity": 1921048960,
> "MemoryUsed": 1921048961,
> "Count": 633,
> "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This works out to roughly 3MB of direct buffer memory per
> DFSStripedInputStream.
> Attached is output from Eclipse MAT showing that the direct buffers come from
> DFSStripedInputStream objects.
> To support caching file handles on erasure coded files, DFSStripedInputStream
> should implement the unbuffer() call. See HDFS-7694. "unbuffer()" is intended
> to move an input stream to a lower memory state to support these caching use
> cases. Both Impala and HBase call unbuffer() when a file handle is being
> cached and potentially unused for significant chunks of time.
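
The per-stream figure in the description can be checked with a short calculation. This is only a sketch: the class and method names are hypothetical, and it assumes all direct buffer capacity reported by JMX is attributable to the 608 cached handles, which the buffer count of 633 suggests is only approximately true.

```java
// Hypothetical helper for checking the per-handle estimate from the JMX output.
class PerHandleEstimate {
    // Integer division: average direct-buffer bytes held per cached handle.
    static long bytesPerHandle(long totalCapacityBytes, int cachedHandles) {
        return totalCapacityBytes / cachedHandles;
    }

    public static void main(String[] args) {
        long totalCapacity = 1_921_048_960L; // "TotalCapacity" from the JMX output
        int cachedHandles = 608;             // cached file handles in the same test
        long perHandle = bytesPerHandle(totalCapacity, cachedHandles);
        // perHandle is 3159620 bytes, slightly over 3 * 1024 * 1024
        System.out.println(perHandle + " bytes per cached handle");
    }
}
```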