[ https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957528#comment-16957528 ]
lindongdong commented on HDFS-14308:
------------------------------------

Hi, [~zhaoyim],

Thanks for your work. A suggestion for the latest patch: using "super.unbuffer()" rather than "closeCurrentBlockReaders()" is better

> DFSStripedInputStream curStripeBuf is not freed by unbuffer()
> -------------------------------------------------------------
>
>                 Key: HDFS-14308
>                 URL: https://issues.apache.org/jira/browse/HDFS-14308
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.0.0
>            Reporter: Joe McDonnell
>            Assignee: Zhao Yi Ming
>            Priority: Major
>         Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file
> handles by default. Recent tests on erasure coded files show that the open
> file handles can consume a large amount of memory when not in use.
> For example, here is output from Impala's JMX endpoint when 608 file handles
> are cached
> {noformat}
> {
>     "name": "java.nio:type=BufferPool,name=direct",
>     "modelerType": "sun.management.ManagementFactoryHelper$1",
>     "Name": "direct",
>     "TotalCapacity": 1921048960,
>     "MemoryUsed": 1921048961,
>     "Count": 633,
>     "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This shows direct buffer memory usage of 3MB per DFSStripedInputStream.
> Attached is output from Eclipse MAT showing that the direct buffers come from
> DFSStripedInputStream objects. Both Impala and HBase call unbuffer() when a
> file handle is being cached and potentially unused for significant chunks of
> time, yet this shows that the memory remains in use.
> To support caching file handles on erasure coded files, DFSStripedInputStream
> should avoid holding buffers after the unbuffer() call. See HDFS-7694.
> "unbuffer()" is intended to move an input stream to a lower memory state to
> support these caching use cases.
> In particular, the curStripeBuf seems to be
> allocated from the BUFFER_POOL on a resetCurStripeBuffer(true) call. It is
> not freed until close().
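
For readers following the thread, here is a minimal Java sketch of the behaviour under discussion. It is not the HDFS-14308 patch and does not use Hadoop's classes: SimpleBufferPool, BaseStream, and StripedStream are hypothetical stand-ins invented for this illustration. It only shows the general pattern, assuming a subclass that borrows a pooled direct buffer on first read: unbuffer() first delegates to super.unbuffer() so the base class releases its own resources, then returns the stripe buffer to the pool instead of holding it until close().

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only: none of these classes are Hadoop's real implementation. */
final class UnbufferSketch {

    /** Hypothetical stand-in for a shared pool of direct buffers. */
    static final class SimpleBufferPool {
        private final Deque<ByteBuffer> free = new ArrayDeque<>();
        private int outstanding = 0;

        synchronized ByteBuffer getBuffer(int capacity) {
            ByteBuffer buf = free.poll();
            if (buf == null || buf.capacity() < capacity) {
                buf = ByteBuffer.allocateDirect(capacity);
            }
            buf.clear();
            outstanding++;
            return buf;
        }

        synchronized void putBuffer(ByteBuffer buf) {
            outstanding--;
            free.push(buf);
        }

        /** Number of buffers currently held by streams. */
        synchronized int outstanding() {
            return outstanding;
        }
    }

    /** Hypothetical base stream; unbuffer() drops its own cached state. */
    static class BaseStream {
        protected boolean readersOpen = false;

        void read() {
            readersOpen = true;
        }

        void unbuffer() {
            readersOpen = false;
        }

        void close() {
            unbuffer();
        }
    }

    /** Hypothetical striped stream that borrows one pooled buffer. */
    static final class StripedStream extends BaseStream {
        private final SimpleBufferPool pool;
        private final int stripeBytes;
        private ByteBuffer curStripeBuf;

        StripedStream(SimpleBufferPool pool, int stripeBytes) {
            this.pool = pool;
            this.stripeBytes = stripeBytes;
        }

        @Override
        void read() {
            super.read();
            if (curStripeBuf == null) {
                curStripeBuf = pool.getBuffer(stripeBytes); // lazily re-acquired
            }
        }

        @Override
        void unbuffer() {
            super.unbuffer(); // let the base class release its state first
            if (curStripeBuf != null) {
                pool.putBuffer(curStripeBuf); // return the buffer before close()
                curStripeBuf = null;
            }
        }

        boolean holdsBuffer() {
            return curStripeBuf != null;
        }
    }

    static void check(boolean cond, String msg) {
        if (!cond) {
            throw new AssertionError(msg);
        }
    }

    public static void main(String[] args) {
        SimpleBufferPool pool = new SimpleBufferPool();
        StripedStream in = new StripedStream(pool, 1024);

        in.read();
        check(pool.outstanding() == 1, "buffer should be held while reading");

        in.unbuffer(); // a cached, idle handle should hold no buffer
        check(pool.outstanding() == 0, "unbuffer() should return the buffer");

        in.read();     // the stream stays usable after unbuffer()
        check(in.holdsBuffer(), "buffer should be re-acquired on next read");

        in.close();
        check(pool.outstanding() == 0, "close() should also release the buffer");
    }
}
```

In this sketch a cached but idle handle holds no direct buffer, and the next read() borrows one again, which is the lower memory state the issue description asks unbuffer() to provide.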