[
https://issues.apache.org/jira/browse/HDFS-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17331878#comment-17331878
]
Qi Zhu commented on HDFS-13904:
-------------------------------
Is this still being worked on?
I have run into this problem as well.
Thanks.
> ContentSummary does not always respect processing limit, resulting in long
> lock acquisitions
> --------------------------------------------------------------------------------------------
>
> Key: HDFS-13904
> URL: https://issues.apache.org/jira/browse/HDFS-13904
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Major
>
> HDFS-4995 added a config {{dfs.content-summary.limit}} which allows for an
> administrator to set a limit on the number of entries processed during a
> single acquisition of the {{FSNamesystemLock}} during the creation of a
> content summary. This is useful to prevent very long (multiple seconds)
> pauses on the NameNode when {{getContentSummary}} is called on large
> directories.
> However, even on versions with HDFS-4995, we have seen warnings like:
> {code}
> INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem read
> lock held for 9398 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:950)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.readUnlock(FSNamesystemLock.java:188)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.readUnlock(FSNamesystem.java:1486)
> org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.yield(ContentSummaryComputationContext.java:109)
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeDirectoryContentSummary(INodeDirectory.java:679)
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeContentSummary(INodeDirectory.java:642)
> org.apache.hadoop.hdfs.server.namenode.INodeDirectory.computeDirectoryContentSummary(INodeDirectory.java:656)
> {code}
> happen quite consistently when {{getContentSummary}} was called on a large
> directory on a heavily-loaded NameNode. Such long pauses completely destroy
> the performance of the NameNode. We have the limit set to its default of
> 5000; if it were respected, there would clearly not be a pause of nearly
> 10 seconds.
> The current {{yield()}} code within {{ContentSummaryComputationContext}}
> looks like:
> {code}
> public boolean yield() {
>   // Are we set up to do this?
>   if (limitPerRun <= 0 || dir == null || fsn == null) {
>     return false;
>   }
>   // Have we reached the limit?
>   long currentCount = counts.getFileCount() +
>       counts.getSymlinkCount() +
>       counts.getDirectoryCount() +
>       counts.getSnapshotableDirectoryCount();
>   if (currentCount <= nextCountLimit) {
>     return false;
>   }
>   // Update the next limit
>   nextCountLimit = currentCount + limitPerRun;
>   boolean hadDirReadLock = dir.hasReadLock();
>   boolean hadDirWriteLock = dir.hasWriteLock();
>   boolean hadFsnReadLock = fsn.hasReadLock();
>   boolean hadFsnWriteLock = fsn.hasWriteLock();
>   // sanity check.
>   if (!hadDirReadLock || !hadFsnReadLock || hadDirWriteLock ||
>       hadFsnWriteLock || dir.getReadHoldCount() != 1 ||
>       fsn.getReadHoldCount() != 1) {
>     // cannot relinquish
>     return false;
>   }
>   // unlock
>   dir.readUnlock();
>   fsn.readUnlock("contentSummary");
>   try {
>     Thread.sleep(sleepMilliSec, sleepNanoSec);
>   } catch (InterruptedException ie) {
>   } finally {
>     // reacquire
>     fsn.readLock();
>     dir.readLock();
>   }
>   yieldCount++;
>   return true;
> }
> {code}
> We believe that this check in particular is the culprit:
> {code}
> if (!hadDirReadLock || !hadFsnReadLock || hadDirWriteLock ||
>     hadFsnWriteLock || dir.getReadHoldCount() != 1 ||
>     fsn.getReadHoldCount() != 1) {
>   // cannot relinquish
>   return false;
> }
> {code}
> The content summary computation will only relinquish the lock if it is
> currently the _only_ holder of the lock. Given the high volume of read
> requests on a heavily loaded NameNode, especially when unfair locking is
> enabled, another thread is likely to be holding the read lock for some
> short-lived operation at any given moment. By refusing to give up the lock
> in this case, the content summary computation ends up never relinquishing
> the lock.
> We propose to simply remove the readHoldCount checks from this {{yield()}}.
> This should alleviate the case described above by giving up the read lock and
> allowing other short-lived operations to complete (while the content summary
> thread sleeps) so that the lock can finally be given up completely. This has
> the drawback that sometimes, the content summary may give up the lock
> unnecessarily, if the read lock is never actually released by the time the
> thread continues again. The only negative impact from this is to make some
> large content summary operations slightly slower, with the tradeoff of
> reducing NameNode-wide performance impact.
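The proposal in the description above can be sketched roughly as follows. This is not the actual patch and does not use the real HDFS classes: YieldSketch, maybeYield, and the single ReentrantReadWriteLock standing in for both the directory lock and the FSNamesystemLock are hypothetical names chosen for illustration, and the sleep duration is an arbitrary placeholder. The sketch keeps the "holds a read lock, holds no write lock" sanity check and drops only the hold-count comparison.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for ContentSummaryComputationContext; one lock
// replaces the dir and fsn locks of the real code.
class YieldSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final long limitPerRun;   // analogous to dfs.content-summary.limit
    long nextCountLimit;
    long yieldCount = 0;

    YieldSketch(long limitPerRun) {
        this.limitPerRun = limitPerRun;
        this.nextCountLimit = limitPerRun;
    }

    // Returns true if the read lock was released and reacquired.
    boolean maybeYield(long currentCount) {
        // Yielding disabled, or limit not yet exceeded.
        if (limitPerRun <= 0 || currentCount <= nextCountLimit) {
            return false;
        }
        // Update the next limit before the sanity check, as in the original.
        nextCountLimit = currentCount + limitPerRun;

        // Sanity check WITHOUT the "hold count must equal 1" condition:
        // only require that this thread holds the read lock and not the
        // write lock.
        if (lock.getReadHoldCount() == 0 || lock.isWriteLockedByCurrentThread()) {
            return false; // cannot relinquish
        }

        lock.readLock().unlock();
        try {
            // Placeholder pause so that waiting threads can make progress.
            Thread.sleep(0, 500);
        } catch (InterruptedException ie) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        } finally {
            lock.readLock().lock(); // reacquire
        }
        yieldCount++;
        return true;
    }

    public static void main(String[] args) {
        YieldSketch ctx = new YieldSketch(5000);
        ctx.lock.readLock().lock();
        try {
            // Simulate visiting 20000 entries under the read lock.
            for (long count = 1; count <= 20000; count++) {
                ctx.maybeYield(count);
            }
        } finally {
            ctx.lock.readLock().unlock();
        }
        System.out.println("yields: " + ctx.yieldCount);
    }
}
```

Under these assumptions, the method releases one hold of the read lock each time the processed count passes the next limit. As the description notes, this may occasionally be an unnecessary release if the lock is not actually freed before the thread resumes.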