[
https://issues.apache.org/jira/browse/HDFS-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811708#comment-13811708
]
Daryn Sharp commented on HDFS-4995:
-----------------------------------
I'd prefer to see ContentSummaryComputationContext and FSComputationContext
folded together to help prevent proliferation of this technique. While very
attractive for content summary, this approach is really going to make my
experimental fine-grained locking harder, so I'd like to see it confined to
content summary.
Minor: {{INodeDirectory.computeContentSummary}} appears to be unnecessarily
copying the last child's local name. I believe the name is considered
immutable.
I think the staleness handling could be improved and made simpler to
understand by using an incrementing yield count instead of a stale boolean.
I.e., calling yield returns the current yield count, which is unchanged if no
yield was necessary. The directory iteration can then remember the yield count
at the time it fetches its children, and refresh its position if the current
yield count differs.
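
To make the yield-count idea concrete, here is a rough, self-contained sketch.
It is not the patch's code: ComputationContext, countChildren, positionAfter,
the limit parameter and the whileUnlocked hook are all hypothetical names, and
a sorted list of strings stands in for a directory's children.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative sketch only; names are hypothetical, not from HDFS-4995. */
class YieldCountSketch {

  /** Stand-in for the computation context discussed above. */
  static class ComputationContext {
    private final ReentrantReadWriteLock lock;
    private final long limit;             // items processed between yields
    private final Runnable whileUnlocked; // hypothetical hook, may be null
    private long sinceYield = 0;
    private long yieldCount = 0;

    ComputationContext(ReentrantReadWriteLock lock, long limit,
        Runnable whileUnlocked) {
      this.lock = lock;
      this.limit = limit;
      this.whileUnlocked = whileUnlocked;
    }

    /**
     * Briefly releases the read lock once the limit is reached.
     * Returns the current yield count, unchanged if no yield was necessary.
     */
    long yield() {
      if (++sinceYield >= limit) {
        lock.readLock().unlock();
        if (whileUnlocked != null) {
          whileUnlocked.run(); // writers could modify the tree here
        }
        lock.readLock().lock();
        sinceYield = 0;
        yieldCount++;
      }
      return yieldCount;
    }

    long getYieldCount() {
      return yieldCount;
    }
  }

  /** Counts children; the caller must already hold the read lock. */
  static long countChildren(List<String> children, ComputationContext ctx) {
    long count = 0;
    long seen = ctx.getYieldCount(); // remembered when children are fetched
    List<String> snapshot = new ArrayList<>(children);
    int i = 0;
    while (i < snapshot.size()) {
      String last = snapshot.get(i++);
      count++;
      long now = ctx.yield();
      if (now != seen) {
        // The lock was released: re-fetch children and resume after 'last'.
        seen = now;
        snapshot = new ArrayList<>(children);
        i = positionAfter(snapshot, last);
      }
    }
    return count;
  }

  /** Index of the first entry sorted after 'name' (list must be sorted). */
  static int positionAfter(List<String> sorted, String name) {
    int idx = Collections.binarySearch(sorted, name);
    return idx >= 0 ? idx + 1 : -(idx + 1);
  }

  public static void main(String[] args) {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    List<String> children = new ArrayList<>(List.of("a", "b", "c", "d", "e"));
    ComputationContext ctx = new ComputationContext(lock, 2, null);
    lock.readLock().lock();
    try {
      System.out.println("children=" + countChildren(children, ctx)
          + " yields=" + ctx.getYieldCount());
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

The point of the counter is that the iteration needs no separate stale flag to
reset: it only compares the value returned by yield with the one it remembered.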
You may even consider passing inodes to the computation context so you could
encapsulate the entire yield behavior.
> Make getContentSummary() less expensive
> ---------------------------------------
>
> Key: HDFS-4995
> URL: https://issues.apache.org/jira/browse/HDFS-4995
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 0.23.9, 2.3.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-4995.trunk.2.patch, HDFS-4995.trunk.patch,
> HDFS-4995.trunk1.patch
>
>
> When users call the du or count DFS command, the getContentSummary() method
> is called against the namenode. If the directory has many directories and
> files, it could hold the namesystem lock for a long time. We've seen it take
> over 20 seconds. The namenode should not allow regular users to cause
> extended locking.
--
This message was sent by Atlassian JIRA
(v6.1#6144)