[jira] [Commented] (HDFS-4995) Make getContentSummary() less expensive

Daryn Sharp (JIRA) Fri, 01 Nov 2013 15:14:14 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811708#comment-13811708
 ]


Daryn Sharp commented on HDFS-4995:
-----------------------------------

I'd prefer to see ContentSummaryComputationContext and FSComputationContext 
folded together to help keep this technique from proliferating.  While very 
attractive for content summary, this approach is really going to make my 
experimental fine-grained locking harder, so I'd like to see it confined to 
content summary.

Minor: {{INodeDirectory.computeContentSummary}} appears to be unnecessarily 
copying the last child's local name.  I believe the name is considered 
immutable.

I think the staleness handling could be improved and made simpler to 
understand by using an incrementing yield count instead of a stale boolean.  
I.e., calling yield returns the current yield count, which will be unchanged 
if no yield was necessary.  The directory iteration can then remember the 
yield count from when it fetched its children, and refresh its position if 
the current yield count is different.
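
A minimal sketch of the yield-count idea follows.  All names here 
(YieldingContext, yieldIfNeeded, YieldCountSketch, visit) are hypothetical and 
not taken from the patch; a TreeSet stands in for a directory's sorted 
children, and a Runnable simulates other threads changing the directory while 
the lock is released.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for the computation context; not the class in the patch. */
class YieldingContext {
    private final int limit;              // items processed between yields
    private final Runnable whileUnlocked; // simulates work done by others while unlocked
    private int sinceLastYield = 0;
    private int yieldCount = 0;

    YieldingContext(int limit, Runnable whileUnlocked) {
        this.limit = limit;
        this.whileUnlocked = whileUnlocked;
    }

    /** Returns the current yield count; it only changes if this call yielded. */
    int yieldIfNeeded() {
        if (++sinceLastYield < limit) {
            return yieldCount;            // no yield was necessary
        }
        sinceLastYield = 0;
        // Real code would release and reacquire the namesystem lock here.
        whileUnlocked.run();
        return ++yieldCount;
    }

    int getYieldCount() {
        return yieldCount;
    }
}

class YieldCountSketch {
    /** Walks the children in order, refreshing the position after any yield. */
    static List<String> visit(TreeSet<String> children, YieldingContext ctx) {
        List<String> visited = new ArrayList<>();
        int seenCount = ctx.getYieldCount(); // count when the children were fetched
        Iterator<String> it = children.iterator();
        while (it.hasNext()) {
            String last = it.next();
            visited.add(last);
            int current = ctx.yieldIfNeeded();
            if (current != seenCount) {
                // The lock was dropped, so the child list may have changed:
                // resume strictly after the last child processed.
                seenCount = current;
                it = children.tailSet(last, false).iterator();
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        TreeSet<String> children = new TreeSet<>(Arrays.asList("a", "b", "c", "d"));
        // While "unlocked", another thread deletes c and creates e.
        YieldingContext ctx = new YieldingContext(2, () -> {
            children.remove("c");
            children.add("e");
        });
        System.out.println(visit(children, ctx)); // prints [a, b, d, e]
    }
}
```

The caller compares two integers rather than resetting a flag, so nested 
directory iterations can each notice the same yield independently.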

You may even consider passing inodes to the computation context so you could 
encapsulate the entire yield behavior.


> Make getContentSummary() less expensive
> ---------------------------------------
>
>                 Key: HDFS-4995
>                 URL: https://issues.apache.org/jira/browse/HDFS-4995
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 0.23.9, 2.3.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-4995.trunk.2.patch, HDFS-4995.trunk.patch, 
> HDFS-4995.trunk1.patch
>
>
> When users call du or count DFS command, getContentSummary() method is called 
> against namenode. If the directory has many directories and files, it could 
> hold the namesystem lock for a long time. We've seen it taking over 20 
> seconds. Namenode should not allow regular users to cause extended locking.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
