[ 
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237909#comment-15237909
 ] 

Xiao Chen commented on HDFS-8986:
---------------------------------

Thanks Gautam for creating this, and everyone for the discussion.

I feel adding an option to exclude would be the better option to go for 
compatibility reasons. We should also update [the 
doc|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/FileSystemShell.html#du]
 to include this, and also document the current behavior of the going 1-level 
deep logic (code 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/FsUsage.java#L146]).

[~ggop], are you working on this? I can work on it or continue on what you 
already have, if you're busy.

> Add option to -du to calculate directory space usage excluding snapshots
> ------------------------------------------------------------------------
>
>                 Key: HDFS-8986
>                 URL: https://issues.apache.org/jira/browse/HDFS-8986
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: snapshots
>            Reporter: Gautam Gopalakrishnan
>            Assignee: Gautam Gopalakrishnan
>
> When running {{hadoop fs -du}} on a snapshotted directory (or one of its 
> children), the report includes space consumed by blocks that are only present 
> in the snapshots. This is confusing for end users.
> {noformat}
> $  hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -createSnapshot /tmp/parent snap1
> Created snapshot /tmp/parent/.snapshot/snap1
> $ hadoop fs -rm -skipTrash /tmp/parent/sub1/*
> ...
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M  2.3 G  /tmp/parent
> 799.7 M  2.3 G  /tmp/parent/sub1
> $ hdfs dfs -deleteSnapshot /tmp/parent snap1
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 0  0  /tmp/parent
> 0  0  /tmp/parent/sub1
> {noformat}
> It would be helpful if we had a flag, say -X, to exclude any snapshot related 
> disk usage in the output



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to