[
https://issues.apache.org/jira/browse/HDFS-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388857#comment-15388857
]
Wei-Chiu Chuang commented on HDFS-8986:
---------------------------------------
Hi [~xiaochen], thanks for working on the patch! I'd like to push this forward.
* In ContentSummary.java, the names of the setter methods for {{snapshotLength}},
{{snapshotFileCount}}, {{snapshotDirectoryCount}} and {{snapshotSpaceConsumed}}
should be prefixed with "set", e.g. {{setSnapshotLength}}.
* In {{ContentSummary#equals()}}, you could declare a {{ContentSummary}} variable
and cast the {{to}} object to it once, which avoids an explicit cast on
every method call. This is just personal taste, so it's not a big deal.
* Please update FileSystemShell.md to include the -x option in the usage of
du: {noformat}Usage: `hadoop fs -du [-s] [-h] URI [URI ...]`{noformat}
* I don't understand this code in INodeDirectory (quoted below), and I wonder
whether it has a bug. If I understand it correctly, the counts field and the
snapshotCounts field of the summary object will end up exactly the same.
Instead, I think you may need to declare another method, similar to
{{DirectoryWithSnapshotFeature.computeContentSummary4Snapshot}}, that
computes content for snapshottable subdirectories and files only.
{quote}
// if the getContentSummary call is against a non-snapshot path, the
// computation should include all the deleted files/directories
sf.computeContentSummary4Snapshot(summary.getBlockStoragePolicySuite(),
summary.getCounts());
// Also compute ContentSummary for snapshotCounts
sf.computeContentSummary4Snapshot(summary.getBlockStoragePolicySuite(),
summary.getSnapshotCounts());
{quote}
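To make the first two points concrete, here is a minimal sketch of what I have in mind. It uses a hypothetical, simplified class rather than the real {{ContentSummary}}: the class name, the builder shape and the choice of two fields are assumptions for illustration only, and the actual patch may be structured differently.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for ContentSummary (not the Hadoop class).
class SnapshotSummarySketch {
  private final long snapshotLength;
  private final long snapshotFileCount;

  private SnapshotSummarySketch(Builder b) {
    this.snapshotLength = b.snapshotLength;
    this.snapshotFileCount = b.snapshotFileCount;
  }

  long getSnapshotLength() { return snapshotLength; }
  long getSnapshotFileCount() { return snapshotFileCount; }

  // Assumed builder: each setter carries the "set" prefix.
  static class Builder {
    private long snapshotLength;
    private long snapshotFileCount;

    Builder setSnapshotLength(long value) {
      this.snapshotLength = value;
      return this;
    }

    Builder setSnapshotFileCount(long value) {
      this.snapshotFileCount = value;
      return this;
    }

    SnapshotSummarySketch build() {
      return new SnapshotSummarySketch(this);
    }
  }

  @Override
  public boolean equals(Object to) {
    if (this == to) {
      return true;
    }
    if (!(to instanceof SnapshotSummarySketch)) {
      return false;
    }
    // Cast once, then use the typed reference for every comparison.
    SnapshotSummarySketch other = (SnapshotSummarySketch) to;
    return getSnapshotLength() == other.getSnapshotLength()
        && getSnapshotFileCount() == other.getSnapshotFileCount();
  }

  @Override
  public int hashCode() {
    return Objects.hash(snapshotLength, snapshotFileCount);
  }
}
```

The single cast keeps {{equals()}} readable as more snapshot fields are added, since each new comparison only needs one more line against {{other}}.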
> Add option to -du to calculate directory space usage excluding snapshots
> ------------------------------------------------------------------------
>
> Key: HDFS-8986
> URL: https://issues.apache.org/jira/browse/HDFS-8986
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: snapshots
> Reporter: Gautam Gopalakrishnan
> Assignee: Xiao Chen
> Attachments: HDFS-8986.01.patch, HDFS-8986.02.patch
>
>
> When running {{hadoop fs -du}} on a snapshotted directory (or one of its
> children), the report includes space consumed by blocks that are only present
> in the snapshots. This is confusing for end users.
> {noformat}
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M 2.3 G /tmp/parent
> 799.7 M 2.3 G /tmp/parent/sub1
> $ hdfs dfs -createSnapshot /tmp/parent snap1
> Created snapshot /tmp/parent/.snapshot/snap1
> $ hadoop fs -rm -skipTrash /tmp/parent/sub1/*
> ...
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 799.7 M 2.3 G /tmp/parent
> 799.7 M 2.3 G /tmp/parent/sub1
> $ hdfs dfs -deleteSnapshot /tmp/parent snap1
> $ hadoop fs -du -h -s /tmp/parent /tmp/parent/*
> 0 0 /tmp/parent
> 0 0 /tmp/parent/sub1
> {noformat}
> It would be helpful if we had a flag, say -X, to exclude any snapshot-related
> disk usage in the output.