[ https://issues.apache.org/jira/browse/HADOOP-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172743#comment-14172743 ]
Byron Wong commented on HADOOP-6857:
------------------------------------

*Scenario 1*: "/test" is a snapshottable directory containing a file "a" with 41 bytes and replication factor 3. We run {{hadoop fs -du /test}}:
{code}
41 123 /test/a
{code}
which is consistent with what we get when we run {{hadoop fs -du -s /test}}:
{code}
41 123 /test
{code}
When we create a snapshot "ss1" and rerun the -du commands, we still get the same results as above. Now suppose we run {{hadoop fs -mv /test/a /test/b}}. When we run {{hadoop fs -du /test}}, we get:
{code}
41 123 /test/b
{code}
which is inconsistent with what we see when we run {{hadoop fs -du -s /test}}:
{code}
41 246 /test
{code}
If we repeat this process (i.e. create a snapshot, then rename /test/b back to /test/a), the two commands deviate further and further.

> FsShell should report raw disk usage including replication factor
> -----------------------------------------------------------------
>
>                 Key: HADOOP-6857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6857
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Alex Kozlov
>            Assignee: Byron Wong
>         Attachments: HADOOP-6857.patch, show-space-consumed.txt
>
> Currently FsShell reports HDFS usage with the "hadoop fs -dus <path>" command.
> Since the replication level is set per file, it would be nice to also report raw disk
> usage including the replication factor (maybe "hadoop fs -dus -raw <path>"?).
> This would allow assessing resource usage more accurately. -- Alex K

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
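The numbers in the scenario follow from du's two columns: the first is the file length, the second is the raw space consumed across all replicas (length × replication). A minimal sketch of that arithmetic, assuming (as the scenario suggests) that after the rename the snapshot retains its own reference to the old blocks, so the {{-du -s}} summary counts them a second time:

```python
REPLICATION = 3  # replication factor from the scenario

def disk_consumed(length, replication=REPLICATION):
    # du's second column: raw bytes consumed across all replicas
    return length * replication

# Before the snapshot: one 41-byte file in /test
print(disk_consumed(41))  # 123, matching "41 123 /test/a"

# After creating snapshot "ss1" and renaming /test/a to /test/b,
# the summary appears to count both the live file and the copy
# retained by the snapshot, producing the inconsistent "41 246 /test"
live = disk_consumed(41)
retained_by_snapshot = disk_consumed(41)
print(live + retained_by_snapshot)  # 246
```

This is only a model of the reported inconsistency, not of how HDFS actually accounts for snapshot blocks; the point is that each repeat of the snapshot-and-rename cycle adds another 123 to the summary while the per-child listing stays at 123.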