[ 
https://issues.apache.org/jira/browse/HDFS-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15631433#comment-15631433
 ] 

Andrew Wang commented on HDFS-10531:
------------------------------------

Thanks for working on this, Wei-Chiu. I have some code review comments, but 
would like to start by asking about the use case. We already have {{hadoop fs 
-count -q}}, which shows us usage by storage type. If we want to do this for EC 
too, I'd prefer we add functionality to the {{Count}} command.

If instead we want some way of printing out the EC policy set on paths, that 
seems like it belongs in {{ls}} or even {{hdfs erasurecode}}.

Since EC doesn't relate to quota, though, I think it will normally be admins 
who care about how much data is EC, and they will care at the cluster level. 
Cluster-wide stats are thus a better place for these numbers.

Code comments:

* In ContentSummaryComputationContext, let's try to follow the recommended 
modifier order: 
http://cr.openjdk.java.net/~alundblad/styleguide/index-v6.html#toc-modifiers
* Should we say "redundancy" rather than "replication" in the docs / help? e.g. 
{{disk_space_consumed_with_all_replicas}} and {{Display storage and replication 
policy}}.
* There is commented-out new code in INodeFile.
* I think it'd look better to put the new columns at the end. This is less 
likely to break parsing scripts too, since they normally parse left-to-right. 
Since this output is tab-delimited, we also don't need brackets.
* Could you add a test for "-s" with "-t" ?
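
To illustrate the point about column order, here is a minimal sketch, assuming a hypothetical downstream script that reads the size and path from the first two tab-separated fields. The sample lines and the extra column values are made up for illustration and are not the patch's actual output.

```python
def parse_du_line(line):
    """Parse one tab-delimited line left to right: size, path, then any extras."""
    fields = line.rstrip("\n").split("\t")
    size, path = fields[0], fields[1]
    return size, path, fields[2:]


# Hypothetical sample lines; the layouts are assumptions for illustration only.
old_line = "104857600\t/home/rgao/ds"
appended = "104857600\t/home/rgao/ds\tARCHIVE\tRS-DEFAULT-6-3-64k"
prefixed = "[ARCHIVE]\t[EC:RS-DEFAULT-6-3-64k]\t104857600\t/home/rgao/ds"

# Appending columns leaves the leading fields where the script expects them.
same_leading_fields = parse_du_line(old_line)[:2] == parse_du_line(appended)[:2]

# Prefixing columns shifts everything, so the "size" is no longer a number.
prefixed_size = parse_du_line(prefixed)[0]
```

With the new columns appended, a script like this keeps working and simply ignores the extra fields. With the columns prefixed, it reads a bracketed policy name where it expects the size.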

> Add EC policy and storage policy related usage summarization function to dfs 
> du command
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-10531
>                 URL: https://issues.apache.org/jira/browse/HDFS-10531
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Rui Gao
>            Assignee: Wei-Chiu Chuang
>         Attachments: HDFS-10531.001.patch
>
>
> Currently du command output:
> {code}
>         [ ~]$ hdfs dfs -du  -h /home/rgao/
>         0      /home/rgao/.Trash
>         0      /home/rgao/.staging
>         100 M  /home/rgao/ds
>         250 M  /home/rgao/ds-2
>         200 M  /home/rgao/noECBackup-ds
>         500 M  /home/rgao/noECBackup-ds-2
> {code}
> For HDFS users and administrators, a usage summary broken down by EC policy 
> and storage policy would be very helpful when managing cluster storage. A 
> mock-up of the du output could look like the following.
> {code}
>         [ ~]$ hdfs dfs -du  -h -t( total, parameter to be added) /home/rgao
>          
>         0      /home/rgao/.Trash
>         0      /home/rgao/.staging
>         [Archive] [EC:RS-DEFAULT-6-3-64k] 100 M  /home/rgao/ds
>         [DISK] [EC:RS-DEFAULT-6-3-64k]     250 M  /home/rgao/ds-2
>         [DISK] [Replica]     200 M  /home/rgao/noECBackup-ds
>         [DISK] [Replica]     500 M  /home/rgao/noECBackup-ds-2
>          
>         Total:
>          
>         [Archive] [EC:RS-DEFAULT-6-3-64k]  100 M
>         [Archive] [Replica]                  0 M
>         [DISK]    [EC:RS-DEFAULT-6-3-64k]  250 M
>         [DISK]    [Replica]                700 M
>
>         [Archive] [ALL]                    100 M
>         [DISK]    [ALL]                    950 M
>         [ALL]     [EC:RS-DEFAULT-6-3-64k]  350 M
>         [ALL]     [Replica]                700 M
> {code}     
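
The totals in the mock-up above amount to grouping each path's usage by storage policy and by redundancy type. A minimal sketch of that arithmetic follows, using the sizes from the mock-up; the `summarize` function and the row layout are hypothetical and not part of the patch.

```python
from collections import defaultdict

# (storage policy, redundancy, size in MB), copied from the mock-up listing.
rows = [
    ("Archive", "EC:RS-DEFAULT-6-3-64k", 100),  # /home/rgao/ds
    ("DISK", "EC:RS-DEFAULT-6-3-64k", 250),     # /home/rgao/ds-2
    ("DISK", "Replica", 200),                   # /home/rgao/noECBackup-ds
    ("DISK", "Replica", 500),                   # /home/rgao/noECBackup-ds-2
]


def summarize(rows):
    """Sum sizes per (storage, redundancy) pair, plus per-storage and per-redundancy totals."""
    totals = defaultdict(int)
    for storage, redundancy, size_mb in rows:
        totals[(storage, redundancy)] += size_mb
        totals[(storage, "ALL")] += size_mb
        totals[("ALL", redundancy)] += size_mb
    return dict(totals)


totals = summarize(rows)
```

Combinations with no data, such as Archive with Replica, do not appear in the result and can be reported as 0.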



