[ 
https://issues.apache.org/jira/browse/HDFS-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956441#comment-15956441
 ] 

SammiChen commented on HDFS-10531:
----------------------------------

bq. If instead we want some way of printing out the EC policy set on paths, 
that seems like it belongs in {{ls}} or even {{hdfs erasurecode}}.
Agreed. I will file a new JIRA to improve {{ls}} so that it shows the EC policy 
along with the replication factor. It will reuse the existing "replication 
factor" column to show the EC policy name.
{quote}
 hdfs dfs -ls /
Found 5 items
-rw-r--r--   3 root supergroup       1366 2017-03-15 16:51 /README.txt
drwxr-xr-x   - root supergroup          0 2017-03-16 15:54 /benchmarks
drwxr-xr-x   - root supergroup          0 2017-04-05 14:10 /home
drwxr-xr-x   - root supergroup          0 2017-03-16 16:16 /system
drwx------   - root supergroup          0 2017-03-07 14:08 /tmp
{quote}
From the user's point of view, putting this function in "ls" is better than 
putting it in the "ec" command, because "ls" already has a column that shows 
the file replication factor. EC is one kind of file replication scheme, so it 
is natural to show a file's EC policy there. However, it will make the 
"ec -getPolicy" sub-command somewhat redundant.

As for this JIRA, since an EC file is no different from a 3-way replicated file 
from a quota point of view, it is not clear what a user would gain from 
knowing how much quota is used by each type of EC policy. So I would not 
recommend adding "EC" information to the "hdfs dfs -count" command. 
Cluster-wide stats are helpful. And in a multi-tenant cluster environment, 
per-directory stats will also be helpful. So having an EC policy summary in 
the "du" command can help users.

[~andrew.wang], what do you think? 




> Add EC policy and storage policy related usage summarization function to dfs 
> du command
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-10531
>                 URL: https://issues.apache.org/jira/browse/HDFS-10531
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Rui Gao
>            Assignee: SammiChen
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-10531.001.patch
>
>
> Currently du command output:
> {code}
>         [ ~]$ hdfs dfs -du  -h /home/rgao/
>         0      /home/rgao/.Trash
>         0      /home/rgao/.staging
>         100 M  /home/rgao/ds
>         250 M  /home/rgao/ds-2
>         200 M  /home/rgao/noECBackup-ds
>         500 M  /home/rgao/noECBackup-ds-2
> {code}
> For HDFS users and administrators, a usage summary by EC policy and storage 
> policy would be very helpful when managing cluster storage. A mock-up of 
> the du output could look like the following.
> {code}
>         [ ~]$ hdfs dfs -du  -h -t( total, parameter to be added) /home/rgao
>          
>         0      /home/rgao/.Trash
>         0      /home/rgao/.staging
>         [Archive] [EC:RS-DEFAULT-6-3-64k]  100 M  /home/rgao/ds
>         [DISK]    [EC:RS-DEFAULT-6-3-64k]  250 M  /home/rgao/ds-2
>         [DISK]    [Replica]                200 M  /home/rgao/noECBackup-ds
>         [DISK]    [Replica]                500 M  /home/rgao/noECBackup-ds-2
>          
>         Total:
>          
>         [Archive] [EC:RS-DEFAULT-6-3-64k]  100 M
>         [Archive] [Replica]                  0 M
>         [DISK]    [EC:RS-DEFAULT-6-3-64k]  250 M
>         [DISK]    [Replica]                700 M
>      
>         [Archive] [ALL]                    100 M
>         [DISK]    [ALL]                    950 M
>         [ALL]     [EC:RS-DEFAULT-6-3-64k]  350 M
>         [ALL]     [Replica]                700 M
> {code}     
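
The "Total" section of the mock-up above is a group-by over (storage policy, EC/replica scheme), plus "ALL" roll-ups along each axis. The following is only a minimal Python sketch of that arithmetic, not the logic of the attached patch. The names ENTRIES and summarize are hypothetical, and the sizes are copied from the per-path lines of the mock-up.

```python
from collections import defaultdict

# (storage policy, EC/replica scheme, size in MB, path), copied from the
# per-path lines of the mock-up. Hypothetical input, not real command output.
ENTRIES = [
    ("Archive", "EC:RS-DEFAULT-6-3-64k", 100, "/home/rgao/ds"),
    ("DISK", "EC:RS-DEFAULT-6-3-64k", 250, "/home/rgao/ds-2"),
    ("DISK", "Replica", 200, "/home/rgao/noECBackup-ds"),
    ("DISK", "Replica", 500, "/home/rgao/noECBackup-ds-2"),
]


def summarize(entries):
    """Sum sizes per (storage, scheme) pair and per-axis "ALL" roll-ups."""
    totals = defaultdict(int)
    for storage, scheme, size_mb, _path in entries:
        totals[(storage, scheme)] += size_mb   # e.g. [DISK] [Replica]
        totals[(storage, "ALL")] += size_mb    # e.g. [DISK] [ALL]
        totals[("ALL", scheme)] += size_mb     # e.g. [ALL] [Replica]
    return dict(totals)


if __name__ == "__main__":
    for (storage, scheme), size_mb in sorted(summarize(ENTRIES).items()):
        print(f"[{storage}] [{scheme}] {size_mb} M")
```

Combinations with no data, such as [Archive] [Replica], simply do not appear in this sketch. A real implementation would need to decide whether to print them as 0, as the mock-up does.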



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
