[
https://issues.apache.org/jira/browse/HDFS-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15956441#comment-15956441
]
SammiChen commented on HDFS-10531:
----------------------------------
bq. If instead we want some way of printing out the EC policy set on paths,
that seems like it belongs in {{ls}} or even {{hdfs erasurecode}}.
Agree. I will file a new JIRA to improve {{ls}} so that it shows the EC policy
as well as the replication factor. It will reuse the existing "replication
factor" column to show the EC policy name.
{quote}
hdfs dfs -ls /
Found 5 items
-rw-r--r-- 3 root supergroup 1366 2017-03-15 16:51 /README.txt
drwxr-xr-x - root supergroup 0 2017-03-16 15:54 /benchmarks
drwxr-xr-x - root supergroup 0 2017-04-05 14:10 /home
drwxr-xr-x - root supergroup 0 2017-03-16 16:16 /system
drwx------ - root supergroup 0 2017-03-07 14:08 /tmp
{quote}
From the user's point of view, putting this function in "ls" is better than
putting it in the "ec" function, because "ls" already has a column that shows
the file replication factor. EC is one kind of file replication scheme, so it
is natural to show a file's EC policy there. However, it will make the
"ec -getPolicy" sub-function a little redundant.
As for this JIRA, since an EC file is no different from a 3-way replicated file
from the quota point of view, it's not clear what users would gain from knowing
how much quota is used by each type of EC policy. So I would not recommend
adding "EC" information to the "hdfs dfs -count" command.
Cluster-wide stats are helpful. And in a multi-tenant cluster environment,
per-directory stats will also be helpful. So having an EC policy summary in
the "du" command can help users.
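To illustrate the kind of per-directory summary meant here, the following is
only a rough Python sketch, not part of any patch. It sums the sizes from the
example in the issue description by storage policy and by EC policy. The
ENTRIES list and the summarize function are hypothetical names made up for
illustration, and sizes are treated as plain megabyte counts.

```python
from collections import defaultdict

# Hypothetical input: (storage policy, EC policy or "Replica", size in MB, path).
# The figures are copied from the example in the issue description.
ENTRIES = [
    ("Archive", "EC:RS-DEFAULT-6-3-64k", 100, "/home/rgao/ds"),
    ("DISK", "EC:RS-DEFAULT-6-3-64k", 250, "/home/rgao/ds-2"),
    ("DISK", "Replica", 200, "/home/rgao/noECBackup-ds"),
    ("DISK", "Replica", 500, "/home/rgao/noECBackup-ds-2"),
]


def summarize(entries):
    """Sum sizes per (storage, scheme) pair, plus per-storage and per-scheme totals."""
    totals = defaultdict(int)
    for storage, scheme, size_mb, _path in entries:
        totals[(storage, scheme)] += size_mb   # e.g. [DISK] [Replica]
        totals[(storage, "ALL")] += size_mb    # e.g. [DISK] [ALL]
        totals[("ALL", scheme)] += size_mb     # e.g. [ALL] [Replica]
    return dict(totals)


totals = summarize(ENTRIES)
for (storage, scheme), size_mb in sorted(totals.items()):
    print(f"[{storage}] [{scheme}] {size_mb} M")
```

Combinations with no data, such as [Archive] [Replica], simply do not appear
in this sketch, whereas the proposed output lists them as 0 M.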
[~andrew.wang], what do you think?
> Add EC policy and storage policy related usage summarization function to dfs
> du command
> ---------------------------------------------------------------------------------------
>
> Key: HDFS-10531
> URL: https://issues.apache.org/jira/browse/HDFS-10531
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: 3.0.0-alpha1
> Reporter: Rui Gao
> Assignee: SammiChen
> Labels: hdfs-ec-3.0-nice-to-have
> Attachments: HDFS-10531.001.patch
>
>
> Currently du command output:
> {code}
> [ ~]$ hdfs dfs -du -h /home/rgao/
> 0 /home/rgao/.Trash
> 0 /home/rgao/.staging
> 100 M /home/rgao/ds
> 250 M /home/rgao/ds-2
> 200 M /home/rgao/noECBackup-ds
> 500 M /home/rgao/noECBackup-ds-2
> {code}
> For hdfs users and administrators, EC policy and storage policy related usage
> summarization would be very helpful when managing cluster storage. A mock-up
> of the du output could look like the following.
> {code}
> [ ~]$ hdfs dfs -du -h -t( total, parameter to be added) /home/rgao
>
> 0 /home/rgao/.Trash
> 0 /home/rgao/.staging
> [Archive] [EC:RS-DEFAULT-6-3-64k] 100 M /home/rgao/ds
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M /home/rgao/ds-2
> [DISK] [Replica] 200 M /home/rgao/noECBackup-ds
> [DISK] [Replica] 500 M /home/rgao/noECBackup-ds-2
>
> Total:
>
> [Archive][EC:RS-DEFAULT-6-3-64k] 100 M
> [Archive][Replica] 0 M
> [DISK] [EC:RS-DEFAULT-6-3-64k] 250 M
> [DISK] [Replica] 700 M
>
> [Archive][ALL] 100M
> [DISK] [ALL] 950M
> [ALL] [EC:RS-DEFAULT-6-3-64k] 350M
> [ALL] [Replica] 700M
> {code}