[jira] [Commented] (HDFS-10999) Use more generic "low redundancy" blocks instead of "under replicated" blocks

Takanobu Asanuma (JIRA) Fri, 24 Feb 2017 06:44:32 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882803#comment-15882803
 ]


Takanobu Asanuma commented on HDFS-10999:
-----------------------------------------

Thanks for the good summary, [~manojg]! I agree with you for the most part. I 
want to share my thoughts.

1. +1 for not changing {{fsck}}.

----

2, 3. I think changing {{dfsadmin -report}} and {{NN-WebUI}} is almost the same 
work, because they refer to the same metrics of {{FSNamesystemMBean}}. So the 
key point is how to extend {{FSNamesystemMBean}}.

{quote}
– For backward compatibility reasons, let the current FSNameSystem#getStats() 
be as is, and will continue to return cumulative stats for all Block combined.
– Introduce FSNameSystem#getReplicatedBlockStats() and 
FSNameSystem#getECBlockStats() to capture Replicated and EC Blocks stats 
separately.
{quote}

I agree with that, and I think it fits my suggestion of adding two new mbeans, 
one for replicated blocks and one for EC block groups, to {{FSNamesystem}}.

*My proposal, based on yours*:
-- Since {{FSNamesystem#getStats}} refers to {{FSNamesystemMBean}}, leave both 
as they are. It would be good to use the new generic terms here.
-- Add new mbeans, {{ReplicatedBlockMBean}} and {{ECBlockGroupMBean}}, to 
{{FSNamesystem}}.
-- {{FSNamesystem#getReplicatedBlockStats}} refers to {{ReplicatedBlockMBean}}.
-- {{FSNamesystem#getECBlockGroupStats}} refers to {{ECBlockGroupMBean}}.
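
To make the shape of this proposal concrete, here is a rough sketch, not the actual patch. Only the two mbean names come from the proposal. The getter names, the {{NamesystemStub}} class (a stand-in for {{FSNamesystem}}) and the combined getters are assumptions for illustration, and JMX registration is left out.

```java
public class BlockStatsSketch {

    /** Proposed mbean for replicated blocks; getter names are hypothetical. */
    public interface ReplicatedBlockMBean {
        long getLowRedundancyBlocks();
        long getMissingBlocks();
    }

    /** Proposed mbean for EC block groups; getter names are hypothetical. */
    public interface ECBlockGroupMBean {
        long getLowRedundancyBlockGroups();
        long getMissingBlockGroups();
    }

    /** Hypothetical stand-in for FSNamesystem exposing both views. */
    public static class NamesystemStub
            implements ReplicatedBlockMBean, ECBlockGroupMBean {
        private final long lowRedundancyBlocks;
        private final long missingBlocks;
        private final long lowRedundancyBlockGroups;
        private final long missingBlockGroups;

        public NamesystemStub(long lowRedundancyBlocks, long missingBlocks,
                              long lowRedundancyBlockGroups,
                              long missingBlockGroups) {
            this.lowRedundancyBlocks = lowRedundancyBlocks;
            this.missingBlocks = missingBlocks;
            this.lowRedundancyBlockGroups = lowRedundancyBlockGroups;
            this.missingBlockGroups = missingBlockGroups;
        }

        @Override public long getLowRedundancyBlocks() { return lowRedundancyBlocks; }
        @Override public long getMissingBlocks() { return missingBlocks; }
        @Override public long getLowRedundancyBlockGroups() { return lowRedundancyBlockGroups; }
        @Override public long getMissingBlockGroups() { return missingBlockGroups; }

        /** Combined counters, kept for backward compatibility. */
        public long getTotalLowRedundancy() {
            return lowRedundancyBlocks + lowRedundancyBlockGroups;
        }

        public long getTotalMissing() {
            return missingBlocks + missingBlockGroups;
        }
    }

    public static void main(String[] args) {
        NamesystemStub ns = new NamesystemStub(3, 1, 4, 0);
        ReplicatedBlockMBean replicated = ns;  // replicated-only view
        ECBlockGroupMBean ec = ns;             // EC-only view
        System.out.println("replicated: " + replicated.getLowRedundancyBlocks());
        System.out.println("ec groups:  " + ec.getLowRedundancyBlockGroups());
        System.out.println("combined:   " + ns.getTotalLowRedundancy());
    }
}
```

The point of the split is that existing consumers of the combined numbers see no change, while new consumers can read the replicated and EC views separately.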

----

Let's be careful with terminology to avoid confusion. It would be better to 
follow the terms that fsck already uses.

|| replicated || erasure coded ||
| block(s) | block group(s) |
| replica(s) | internal block(s) |

So the {{dfsadmin -report}} output would look like this:
{noformat}
# hdfs dfsadmin -report
Configured Capacity: 1498775814144 (1.36 TB)
Present Capacity: 931852427264 (867.86 GB)
DFS Remaining: 931805765632 (867.81 GB)
DFS Used: 46661632 (44.50 MB)
DFS Used%: 0.01%
Replicated Blocks:
  Under replicated blocks: 0
  Blocks with corrupt replicas: 0
  Missing blocks: 0
  Missing blocks (with replication factor 1): 0
  Pending deletion blocks: 0
Erasure Coded Block Groups:
  Under ec block groups: 0
  EC block groups with corrupt internal blocks: 0
  Missing ec block groups: 0
  Pending deletion ec block groups: 0
{noformat}
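
The grouped layout above could be produced by a small helper along these lines. This is only a sketch of the formatting idea: {{ReportSectionSketch}} and {{formatSection}} are hypothetical names, not existing {{DFSAdmin}} code, and the counter values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReportSectionSketch {

    /**
     * Renders one section: a title line followed by indented
     * "label: value" lines, in the map's insertion order.
     */
    public static String formatSection(String title, Map<String, Long> counters) {
        StringBuilder sb = new StringBuilder(title).append(":\n");
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            sb.append("  ").append(e.getKey())
              .append(": ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the labels in the order they were added.
        Map<String, Long> replicated = new LinkedHashMap<>();
        replicated.put("Under replicated blocks", 0L);
        replicated.put("Blocks with corrupt replicas", 0L);
        replicated.put("Missing blocks", 0L);

        Map<String, Long> ec = new LinkedHashMap<>();
        ec.put("EC block groups with corrupt internal blocks", 0L);
        ec.put("Missing ec block groups", 0L);

        System.out.print(formatSection("Replicated Blocks", replicated));
        System.out.print(formatSection("Erasure Coded Block Groups", ec));
    }
}
```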

> Use more generic "low redundancy" blocks instead of "under replicated" blocks
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-10999
>                 URL: https://issues.apache.org/jira/browse/HDFS-10999
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have, supportability
>
> Per HDFS-9857, it seems in the Hadoop 3 world, people prefer the more generic 
> term "low redundancy" to the old-fashioned "under replicated". But this term 
> is still being used in messages in several places, such as web ui, dfsadmin 
> and fsck. We should probably change them to avoid confusion.
> File this jira to discuss it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
