[ 
https://issues.apache.org/jira/browse/HDFS-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881777#comment-15881777
 ] 

Manoj Govindassamy commented on HDFS-10999:
-------------------------------------------

Based on the discussion and consensus above, my understanding is that we want 
tools and the UI to report Replicated and EC blocks separately. 

1. The {{fsck}} command already reports Replicated blocks and EC blocks 
separately. I verified the output under the EC section and it looks good to me. 
I am not planning any further changes to {{fsck}} as part of this jira.
{noformat}
# hdfs fsck /
Connecting to namenode via http://127.0.0.1:50002/fsck?ugi=manoj&path=%2F
FSCK started by manoj (auth:SIMPLE) from /127.0.0.1 for path / at Thu Feb 23 15:21:06 PST 2017

Status: HEALTHY
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    5
 Total symlinks:                0

Replicated Blocks:
 Total size:    10240000 B
 Total files:   5
 Total blocks (validated):      5 (avg. block size 2048000 B)
 Minimally replicated blocks:   5 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)

Erasure Coded Block Groups:
 Total size:    10240000 B
 Total files:   5
 Total block groups (validated):        5 (avg. block group size 2048000 B)
 Minimally erasure-coded block groups:  5 (100.0 %)
 Over-erasure-coded block groups:       0 (0.0 %)
 Under-erasure-coded block groups:      0 (0.0 %)
 Unsatisfactory placement block groups: 0 (0.0 %)
 Default ecPolicy:              RS-DEFAULT-6-3-64k
 Average block group size:      3.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0 (0.0 %)
FSCK ended at Thu Feb 23 15:21:06 PST 2017 in 15 milliseconds

The filesystem under path '/' is HEALTHY
{noformat}


----


2. The {{dfsadmin -report}} command does not report EC blocks separately. Today, 
the report command gets its stats from {{FSNameSystem#getStats()}}, which 
returns combined stats for both Replicated and EC blocks. 
*  {noformat}
# hdfs dfsadmin -report
Configured Capacity: 1498775814144 (1.36 TB)
Present Capacity: 931852427264 (867.86 GB)
DFS Remaining: 931805765632 (867.81 GB)
DFS Used: 46661632 (44.50 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
{noformat}

*Proposal:* 
-- For backward compatibility, leave the current {{FSNameSystem#getStats()}} 
as is; it will continue to return cumulative stats for all blocks combined.
-- Introduce {{FSNameSystem#getReplicatedBlockStats()}} and 
{{FSNameSystem#getECBlockStats()}} to capture Replicated and EC Blocks stats 
separately.
-- In the report, {{Under replicated blocks}}, {{Blocks with corrupt replicas}}, 
and {{Missing blocks}} will show stats for Replicated blocks only (instead of 
the current cumulative numbers).
-- New fields such as {{Under erasure coded block groups}}, 
{{Corrupt erasure coded block groups}}, and 
{{Missing erasure coded block groups}} will be added to the report command, 
containing stats for Erasure Coded blocks only.
*  {noformat}
# hdfs dfsadmin -report
Configured Capacity: 1498775814144 (1.36 TB)
Present Capacity: 931852427264 (867.86 GB)
DFS Remaining: 931805765632 (867.81 GB)
DFS Used: 46661632 (44.50 MB)
DFS Used%: 0.01%
Replicated Blocks:
  Under replicated blocks: 0
  Blocks with corrupt replicas: 0
  Missing blocks: 0
  Missing blocks (with replication factor 1): 0
  Pending deletion blocks: 0
Erasure Coded Block Groups:
  Under erasure coded block groups: 0
  Erasure coded blocks with corrupt replicas: 0
  Missing erasure coded blocks: 0
  Pending deletion erasure coded blocks: 0
{noformat}
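
To make the API split in the proposal concrete, here is a rough, self-contained Java sketch. It is not the actual {{FSNameSystem}} code: the class {{BlockStatsSketch}}, the {{Stats}} holder and its field names are hypothetical and exist only for illustration. Only the three method names come from the proposal, and the report labels are taken from the proposal text and the sample output above. The point it shows is that {{getStats()}} keeps returning the cumulative numbers while the two new getters feed the separate report sections.

```java
public class BlockStatsSketch {

    /** Hypothetical holder for the counters of one block category. */
    public static final class Stats {
        public final long underRedundant;
        public final long corrupt;
        public final long missing;
        public final long pendingDeletion;

        public Stats(long underRedundant, long corrupt, long missing, long pendingDeletion) {
            this.underRedundant = underRedundant;
            this.corrupt = corrupt;
            this.missing = missing;
            this.pendingDeletion = pendingDeletion;
        }

        /** Field-wise sum, used to build the cumulative view. */
        public Stats plus(Stats other) {
            return new Stats(
                underRedundant + other.underRedundant,
                corrupt + other.corrupt,
                missing + other.missing,
                pendingDeletion + other.pendingDeletion);
        }
    }

    private final Stats replicated;
    private final Stats erasureCoded;

    public BlockStatsSketch(Stats replicated, Stats erasureCoded) {
        this.replicated = replicated;
        this.erasureCoded = erasureCoded;
    }

    /** New in the proposal: Replicated blocks only. */
    public Stats getReplicatedBlockStats() {
        return replicated;
    }

    /** New in the proposal: Erasure Coded block groups only. */
    public Stats getECBlockStats() {
        return erasureCoded;
    }

    /** Existing contract kept as is: cumulative stats for all blocks. */
    public Stats getStats() {
        return replicated.plus(erasureCoded);
    }

    /** Renders the block section of the report with separate groups. */
    public String report() {
        Stats r = getReplicatedBlockStats();
        Stats e = getECBlockStats();
        return "Replicated Blocks:\n"
            + "  Under replicated blocks: " + r.underRedundant + "\n"
            + "  Blocks with corrupt replicas: " + r.corrupt + "\n"
            + "  Missing blocks: " + r.missing + "\n"
            + "  Pending deletion blocks: " + r.pendingDeletion + "\n"
            + "Erasure Coded Block Groups:\n"
            + "  Under erasure coded block groups: " + e.underRedundant + "\n"
            + "  Corrupt erasure coded block groups: " + e.corrupt + "\n"
            + "  Missing erasure coded block groups: " + e.missing + "\n"
            + "  Pending deletion erasure coded blocks: " + e.pendingDeletion + "\n";
    }

    public static void main(String[] args) {
        // Made-up counters, purely to exercise the sketch.
        BlockStatsSketch fs = new BlockStatsSketch(
            new Stats(2, 1, 0, 4), new Stats(1, 0, 0, 0));
        System.out.print(fs.report());
        System.out.println("Cumulative under-redundant: " + fs.getStats().underRedundant);
    }
}
```

Existing callers of {{getStats()}} would see no change in this sketch, which is the backward compatibility argument made above; only the report rendering switches to the two new getters.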


----

3. For the WebUI, {{FSNameSystemMBean}} needs to be extended in order to report 
Erasure Coded block details.

-- Currently the following are reported under the Summary section of the 
NameNode UI, but they include both Replicated and EC stats: {noformat}
Number of Under-Replicated Blocks       
Number of Blocks Pending Deletion       
{noformat}
-- [~lewuathe] has already 
[proposed|https://issues.apache.org/jira/secure/attachment/12852567/Screen%20Shot%202017-02-14%20at%2022.43.57.png]
 a patch for adding the total EC blocks and their size under HDFS-8196. 

*Proposal:* 
-- Display the Replicated and EC block stats separately in the Summary section 
of the NameNode UI, with no cumulative stats. {noformat}
Number of Under-Replicated Blocks 
Number of Blocks Pending Deletion 
Number of Under-Erasure-Coded Block Groups
Number of Erasure Coded Blocks Pending Deletion 
{noformat}
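
As a rough illustration of what extending the MBean could look like, the sketch below exposes the four Summary values through a plain JMX MXBean and reads one of them back from the platform MBean server. Everything in it is an assumption made for illustration: the interface {{BlockSummaryMXBean}}, its getter names, the {{BlockSummary}} class and the {{sketch:type=BlockSummary}} object name are hypothetical and are not the existing NameNode bean or its attribute names.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SummaryMXBeanSketch {

    /** Hypothetical bean interface: one attribute per Summary row. */
    public interface BlockSummaryMXBean {
        long getUnderReplicatedBlocks();
        long getPendingDeletionBlocks();
        long getUnderErasureCodedBlockGroups();
        long getPendingDeletionErasureCodedBlocks();
    }

    /** Immutable implementation backed by fixed counters. */
    public static final class BlockSummary implements BlockSummaryMXBean {
        private final long underReplicated;
        private final long pendingDeletion;
        private final long underErasureCoded;
        private final long pendingDeletionErasureCoded;

        public BlockSummary(long underReplicated, long pendingDeletion,
                            long underErasureCoded, long pendingDeletionErasureCoded) {
            this.underReplicated = underReplicated;
            this.pendingDeletion = pendingDeletion;
            this.underErasureCoded = underErasureCoded;
            this.pendingDeletionErasureCoded = pendingDeletionErasureCoded;
        }

        @Override public long getUnderReplicatedBlocks() { return underReplicated; }
        @Override public long getPendingDeletionBlocks() { return pendingDeletion; }
        @Override public long getUnderErasureCodedBlockGroups() { return underErasureCoded; }
        @Override public long getPendingDeletionErasureCodedBlocks() {
            return pendingDeletionErasureCoded;
        }
    }

    /** Registers the bean, reads one attribute over JMX, then unregisters it. */
    public static long readViaJmx(BlockSummaryMXBean bean, String attribute) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("sketch:type=BlockSummary"); // made-up name
        server.registerMBean(bean, name);
        try {
            return (Long) server.getAttribute(name, attribute);
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up counters, purely to exercise the sketch.
        BlockSummary summary = new BlockSummary(2, 4, 1, 0);
        System.out.println("Number of Under-Replicated Blocks: "
            + readViaJmx(summary, "UnderReplicatedBlocks"));
        System.out.println("Number of Under-Erasure-Coded Block Groups: "
            + readViaJmx(summary, "UnderErasureCodedBlockGroups"));
    }
}
```

A UI page reading such a bean would get the Replicated and EC numbers as separate attributes and would not have to derive them from a combined counter.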
 

[~andrew.wang], [~aw], [~tasanuma0829], [~jojochuang], [~yuanbo], can you 
please share your thoughts on the above proposals? Thanks.


> Use more generic "low redundancy" blocks instead of "under replicated" blocks
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-10999
>                 URL: https://issues.apache.org/jira/browse/HDFS-10999
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Manoj Govindassamy
>              Labels: hdfs-ec-3.0-nice-to-have, supportability
>
> Per HDFS-9857, it seems in the Hadoop 3 world, people prefer the more generic 
> term "low redundancy" to the old-fashioned "under replicated". But this term 
> is still being used in messages in several places, such as web ui, dfsadmin 
> and fsck. We should probably change them to avoid confusion.
> File this jira to discuss it.



