[ 
https://issues.apache.org/jira/browse/HDFS-10999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576844#comment-15576844
 ] 

Allen Wittenauer commented on HDFS-10999:
-----------------------------------------

bq. We tried to draw an equivalence between the durability of EC and replicated 
files by looking at the # of failures to data loss. This way we have a way of 
prioritizing both types of recovery work on the NN (see the LowRedundancyBlocks 
class, nee UnderReplicatedBlocks).

Hmm. That's great for the NN, but it leaves me as an admin in the dark. 

A: "So we had some issues on HDFS."

M: "What's the damage?"

A: "We are missing x blocks."

M: "How long for recovery?"

A: "No idea.  The NN doesn't tell me whether the lost blocks are EC blocks or 
regular blocks, and one is faster to recover than the other."

bq.  In my experience, the "# under replicated blocks" is used as a quick check 
of cluster health.

It's used for that, but I've also used it during system recovery and migrations 
as a measure of how many more DNs I need to bring up so that I have more 
sources for block replication. This number represents something that I as an 
admin have some semblance of control over:  I could always manually copy blocks 
from one node to another to speed things up.

Under EC, I don't know of anything manual I can do if it is missing chunks of blocks.



> Use more generic "low redundancy" blocks instead of "under replicated" blocks
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-10999
>                 URL: https://issues.apache.org/jira/browse/HDFS-10999
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Yuanbo Liu
>              Labels: supportability
>
> Per HDFS-9857, it seems in the Hadoop 3 world, people prefer the more generic 
> term "low redundancy" to the old-fashioned "under replicated". But this term 
> is still being used in messages in several places, such as web ui, dfsadmin 
> and fsck. We should probably change them to avoid confusion.
> File this jira to discuss it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

