[ 
https://issues.apache.org/jira/browse/HDFS-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159765#comment-15159765
 ] 

Zhe Zhang commented on HDFS-7955:
---------------------------------

bq. Yeah, will start with BlockManager entities, UnderReplicatedBlocks => 
LowRedundancyBlocks, PendingReplicationBlocks => PendingReconstructionBlocks, 
neededReplications => neededReconstruction, excessReplicateMap => 
extraRedundancyMap
Thanks [~rakeshr], LGTM

bq. I hope you are suggesting to deprecate dfs#getUnderReplicatedBlocksCount() 
and add new API dfs#getLowRedundancyBlockCount(). Again we need to consider 
FSNamesystemMBean metrics UnderReplicatedBlocks, PendingReplicationBlocks, 
scheduledReplicationBlocksCount etc. Also, requires changes in DFSAdmin -report 
basic filesystem information and fsck#ReplicationResult commands output.
The {{getUnderReplicatedBlocksCount}} API is used quite heavily by upper-layer 
applications, so deprecating it looks risky. I think we can leave this discussion 
open for a longer window to solicit feedback.
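
To make the compatibility concern concrete, here is a minimal sketch of one option: the new name carries the implementation and the old name stays as a thin alias, so existing callers keep working whether or not the old method is ever deprecated. {{BlockStatsSketch}} and {{setLowRedundancyBlocks}} are hypothetical names used only for illustration; they are not HDFS classes or methods, and this is not the patch under review.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a stats holder; NOT an actual HDFS class. */
final class BlockStatsSketch {
    // Number of blocks whose redundancy (replicas or erasure-coded
    // internal blocks) is currently below the expected level.
    private final AtomicLong lowRedundancyBlocks = new AtomicLong();

    /** Illustration-only hook to set the counter. */
    public void setLowRedundancyBlocks(long count) {
        lowRedundancyBlocks.set(count);
    }

    /** Proposed new name, generic over replication and erasure coding. */
    public long getLowRedundancyBlockCount() {
        return lowRedundancyBlocks.get();
    }

    /**
     * Existing name, kept as an alias that delegates to the new method.
     * Whether to mark it deprecated is the open question; it is
     * intentionally left unannotated in this sketch.
     */
    public long getUnderReplicatedBlocksCount() {
        return getLowRedundancyBlockCount();
    }
}
```

With this shape, both names always report the same value, and deciding later to deprecate the old one becomes an annotation change rather than a behavior change.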

> Improve naming of classes, methods, and variables related to block 
> replication and recovery
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7955
>                 URL: https://issues.apache.org/jira/browse/HDFS-7955
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Zhe Zhang
>            Assignee: Rakesh R
>         Attachments: HDFS-7955-001.patch, HDFS-7955-002.patch, 
> HDFS-7955-003.patch, HDFS-7955-004.patch, HDFS-7955-5.patch
>
>
> Many existing names should be revised to avoid confusion when blocks can be 
> both replicated and erasure coded. This JIRA aims to solicit opinions on 
> making those names more consistent and intuitive.
> # In current HDFS, _block recovery_ refers to the process of finalizing the 
> last block of a file, triggered by _lease recovery_. It is different from the 
> intuitive meaning of _recovering a lost block_. To avoid confusion, I can 
> think of 2 options:
> #* Rename this process as _block finalization_ or _block completion_. I 
> prefer this option because this is literally not a recovery.
> #* If we want to keep existing terms unchanged, we can name all EC recovery 
> and re-replication logic as _reconstruction_.
> # As Kai [suggested | 
> https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131]
>  under HDFS-7369, several replication-based names should be made more generic:
> #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use 
> {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and 
> {{neededRecovery}}/{{neededReconstruction}}.
> #* {{PendingReplicationBlocks}}
> #* {{ReplicationMonitor}}
> I'm sure the above list is incomplete; discussions and comments are very 
> welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
