[ 
https://issues.apache.org/jira/browse/HDFS-7955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804298#comment-14804298
 ] 

Andrew Wang commented on HDFS-7955:
-----------------------------------

I noticed with HDFS-8899 the datanode config keys need some renaming:

{noformat}
  public static final String  DFS_DATANODE_STRIPED_READ_THREADS_KEY = "dfs.datanode.stripedread.threads";
  public static final int     DFS_DATANODE_STRIPED_READ_THREADS_DEFAULT = 20;
  public static final String  DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_KEY = "dfs.datanode.stripedread.buffer.size";
  public static final int     DFS_DATANODE_STRIPED_READ_BUFFER_SIZE_DEFAULT = 64 * 1024;
  public static final String  DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_KEY = "dfs.datanode.stripedread.timeout.millis";
  public static final int     DFS_DATANODE_STRIPED_READ_TIMEOUT_MILLIS_DEFAULT = 5000; // 5s
  public static final String  DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_KEY = "dfs.datanode.striped.blockrecovery.threads.size";
  public static final int     DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_DEFAULT = 8;
{noformat}

The term "block recovery" is overloaded here; I'd recommend "reconstruction" 
instead. All of these config keys are for ECWorker and related classes, so 
they should also share a common prefix, e.g. "dfs.datanode.ec.reconstruction" 
or something similar. IIUC there's a "read" thread pool and a "compute" thread 
pool; ideally the key names and descriptions would make that distinction 
apparent too.
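
To make the suggestion concrete, here is a rough sketch of what the renamed keys might look like. Everything in it is an assumption for illustration: the class name {{EcReconstructionKeys}}, the constant names, and the "read" / "compute" sub-namespaces are hypothetical and not taken from any patch. It also assumes the existing "blockrecovery" thread pool is the compute pool, which may not be right. Only the default values are copied from the current keys.

```java
/**
 * Hypothetical sketch only: the class, constant names and key strings below
 * are illustrative, not the names actually used in HDFS.
 */
final class EcReconstructionKeys {
  private EcReconstructionKeys() {}

  // Assumed shared prefix for all ECWorker-related datanode keys.
  static final String PREFIX = "dfs.datanode.ec.reconstruction.";

  // "Read" side: pool size, buffer size and timeout for striped reads.
  static final String READ_THREADS_KEY = PREFIX + "read.threads";
  static final int READ_THREADS_DEFAULT = 20;
  static final String READ_BUFFER_SIZE_KEY = PREFIX + "read.buffer.size";
  static final int READ_BUFFER_SIZE_DEFAULT = 64 * 1024;
  static final String READ_TIMEOUT_MILLIS_KEY = PREFIX + "read.timeout.millis";
  static final int READ_TIMEOUT_MILLIS_DEFAULT = 5000; // 5s

  // "Compute" side: assumed to correspond to the current
  // DFS_DATANODE_STRIPED_BLK_RECOVERY_THREADS_* pair.
  static final String COMPUTE_THREADS_KEY = PREFIX + "compute.threads";
  static final int COMPUTE_THREADS_DEFAULT = 8;
}
```

With a layout like this, the read vs. compute split is visible in the key itself, and "recovery" no longer appears in any of the names.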

> Improve naming of classes, methods, and variables related to block 
> replication and recovery
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7955
>                 URL: https://issues.apache.org/jira/browse/HDFS-7955
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Rakesh R
>         Attachments: HDFS-7955-001.patch
>
>
> Many existing names should be revised to avoid confusion when blocks can be 
> both replicated and erasure coded. This JIRA aims to solicit opinions on 
> making those names more consistent and intuitive.
> # In current HDFS _block recovery_ refers to the process of finalizing the 
> last block of a file, triggered by _lease recovery_. It is different from the 
> intuitive meaning of _recovering a lost block_. To avoid confusion, I can 
> think of 2 options:
> #* Rename this process as _block finalization_ or _block completion_. I 
> prefer this option because this is literally not a recovery.
> #* If we want to keep existing terms unchanged, we can name all EC recovery 
> and re-replication logic _reconstruction_.
> # As Kai [suggested | 
> https://issues.apache.org/jira/browse/HDFS-7369?focusedCommentId=14361131&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14361131]
>  under HDFS-7369, several replication-based names should be made more generic:
> #* {{UnderReplicatedBlocks}} and {{neededReplications}}. E.g. we can use 
> {{LowRedundancyBlocks}}/{{AtRiskBlocks}}, and 
> {{neededRecovery}}/{{neededReconstruction}}.
> #* {{PendingReplicationBlocks}}
> #* {{ReplicationMonitor}}
> I'm sure the above list is incomplete; discussions and comments are very 
> welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
