[jira] [Commented] (HDFS-16735) Reduce the number of HeartbeatManager loops

ASF GitHub Bot (Jira) Tue, 23 Aug 2022 11:46:06 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-16735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17583803#comment-17583803
 ]


ASF GitHub Bot commented on HDFS-16735:
---------------------------------------

goiri commented on code in PR #4780:
URL: https://github.com/apache/hadoop/pull/4780#discussion_r952993440


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java:
##########
@@ -492,12 +498,12 @@ void heartbeatCheck() {
       // log nodes detected as stale since last heartBeat
       dumpStaleNodes(staleNodes);
 
-      allAlive = dead == null && failedStorage == null;
+      allAlive = deadDatanodes.size() == 0 && failedStorages.size() == 0;

Review Comment:
   Use `isEmpty()` instead of `size() == 0`.



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java:
##########
@@ -96,6 +97,9 @@ class HeartbeatManager implements DatanodeStatistics {
     enableLogStaleNodes = conf.getBoolean(
         DFSConfigKeys.DFS_NAMENODE_ENABLE_LOG_STALE_DATANODE_KEY,
         DFSConfigKeys.DFS_NAMENODE_ENABLE_LOG_STALE_DATANODE_DEFAULT);
+    this.removeBatchNum =

Review Comment:
   ```
   this.removeBatchNum = conf.getInt(
       DFSConfigKeys.DFS_NAMENODE_REMOVE_BAD_BATCH_NUM,
       DFSConfigKeys.DFS_NAMENODE_REMOVE_BAD_BATCH_NUM_DEFAULT);
   ```



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java:
##########
@@ -436,12 +440,14 @@ void heartbeatCheck() {
       return;
     }
     boolean allAlive = false;
+    // Locate limited dead nodes.
+    List<DatanodeDescriptor> deadDatanodes = new ArrayList<>(removeBatchNum);
+    // Locate limited failed storages that isn't on a dead node.
+    List<DatanodeStorageInfo> failedStorages = new ArrayList<>(removeBatchNum);
     while (!allAlive) {

Review Comment:
   break line





> Reduce the number of HeartbeatManager loops
> -------------------------------------------
>
>                 Key: HDFS-16735
>                 URL: https://issues.apache.org/jira/browse/HDFS-16735
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> HeartbeatManager processes only one dead datanode (and one failed storage)
> per round in heartbeatCheck(). If there are ten failed storages, all
> datanode states must be scanned ten times, which is an unnecessary waste of
> resources. This patch makes the number of bad storages processed per scan
> configurable.
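
To illustrate the cost described above, here is a minimal, self-contained sketch that counts how many full passes over the node list are needed for a given batch size. It is not the HeartbeatManager code: the class name `BatchedScanSketch`, the method `countScans`, and the use of plain strings for datanodes are all hypothetical, and the loop only assumes the `while (!allAlive)` shape visible in the diff hunks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not the HeartbeatManager implementation. */
public class BatchedScanSketch {

  /**
   * Counts the full passes over the node list needed to remove every dead
   * node when a single pass may collect at most batchSize dead nodes.
   * The last pass is the one that finds nothing and ends the loop.
   */
  public static int countScans(List<String> nodes, Set<String> dead,
      int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1");
    }
    List<String> remaining = new ArrayList<>(nodes);
    int scans = 0;
    boolean allAlive = false;
    while (!allAlive) {
      scans++;
      // Assumption: each pass still visits every node, but stops
      // collecting dead nodes once the batch is full.
      List<String> found = new ArrayList<>();
      for (String node : remaining) {
        if (found.size() < batchSize && dead.contains(node)) {
          found.add(node);
        }
      }
      remaining.removeAll(found);
      allAlive = found.isEmpty();
    }
    return scans;
  }

  public static void main(String[] args) {
    List<String> nodes = new ArrayList<>();
    Set<String> dead = new HashSet<>();
    for (int i = 0; i < 100; i++) {
      nodes.add("dn" + i);
      if (i % 10 == 0) {
        dead.add("dn" + i); // 10 of the 100 nodes are dead
      }
    }
    System.out.println(countScans(nodes, dead, 1));  // prints 11
    System.out.println(countScans(nodes, dead, 10)); // prints 2
  }
}
```

With a batch size of 1, ten dead nodes cost ten removal passes plus one final pass that confirms everything is alive. With a batch size of 10, the same work takes two passes in this sketch.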



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

