[ 
https://issues.apache.org/jira/browse/HDFS-16348?focusedWorklogId=699169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-699169
 ]

ASF GitHub Bot logged work on HDFS-16348:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Dec/21 04:09
            Start Date: 21/Dec/21 04:09
    Worklog Time Spent: 10m 
      Work Description: tasanuma commented on a change in pull request #3704:
URL: https://github.com/apache/hadoop/pull/3704#discussion_r772796622



##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
##########
@@ -1254,10 +1273,52 @@ public void run() {
       }
     }
 
+    void markSlowNode(List<DatanodeInfo> slownodesFromAck) throws IOException {
+      Set<DatanodeInfo> discontinuousNodes = new HashSet<>(slowNodeMap.keySet());
+      for (DatanodeInfo slowNode : slownodesFromAck) {
+        if (!slowNodeMap.containsKey(slowNode)) {
+          slowNodeMap.put(slowNode, 1);
+        } else {
+          int oldCount = slowNodeMap.get(slowNode);
+          slowNodeMap.put(slowNode, ++oldCount);
+        }
+        discontinuousNodes.remove(slowNode);
+      }
+      for (DatanodeInfo discontinuousNode : discontinuousNodes) {
+        slowNodeMap.remove(discontinuousNode);
+      }
+
+      if (!slowNodeMap.isEmpty()) {
+        for (Map.Entry<DatanodeInfo, Integer> entry : slowNodeMap.entrySet()) {
+          if (entry.getValue() >= markSlowNodeAsBadNodeThreshold) {
+            DatanodeInfo slowNode = entry.getKey();
+            int index = getDatanodeIndex(slowNode);
+            if (index >= 0) {
+              errorState.setBadNodeIndex(
+                  getDatanodeIndex(entry.getKey()));

Review comment:
       We can reuse the `index` variable.
   ```suggestion
                 errorState.setBadNodeIndex(index);
   ```
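
With the suggestion applied, the surrounding block would read roughly as follows (a sketch; the statements after `setBadNodeIndex` are not visible in the quoted diff):

```java
// Sketch only: the block with the suggested change applied.
DatanodeInfo slowNode = entry.getKey();
int index = getDatanodeIndex(slowNode);
if (index >= 0) {
  errorState.setBadNodeIndex(index);
  // ... remaining handling not shown in the quoted diff ...
}
```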

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
##########
@@ -230,14 +260,27 @@ public static ECN getECNFromHeader(int header) {
     return StatusFormat.getECN(header);
   }
 
+  public static SLOW getSLOWFromHeader(int header) {
+    return StatusFormat.getSLOW(header);
+  }
+
   public static int setStatusForHeader(int old, Status status) {
     return StatusFormat.setStatus(old, status);
   }
 
+  public static int setSLOWForHeader(int old, SLOW slow) {

Review comment:
       Only the unit test uses this method. Would you please add `@VisibleForTesting`?
   ```suggestion
     @VisibleForTesting
     public static int setSLOWForHeader(int old, SLOW slow) {
   ```
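
For context, a round-trip check of the new accessors could look roughly like the sketch below. The class and test names are hypothetical; it assumes JUnit and that `SLOW`, like `ECN`, is a nested enum of `PipelineAck`, which is suggested but not spelled out in the quoted diff.

```java
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck;
import org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.ECN;
import org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.SLOW;
import org.apache.hadoop.hdfs.protocol.proto.DataTransferProtos.Status;
import org.junit.Test;

// Hypothetical test sketch exercising the SLOW header accessors from this PR.
public class TestPipelineAckSlowHeader {
  @Test
  public void testSlowHeaderRoundTrip() {
    // Build a header carrying the SLOW flag, then read it back.
    int header = PipelineAck.combineHeader(ECN.DISABLED, Status.SUCCESS, SLOW.DISABLED);
    header = PipelineAck.setSLOWForHeader(header, SLOW.DISABLED);
    assertEquals(SLOW.DISABLED, PipelineAck.getSLOWFromHeader(header));
  }
}
```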

##########
File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
##########
@@ -230,14 +260,27 @@ public static ECN getECNFromHeader(int header) {
     return StatusFormat.getECN(header);
   }
 
+  public static SLOW getSLOWFromHeader(int header) {
+    return StatusFormat.getSLOW(header);
+  }
+
   public static int setStatusForHeader(int old, Status status) {
     return StatusFormat.setStatus(old, status);
   }
 
+  public static int setSLOWForHeader(int old, SLOW slow) {
+    return StatusFormat.setSLOW(old, slow);
+  }
+
   public static int combineHeader(ECN ecn, Status status) {
+    return combineHeader(ecn, status, SLOW.DISABLED);
+  }
+
+  public static int combineHeader(ECN ecn, Status status, SLOW slow) {

Review comment:
       I would like `PipelineAck#getHeaderFlag()` to use this method as well:
   ```java
     public int getHeaderFlag(int i) {
       if (proto.getFlagCount() > 0) {
         return proto.getFlag(i);
       } else {
         return combineHeader(ECN.DISABLED, proto.getReply(i), SLOW.DISABLED);
       }
     }
   ```
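
If it helps to see why this is safe: per the delegation added above, the two-argument `combineHeader` fills in `SLOW.DISABLED`, so the suggested `getHeaderFlag()` produces the same value as before for replies without a SLOW flag. A small sketch (assuming `ECN`, `SLOW`, and `Status` are in scope as in the quoted code):

```java
// Sketch: the two-argument overload delegates with SLOW.DISABLED,
// so both calls below yield the same header value.
int oldStyle = PipelineAck.combineHeader(ECN.DISABLED, Status.SUCCESS);
int newStyle = PipelineAck.combineHeader(ECN.DISABLED, Status.SUCCESS, SLOW.DISABLED);
assert oldStyle == newStyle;  // holds by construction of the overloads
```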

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
##########
@@ -1620,8 +1623,10 @@ private void sendAckUpstreamUnprotected(PipelineAck ack, long seqno,
         // downstream nodes, reply should contain one reply.
         replies = new int[] { myHeader };
       } else if (mirrorError) { // ack read error
-        int h = PipelineAck.combineHeader(datanode.getECN(), Status.SUCCESS);
-        int h1 = PipelineAck.combineHeader(datanode.getECN(), Status.ERROR);
+        int h = PipelineAck.combineHeader(datanode.getECN(), Status.SUCCESS,
+            datanode.getSLOW());
+        int h1 = PipelineAck.combineHeader(datanode.getECN(), Status.ERROR,
+            datanode.getSLOW());

Review comment:
       Why doesn't it use `datanode.getSLOWByBlockPoolId(block.getBlockPoolId())`?
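
For clarity, the alternative being asked about would presumably look like the sketch below; `getSLOWByBlockPoolId` is known here only from the comment above, so its exact signature is an assumption.

```java
// Sketch of the alternative raised in the review: use the per-block-pool SLOW
// status instead of the DataNode-wide one when building the ack headers.
int h = PipelineAck.combineHeader(datanode.getECN(), Status.SUCCESS,
    datanode.getSLOWByBlockPoolId(block.getBlockPoolId()));
int h1 = PipelineAck.combineHeader(datanode.getECN(), Status.ERROR,
    datanode.getSLOWByBlockPoolId(block.getBlockPoolId()));
```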






Issue Time Tracking
-------------------

    Worklog Id:     (was: 699169)
    Time Spent: 3h 10m  (was: 3h)

> Mark slownode as badnode to recover pipeline
> --------------------------------------------
>
>                 Key: HDFS-16348
>                 URL: https://issues.apache.org/jira/browse/HDFS-16348
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Assignee: Janus Chow
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In HDFS-16320, the DataNode can retrieve the SLOW status from each NameNode.
> This ticket is to send this information back to clients that are writing
> blocks. If a client notices that its pipeline is built on a slownode, it can
> choose to mark the slownode as a badnode to exclude the node or to rebuild
> the pipeline.
> To avoid false positives, we added a "threshold" config: only when a client
> continuously receives slownode replies from the same node will that node be
> marked as SLOW.
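
A minimal sketch of that consecutive-count idea, with illustrative names rather than the actual `DataStreamer` fields:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the "threshold" idea: a node is only treated as slow
// after it appears in the slownode list of N consecutive acks; a node missing
// from an ack has its count reset.
class SlowNodeCounter<T> {
  private final int threshold;
  private final Map<T, Integer> counts = new HashMap<>();

  SlowNodeCounter(int threshold) {
    this.threshold = threshold;
  }

  /** Update counts with the slow nodes reported by one ack. */
  void update(List<T> slowNodesFromAck) {
    counts.keySet().retainAll(slowNodesFromAck);  // drop discontinuous nodes
    for (T node : slowNodesFromAck) {
      counts.merge(node, 1, Integer::sum);        // count consecutive reports
    }
  }

  /** True if the node was reported slow in at least `threshold` consecutive acks. */
  boolean shouldMarkBad(T node) {
    return counts.getOrDefault(node, 0) >= threshold;
  }
}
```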


