[ https://issues.apache.org/jira/browse/HDFS-15879?focusedWorklogId=564282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-564282 ]

ASF GitHub Bot logged work on HDFS-15879:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Mar/21 04:11
            Start Date: 11/Mar/21 04:11
    Worklog Time Spent: 10m 
      Work Description: dineshchitlangia commented on a change in pull request #2748:
URL: https://github.com/apache/hadoop/pull/2748#discussion_r592050376



##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##########
@@ -356,6 +382,44 @@
         DFSConfigKeys.DFS_NAMENODE_BLOCKS_PER_POSTPONEDBLOCKS_RESCAN_KEY_DEFAULT);
   }
 
+  private void startSlowPeerCollector() {
+    if (slowPeerCollectorDaemon != null) {
+      return;
+    }
+    slowPeerCollectorDaemon = new Daemon(new Runnable() {
+      @Override
+      public void run() {
+        while (true) {
+          try {
+            slowPeers = getSlowPeers();
+          } catch (Exception e) {
+            LOG.error("Slow peers collected failed", e);

Review comment:
       ```suggestion
               LOG.error("Failed to collect slow peers", e);
   ```

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
##########
@@ -2368,6 +2368,37 @@
   </description>
 </property>
 
+<property>
+  <name>dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled</name>
+  <value>false</value>
+  <description>
+    If this is set to true, we will filter out slow nodes
+    when choosing targets for blocks.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.max.slowpeer.collect.nodes</name>
+  <value>5</value>
+  <description>
+    How many slow nodes we will collect for filtering out
+    when choosing targets for blocks.
+
+    It is ignored if dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is false.
+  </description>
+</property>
+
+<property>
+  <name>dfs.namenode.slowpeer.collect.interval</name>
+  <value>30m</value>
+  <description>
+    How offen slow nodes we will collect for filtering out
+    when choosing targets for blocks.

Review comment:
       ```suggestion
        Interval at which the slow peer tracker runs in the background to collect slow peers.
   ```
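
To make the interval wording concrete, here is a minimal sketch of a background collector that refreshes a snapshot of slow peers at a fixed delay. It is not the code in this pull request, which uses a `Daemon` thread with a loop; `SlowPeerCollectorSketch`, `collectOnce`, and the `Supplier` source are hypothetical names, and `ScheduledExecutorService` is swapped in only to keep the example to the standard library. It assumes `intervalMs` is positive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch, not a Hadoop class.
final class SlowPeerCollectorSketch implements AutoCloseable {
  private final Supplier<List<String>> source;
  private final long intervalMs;
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "slow-peer-collector");
        t.setDaemon(true); // do not keep the JVM alive for this thread
        return t;
      });
  // volatile so that reader threads always see the latest snapshot.
  private volatile List<String> slowPeers = Collections.emptyList();

  SlowPeerCollectorSketch(Supplier<List<String>> source, long intervalMs) {
    this.source = source;
    this.intervalMs = intervalMs;
  }

  /** Starts periodic collection; the first run happens immediately. */
  void start() {
    scheduler.scheduleWithFixedDelay(
        this::collectOnce, 0, intervalMs, TimeUnit.MILLISECONDS);
  }

  /** Runs one collection pass; on failure the previous snapshot is kept. */
  void collectOnce() {
    try {
      slowPeers = Collections.unmodifiableList(new ArrayList<>(source.get()));
    } catch (RuntimeException e) {
      // An uncaught exception would cancel all future scheduled runs.
      System.err.println("Failed to collect slow peers: " + e);
    }
  }

  List<String> getSlowPeers() {
    return slowPeers;
  }

  @Override
  public void close() {
    scheduler.shutdownNow();
  }
}
```

Catching the exception inside the task matters here: a scheduled task that throws is silently cancelled, so the snapshot would stop refreshing without any log line.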

##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SlowPeerTracker.java
##########
@@ -233,6 +234,20 @@ public String getSlowNode() {
     }
   }
 
+  /**
+   * Returns all tracking slow peers.
+   * @param numNodes
+   * @return
+   */
+  public ArrayList<String> getSlowNodes(int numNodes) {
+    Collection<ReportForJson> jsonReports = getJsonReports(numNodes);
+    ArrayList<String> slowNodes = new ArrayList<>();
+    for (ReportForJson jsonReport : jsonReports) {
+      slowNodes.add(jsonReport.getSlowNode());
+    }
+    return slowNodes;
+  }

Review comment:
       Somewhere in this method, we should log the slow peers at WARN level so 
that they show up in the NameNode logs.
   This will be useful for admins debugging HDFS performance issues. If they 
see the list of slow nodes reported, they can go straight to the affected 
nodes instead of searching through individual DataNode logs for the various 
kinds of "Slow Block Receiver" messages.
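
A rough sketch of what this could look like, assuming the list of reported nodes is already available. `SlowNodeLoggingSketch`, `describe`, and the message text are hypothetical, and `java.util.logging` stands in for the project's own logger so the example stays self-contained (`Logger.warning` is the closest standard-library equivalent of a WARN-level call).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the tracker; not the Hadoop SlowPeerTracker.
final class SlowNodeLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(SlowNodeLoggingSketch.class.getName());

  /** Builds the log message separately so it can be checked in a test. */
  static String describe(Collection<String> slowNodes) {
    return "Slow nodes list: " + slowNodes;
  }

  /** Returns up to numNodes slow nodes and logs them when any exist. */
  static List<String> getSlowNodes(List<String> reported, int numNodes) {
    int limit = Math.min(Math.max(numNodes, 0), reported.size());
    List<String> slowNodes = new ArrayList<>(reported.subList(0, limit));
    if (!slowNodes.isEmpty()) {
      // Skip the empty case so a healthy cluster does not spam the log.
      LOG.warning(describe(slowNodes));
    }
    return slowNodes;
  }
}
```

Logging only when the list is non-empty keeps the message meaningful: if the method is called on every collection interval, an unconditional WARN would repeat even when nothing is wrong.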






Issue Time Tracking
-------------------

    Worklog Id:     (was: 564282)
    Time Spent: 50m  (was: 40m)

> Exclude slow nodes when choose targets for blocks
> -------------------------------------------------
>
>                 Key: HDFS-15879
>                 URL: https://issues.apache.org/jira/browse/HDFS-15879
>             Project: Hadoop HDFS
>          Issue Type: Wish
>            Reporter: tomscut
>            Assignee: tomscut
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Slow nodes are already monitored, as introduced in 
> [HDFS-11194|https://issues.apache.org/jira/browse/HDFS-11194].
> We can use a thread to periodically collect these slow nodes into a set, then 
> use the set to filter out slow nodes when choosing targets for blocks.
> This feature is configurable and can be turned on when needed.
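
The filtering step described in the issue can be sketched roughly as follows. This is an illustration only, not the block placement policy from the patch: `ExcludeSlowNodesSketch`, `chooseCandidates`, and the use of plain strings for nodes are all hypothetical, and the boolean parameter stands in for the `dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled` setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of filtering targets against a collected slow-node set.
final class ExcludeSlowNodesSketch {
  /**
   * Returns the candidates that may be chosen as block targets.
   * When the feature is disabled, every candidate is kept.
   */
  static List<String> chooseCandidates(
      List<String> candidates, Set<String> slowNodes, boolean excludeSlowNodes) {
    if (!excludeSlowNodes) {
      return new ArrayList<>(candidates);
    }
    List<String> result = new ArrayList<>();
    for (String node : candidates) {
      if (!slowNodes.contains(node)) {
        result.add(node); // keep only nodes not currently reported as slow
      }
    }
    return result;
  }
}
```

Holding the slow nodes in a set keeps each membership check cheap, which matters because the check would run for every candidate considered during target selection.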


