dineshchitlangia commented on a change in pull request #2748:
URL: https://github.com/apache/hadoop/pull/2748#discussion_r592050376
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
##########
@@ -356,6 +382,44 @@
DFSConfigKeys.DFS_NAMENODE_BLOCKS_PER_POSTPONEDBLOCKS_RESCAN_KEY_DEFAULT);
}
+ private void startSlowPeerCollector() {
+ if (slowPeerCollectorDaemon != null) {
+ return;
+ }
+ slowPeerCollectorDaemon = new Daemon(new Runnable() {
+ @Override
+ public void run() {
+ while (true) {
+ try {
+ slowPeers = getSlowPeers();
+ } catch (Exception e) {
+ LOG.error("Slow peers collected failed", e);
Review comment:
```suggestion
LOG.error("Failed to collect slow peers", e);
```
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
##########
@@ -2368,6 +2368,37 @@
</description>
</property>
+<property>
+ <name>dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled</name>
+ <value>false</value>
+ <description>
+ If this is set to true, we will filter out slow nodes
+ when choosing targets for blocks.
+ </description>
+</property>
+
+<property>
+ <name>dfs.namenode.max.slowpeer.collect.nodes</name>
+ <value>5</value>
+ <description>
+ How many slow nodes we will collect for filtering out
+ when choosing targets for blocks.
+
+ It is ignored if dfs.namenode.block-placement-policy.exclude-slow-nodes.enabled is false.
+ </description>
+</property>
+
+<property>
+ <name>dfs.namenode.slowpeer.collect.interval</name>
+ <value>30m</value>
+ <description>
+ How offen slow nodes we will collect for filtering out
+ when choosing targets for blocks.
Review comment:
```suggestion
Interval at which the slow peer tracker runs in the background to
collect slow peers.
```
##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SlowPeerTracker.java
##########
@@ -233,6 +234,20 @@ public String getSlowNode() {
}
}
+ /**
+ * Returns all tracking slow peers.
+ * @param numNodes
+ * @return
+ */
+ public ArrayList<String> getSlowNodes(int numNodes) {
+ Collection<ReportForJson> jsonReports = getJsonReports(numNodes);
+ ArrayList<String> slowNodes = new ArrayList<>();
+ for (ReportForJson jsonReport : jsonReports) {
+ slowNodes.add(jsonReport.getSlowNode());
+ }
+ return slowNodes;
+ }
Review comment:
Somewhere in this method, we should log the slow peers at WARN level so that
they show up in the Namenode logs.
This will be useful for admins debugging HDFS performance issues. If
they see the list of slow nodes reported, they can go straight to the affected
nodes instead of digging through the logs of various datanodes for
"Slow Block Receiver"-style messages.
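A rough sketch of what that could look like, not a drop-in patch. It is written as a standalone class so it compiles on its own: `Report` is a hypothetical stand-in for `ReportForJson`, `java.util.logging` stands in for the class's existing `LOG`, and the message wording is only a suggestion.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.logging.Logger;

class SlowNodeLoggingSketch {
  // Stand-in logger (assumption); the real method would reuse the existing LOG.
  private static final Logger LOG =
      Logger.getLogger(SlowNodeLoggingSketch.class.getName());

  /** Hypothetical stand-in for ReportForJson; only getSlowNode() is needed. */
  static final class Report {
    private final String slowNode;

    Report(String slowNode) {
      this.slowNode = slowNode;
    }

    String getSlowNode() {
      return slowNode;
    }
  }

  /** Collects the slow node names and logs them once at WARN level. */
  static List<String> getSlowNodes(Collection<Report> reports) {
    List<String> slowNodes = new ArrayList<>();
    for (Report report : reports) {
      slowNodes.add(report.getSlowNode());
    }
    // Skip the log line when nothing is slow, to avoid noise on every run.
    if (!slowNodes.isEmpty()) {
      LOG.warning("Slow nodes list: " + slowNodes);
    }
    return slowNodes;
  }
}
```

Logging only when the list is non-empty keeps the Namenode log quiet on healthy clusters, since the collector runs periodically.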