saintstack commented on code in PR #4107:
URL: https://github.com/apache/hadoop/pull/4107#discussion_r860008767
##########
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md:
##########
@@ -394,7 +394,7 @@ Usage:
| COMMAND\_OPTION | Description |
|:---- |:---- |
-| `-report` `[-live]` `[-dead]` `[-decommissioning]` `[-enteringmaintenance]`
`[-inmaintenance]` | Reports basic filesystem information and statistics, The
dfs usage can be different from "du" usage, because it measures raw space used
by replication, checksums, snapshots and etc. on all the DNs. Optional flags
may be used to filter the list of displayed DataNodes. |
+| `-report` `[-live]` `[-dead]` `[-decommissioning]` `[-enteringmaintenance]`
`[-inmaintenance]` `[-slownodes]` | Reports basic filesystem information and
statistics, The dfs usage can be different from "du" usage, because it measures
raw space used by replication, checksums, snapshots and etc. on all the DNs.
Optional flags may be used to filter the list of displayed DataNodes. Filters
are either based on the DN state (e.g. live, dead, decommissioning) or the
nature of the DN (e.g. slow nodes). |
Review Comment:
Where do we explain a 'slownode' is?
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodePeerMetrics.java:
##########
@@ -142,14 +144,28 @@ public void collectThreadLocalStates() {
* than their peers.
*/
public Map<String, Double> getOutliers() {
- // This maps the metric name to the aggregate latency.
- // The metric name is the datanode ID.
- final Map<String, Double> stats =
- sendPacketDownstreamRollingAverages.getStats(
- minOutlierDetectionSamples);
- LOG.trace("DataNodePeerMetrics: Got stats: {}", stats);
-
- return slowNodeDetector.getOutliers(stats);
+ // outlier must be null for source code.
+ if (testOutlier == null) {
+ // This maps the metric name to the aggregate latency.
+ // The metric name is the datanode ID.
+ final Map<String, Double> stats =
+
sendPacketDownstreamRollingAverages.getStats(minOutlierDetectionSamples);
+ LOG.trace("DataNodePeerMetrics: Got stats: {}", stats);
+ return slowNodeDetector.getOutliers(stats);
+ } else {
+ // this happens only for test code.
+ return testOutlier;
+ }
+ }
+
+ /**
+ * Strictly to be used by test code only. Source code is not supposed to use
this. This method
+ * directly sets outlier mapping so that aggregate latency metrics are not
calculated for tests.
+ *
+ * @param outlier outlier directly set by tests.
+ */
+ public void setTestOutliers(Map<String, Double> outlier) {
Review Comment:
This is a little awkward? Add comment on the testOutlier data member that it
is for test only?
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java:
##########
@@ -632,6 +638,20 @@ private static void
printDataNodeReports(DistributedFileSystem dfs,
}
}
+ private static void printSlowDataNodeReports(DistributedFileSystem dfs,
boolean listNodes,
Review Comment:
One comment on the slow datanode report is that it seems to say nothing
about why the NN thinks it slow; it does not note the attributes that are in
excess of normal. Should it? (Hard to do when this info is not part of
DatanodeInfo) For example, say something about how in excess a DNs latency is?
(Perhaps this could be added later)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]