[ https://issues.apache.org/jira/browse/HDFS-15745?focusedWorklogId=727965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-727965 ]
ASF GitHub Bot logged work on HDFS-15745:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Feb/22 00:42
            Start Date: 16/Feb/22 00:42
    Worklog Time Spent: 10m
      Work Description: tasanuma merged pull request #3992:
URL: https://github.com/apache/hadoop/pull/3992

Issue Time Tracking
-------------------

    Worklog Id:     (was: 727965)
    Time Spent: 50m  (was: 40m)

> Make DataNodePeerMetrics#LOW_THRESHOLD_MS and MIN_OUTLIER_DETECTION_NODES configurable
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-15745
>                 URL: https://issues.apache.org/jira/browse/HDFS-15745
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haibin Huang
>            Assignee: Haibin Huang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: HDFS-15745-001.patch, HDFS-15745-002.patch,
> HDFS-15745-003.patch, HDFS-15745-branch-3.1.001.patch,
> HDFS-15745-branch-3.2.001.patch, HDFS-15745-branch-3.3.001.patch,
> image-2020-12-22-17-00-50-796.png
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> When I enabled DataNodePeerMetrics to find slow peers in the cluster, I found
> that many peers were reported as slow even though the ReportingNodes'
> averageDelay was very low and the reported nodes were healthy. I think so many
> slow peers are generated because DataNodePeerMetrics#LOW_THRESHOLD_MS is too
> small (only 5ms) and is not configurable. The default slow I/O warning log
> threshold is 300ms, i.e.
> DFSConfigKeys.DFS_DATANODE_SLOW_IO_WARNING_THRESHOLD_DEFAULT = 300, so
> DataNodePeerMetrics#LOW_THRESHOLD_MS should not be less than 300ms; otherwise
> the NameNode will receive a lot of invalid slow peer information.
> !image-2020-12-22-17-00-50-796.png!
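The reasoning in the issue description can be illustrated with a small, self-contained sketch. This is not Hadoop's actual outlier detection code: the class `SlowPeerSketch`, the method `findSlowPeers`, the "3 x median" rule, and the sample latencies are all hypothetical and chosen only to show why a 5ms floor can flag healthy peers while a 300ms floor does not, and how a minimum node count gates detection.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified illustration; NOT the Hadoop implementation.
public class SlowPeerSketch {

    /**
     * Returns the peers whose latency exceeds max(lowThresholdMs, 3 * median).
     * Detection is skipped when fewer than minOutlierDetectionNodes peers report.
     */
    public static List<String> findSlowPeers(Map<String, Double> latenciesMs,
                                             double lowThresholdMs,
                                             int minOutlierDetectionNodes) {
        List<String> slow = new ArrayList<>();
        if (latenciesMs.isEmpty() || latenciesMs.size() < minOutlierDetectionNodes) {
            return slow; // too few samples to judge outliers
        }

        // Median of the reported latencies.
        double[] sorted = latenciesMs.values().stream()
                .mapToDouble(Double::doubleValue).sorted().toArray();
        int n = sorted.length;
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

        // The low threshold acts as a floor on the outlier cutoff.
        double upperLimitMs = Math.max(lowThresholdMs, 3.0 * median);

        // Iterate in sorted key order so the result is deterministic.
        for (Map.Entry<String, Double> e : new TreeMap<>(latenciesMs).entrySet()) {
            if (e.getValue() > upperLimitMs) {
                slow.add(e.getKey());
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        // Made-up latencies: every node is fast in absolute terms.
        Map<String, Double> latenciesMs = new HashMap<>();
        latenciesMs.put("dn1", 1.0);
        latenciesMs.put("dn2", 1.0);
        latenciesMs.put("dn3", 1.0);
        latenciesMs.put("dn4", 6.0);

        System.out.println("floor 5ms:   " + findSlowPeers(latenciesMs, 5.0, 3));
        System.out.println("floor 300ms: " + findSlowPeers(latenciesMs, 300.0, 3));
    }
}
```

Under these assumptions, a node at 6ms is an outlier relative to its 1ms neighbours but is nowhere near the 300ms slow I/O warning level, which is the kind of invalid report the description refers to.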