virajjasani opened a new pull request, #4323: URL: https://github.com/apache/hadoop/pull/4323
### Description of PR

When a datanode is reported as slow by another node, we expose the slow node along with the list of nodes that reported it. However, we do not expose the latency of the slow node as measured by each reporting node. Exposing this latency in the metrics would help operators track how far behind a given slow node is compared to the rest of the nodes in the cluster. Operators should be able to gather the aggregated latencies of all slow nodes, together with their reporting nodes, from the Namenode metrics.

### How was this patch tested?

Dev cluster and UT.

<img width="1488" alt="Screenshot 2022-05-17 at 8 43 09 PM" src="https://user-images.githubusercontent.com/34790606/168956923-d53e727a-c683-4d99-b075-9b3f776fd9f4.png">

### For code changes:

- [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
