huangzhaobo99 opened a new pull request, #6364: URL: https://github.com/apache/hadoop/pull/6364
### Description of PR

1. Record how many times the `DatanodeManager#slowPeerCollectorDaemon` thread has collected each SlowNode, and expose the counts as a map.
2. The same SlowNode may show up repeatedly in production, so when `slowPeerCollectorDaemon` is enabled, record how many times each node has been collected by that thread. If a node is collected too often, SRE or DEV need to repair the machine.
3. The following figure shows the SlowNodes collected by the `slowPeerCollectorDaemon` thread over different time periods. (If `DataNodeWriteXceiversCount` is 0, the node is receiving no write requests, which indicates a SlowNode.)

<img width="1213" alt="image" src="https://github.com/apache/hadoop/assets/63718681/f383874b-3f05-4d04-b963-c6c9430d2836">

### How was this patch tested?

Added a unit test.
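
For readers who want a concrete picture of the per-node counting described above, here is a minimal sketch. It is not the code from this patch: the class name `SlowNodeCollectCounter`, its methods, and the use of plain strings as node identifiers are all hypothetical, chosen only to illustrate keeping a map from SlowNode to the number of times it has been collected.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical helper (not part of Hadoop): counts how often each node
 * appears in the slow-node set produced by a periodic collector thread.
 */
public class SlowNodeCollectCounter {
  // Node identifier (assumed here to be a "host:port" string) -> times collected.
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

  /** Call once per collection round with the nodes reported as slow. */
  public void recordCollection(Collection<String> slowNodes) {
    for (String node : slowNodes) {
      // Create the counter on first sight, then increment it.
      counts.computeIfAbsent(node, k -> new LongAdder()).increment();
    }
  }

  /** Read-only copy of the counts, suitable for display as a metric. */
  public Map<String, Long> snapshot() {
    Map<String, Long> result = new HashMap<>();
    counts.forEach((node, adder) -> result.put(node, adder.sum()));
    return Collections.unmodifiableMap(result);
  }
}
```

In this sketch the collector thread would call `recordCollection` after each round, and a monitoring endpoint would read `snapshot()`. A node whose count keeps growing is a candidate for repair.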
