[
https://issues.apache.org/jira/browse/HDFS-16582?focusedWorklogId=771713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771713
]
ASF GitHub Bot logged work on HDFS-16582:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 18/May/22 04:27
Start Date: 18/May/22 04:27
Worklog Time Spent: 10m
Work Description: virajjasani opened a new pull request, #4323:
URL: https://github.com/apache/hadoop/pull/4323
### Description of PR
When a datanode is reported as slow by another node, we expose the slow node
along with the list of nodes that reported it. However, we don't expose the
latency of the slow node as measured by each reporting node. Exposing this
latency in the metrics would help operators track how far behind a given slow
node is compared to the rest of the nodes in the cluster.
Operators should be able to gather the aggregated latencies of all slow nodes,
together with their reporting nodes, from the Namenode metrics.
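To make the intended shape of the data concrete, here is a minimal sketch of how per-reporter latencies could be grouped by slow node. It is not the code from this PR: the class `SlowNodeLatencySketch`, the `Report` type, the method `groupBySlowNode`, and the choice of milliseconds as the latency unit are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only; none of these names come from the Hadoop codebase.
final class SlowNodeLatencySketch {

  /** One report: reportingNode observed slowNode with this aggregate latency. */
  static final class Report {
    final String slowNode;
    final String reportingNode;
    final double aggregateLatencyMs; // unit is an assumption

    Report(String slowNode, String reportingNode, double aggregateLatencyMs) {
      this.slowNode = slowNode;
      this.reportingNode = reportingNode;
      this.aggregateLatencyMs = aggregateLatencyMs;
    }
  }

  /**
   * Groups reports as slowNode -> (reportingNode -> latency in ms).
   * If the same reporter appears twice for a slow node, the later report wins.
   */
  static Map<String, Map<String, Double>> groupBySlowNode(List<Report> reports) {
    Map<String, Map<String, Double>> grouped = new TreeMap<>();
    for (Report r : reports) {
      grouped.computeIfAbsent(r.slowNode, k -> new TreeMap<>())
          .put(r.reportingNode, r.aggregateLatencyMs);
    }
    return grouped;
  }

  public static void main(String[] args) {
    List<Report> reports = List.of(
        new Report("dn3:9866", "dn1:9866", 120.5),
        new Report("dn3:9866", "dn2:9866", 98.0),
        new Report("dn4:9866", "dn1:9866", 75.0));
    // Prints each slow node with the latency seen by every reporting node.
    groupBySlowNode(reports).forEach((slow, byReporter) ->
        System.out.println(slow + " <- " + byReporter));
  }
}
```

Keeping the reporting node as the inner key preserves who observed which latency, so an operator can tell whether a node looks slow to every peer or only to one of them.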
### How was this patch tested?
Tested on a dev cluster and with unit tests.
<img width="1488" alt="Screenshot 2022-05-17 at 8 43 09 PM"
src="https://user-images.githubusercontent.com/34790606/168956923-d53e727a-c683-4d99-b075-9b3f776fd9f4.png">
### For code changes:
- [X] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
Issue Time Tracking
-------------------
Worklog Id: (was: 771713)
Remaining Estimate: 0h
Time Spent: 10m
> Expose aggregate latency of slow node as perceived by the reporting node
> ------------------------------------------------------------------------
>
> Key: HDFS-16582
> URL: https://issues.apache.org/jira/browse/HDFS-16582
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a datanode is reported as slow by another node, we expose the slow node
> along with the list of nodes that reported it. However, we don't expose the
> latency of the slow node as measured by each reporting node. Exposing this
> latency in the metrics would help operators track how far behind a given slow
> node is compared to the rest of the nodes in the cluster.
> Operators should be able to gather the aggregated latencies of all slow nodes,
> together with their reporting nodes, from the Namenode metrics.