[
https://issues.apache.org/jira/browse/HDDS-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874971#comment-17874971
]
Ethan Rose commented on HDDS-11341:
-----------------------------------
cc [~ritesh] [~muskan.mishra]
> Add dashboard for HDDS health and replication progress
> ------------------------------------------------------
>
> Key: HDDS-11341
> URL: https://issues.apache.org/jira/browse/HDDS-11341
> Project: Apache Ozone
> Issue Type: Improvement
> Components: Ozone Dashboards
> Reporter: Ethan Rose
> Priority: Major
>
> Add a Grafana dashboard to show information about datanode health, ongoing
> and pending replication and reconstruction tasks, and the amount of data
> being moved between nodes due to these tasks. This dashboard will be useful
> to monitor during disk failure, node failure, node decommissioning, and
> maintenance.
> The SCM replication manager likely already has many of the metrics for
> ongoing tasks. We may need to add more metrics to datanodes to monitor tasks
> that are ongoing (not just those that are queued) and the amount of data
> being moved. I think some datanode command queue and handler related metrics
> are unused as well, and those can be checked, removed, or updated as part of
> this PR.