Ethan Rose created HDDS-11341:
---------------------------------
Summary: Add dashboard for HDDS health and replication progress
Key: HDDS-11341
URL: https://issues.apache.org/jira/browse/HDDS-11341
Project: Apache Ozone
Issue Type: Improvement
Components: Ozone Dashboards
Reporter: Ethan Rose
Add a Grafana dashboard that shows datanode health, ongoing and pending
replication and reconstruction tasks, and the amount of data being moved
between nodes by these tasks. The dashboard will be useful to monitor during
disk failure, node failure, node decommissioning, and maintenance.

The SCM replication manager likely already exposes many of the metrics for
ongoing tasks. We may need to add datanode metrics that track tasks in
progress (not just those that are queued) and the amount of data being moved.
I think some datanode command queue and handler metrics are also unused;
those can be checked and then removed or updated as part of this work.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)