jojochuang commented on PR #10427:
URL: https://github.com/apache/ozone/pull/10427#issuecomment-4724820435
Compared against the existing dashboards in the dashboards directory, this
new dashboard is a direct, modernized replacement for (and duplicate of):
• Ozone - Overall Metrics.json
Here is a detailed comparison of the overlaps:
### 1. Duplication & Improvements Over Ozone - Overall Metrics
The new Ozone - Status overview dashboard monitors the same categories of
high-level cluster metrics as Ozone - Overall Metrics, but with updated and
more precise queries:
• Disk/Storage Capacity:
  • Old: Used/Capacity via the SCM metrics scm_node_manager_disk_used /
    scm_node_manager_disk_capacity.
  • New: Uses volume-level metrics (volume_info_metrics_*_capacity/used)
    for more accuracy, with a fallback to the SCM disk metrics.
• DataNode Counts:
  • Old: Count of nodes in service (in_service_[[node_type]]_nodes).
  • New: Plots total, decommissioning, and dead nodes.
• Ozone Manager Namespace:
  • Old: Counts for directories, files, volumes, and buckets.
  • New: Plots Keys (with a fallback to the RocksDB estimate
    rocksdb_om_db_keytable_estimate_num_keys to reduce the performance
    impact of the query), Volumes, and Buckets.
• Containers:
  • Old: Counts via scm_container_metrics_[[container_type]]_containers.
  • New: Counts by detailed Replication Manager states (open, closing,
    quasi-closed, closed, deleting, deleted, recovering).
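
For illustration only, the "primary metric with a fallback" pattern mentioned above can be expressed in PromQL with the `or` operator. The sketch below builds such an expression as a string. It is an assumption about how a fallback could be written, not a copy of the dashboard's actual queries, and `primary_key_count_metric` is a hypothetical placeholder name.

```python
def with_fallback(primary: str, fallback: str) -> str:
    """Build a PromQL expression that prefers `primary` and falls back to `fallback`.

    Both sides are wrapped in sum() so they share the same (empty) label set.
    PromQL's `or` then keeps the left-hand sample when one exists and uses the
    right-hand sample only when the left-hand side returns nothing.
    """
    return f"sum({primary}) or sum({fallback})"


# Hypothetical placeholder: NOT a real Ozone metric name.
PRIMARY_KEY_METRIC = "primary_key_count_metric"
# Fallback metric named in the comparison above.
FALLBACK_KEY_METRIC = "rocksdb_om_db_keytable_estimate_num_keys"

key_count_expr = with_fallback(PRIMARY_KEY_METRIC, FALLBACK_KEY_METRIC)
print(key_count_expr)
```

The aggregation on both sides matters: without it, `or` matches on label sets, so series from the fallback metric with labels absent from the primary would be returned alongside the primary instead of replacing it.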
### 2. What the New Dashboard Adds
• Disk/Volume Health: Plots total, healthy, and failed disks.
• S3 Gateway Instances: Plots the number of S3 Gateway instances online.
• Deletion Backlog: Adds a section tracking space pending reclaim (blocks
queued/pending on DNs, deleted keys sent for purge, DELETING containers).
### 3. Redundancies Offloaded
The old Ozone - Overall Metrics dashboard contained general panels for
RPC, RATIS, and raw DataNode Chunk/Block counts. The new Ozone - Status
overview intentionally omits these since they are now comprehensively
monitored in:
• Ozone - RPC Metrics.json
• Ozone - SCM overview.json (added in PR #10382)
• Ozone - DataNode Overview.json (added in PR #10314)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]