jojochuang commented on PR #10427:
URL: https://github.com/apache/ozone/pull/10427#issuecomment-4724820435

     After comparing it against the existing dashboards in the dashboards directory, this new dashboard is a direct, modern replacement/duplicate of:
   
     • Ozone - Overall Metrics.json
   
     Here is a detailed comparison of the overlaps:
   
     ### 1. Duplication & Improvements Over Ozone - Overall Metrics
   
     The new Ozone - Status overview dashboard monitors the same categories of high-level cluster metrics as Ozone - Overall Metrics, but with updated and more precise queries:
   
     • Disk/Storage Capacity:
         • Old: Used/Capacity via SCM scm_node_manager_disk_used / scm_node_manager_disk_capacity.
         • New: Uses volume-level metrics (volume_info_metrics_*_capacity/used) for more accuracy, with a fallback to the SCM disk metrics.
     • DataNode Counts:
         • Old: Count of nodes in service (in_service_[[node_type]]_nodes).
         • New: Plots total, decommissioning, and dead nodes.
     • Ozone Manager Namespace:
         • Old: Counts for directories, files, volumes, and buckets.
         • New: Plots Keys (with a fallback to the RocksDB estimate rocksdb_om_db_keytable_estimate_num_keys to reduce the performance impact of the query), Volumes, and Buckets.
     • Containers:
         • Old: Counts via scm_container_metrics_[[container_type]]_containers.
         • New: Counts by detailed Replication Manager states (open, closing, quasi-closed, closed, deleting, deleted, recovering).
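
The "fallback" pattern mentioned in the bullets above can be sketched as follows. This is a minimal illustration, not the dashboard's actual panel definition: it assumes the fallback is expressed with PromQL's `or` operator over `sum(...)` aggregations, and the helper names and the volume-level metric names passed in are hypothetical placeholders (the comment only gives them as a wildcard pattern). Only the two `scm_node_manager_disk_*` names come from the comment itself.

```python
# Sketch: build PromQL strings that prefer one metric and fall back to another.
# Assumption: the dashboard uses PromQL's `or` operator for the fallback.
# All function names here are hypothetical helpers for illustration only.


def sum_of(metric: str) -> str:
    """Aggregate a metric across all series into a single label-less value."""
    return f"sum({metric})"


def with_fallback(primary: str, fallback: str) -> str:
    """Return an expression yielding `primary` when it has data, else `fallback`.

    PromQL `or` keeps every series from the left side and adds right-side
    series whose label sets are absent on the left. Both sides here are
    label-less sums, so the right side only appears when the left is empty.
    """
    return f"({primary}) or ({fallback})"


def disk_queries(volume_used: str, volume_capacity: str) -> tuple[str, str]:
    """Build (used, capacity) queries with the SCM disk metrics as fallback.

    `volume_used` and `volume_capacity` are caller-supplied metric names;
    the real volume-level names are not spelled out in the comment.
    """
    used = with_fallback(sum_of(volume_used), sum_of("scm_node_manager_disk_used"))
    capacity = with_fallback(
        sum_of(volume_capacity), sum_of("scm_node_manager_disk_capacity")
    )
    return used, capacity


if __name__ == "__main__":
    # Placeholder names, to be replaced with the real volume-level metrics.
    used_q, capacity_q = disk_queries("<volume_used_metric>", "<volume_capacity_metric>")
    print(used_q)
    print(capacity_q)
```

The same shape would apply to the Keys panel, with rocksdb_om_db_keytable_estimate_num_keys as the fallback side, if that panel is built the same way.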
   
   
     ### 2. What the New Dashboard Adds
   
     • Disk/Volume Health: Plots total, healthy, and failed disks.
     • S3 Gateway Instances: Plots the number of S3 Gateway instances online.
     • Deletion Backlog: Adds a section tracking space pending reclaim (blocks queued/pending on DNs, deleted keys sent for purge, DELETING containers).
   
     ### 3. Redundancies Offloaded
   
     The old Ozone - Overall Metrics dashboard contained general panels for RPC, RATIS, and raw DataNode Chunk/Block counts. The new Ozone - Status overview intentionally omits these since they are now comprehensively monitored in:
   
     • Ozone - RPC Metrics.json
     • Ozone - SCM overview.json (added in PR #10382)
     • Ozone - DataNode Overview.json (added in PR #10314)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

