errose28 commented on PR #8755:
URL: https://github.com/apache/ozone/pull/8755#issuecomment-3090840155

   > Our main goal here is to build a dashboard in Recon to show storage usage 
distribution across the cluster. 
   
   We should leave all dashboarding to Grafana instead of reinventing the 
wheel in Recon.
   
   > Recon already uses StorageReport for other usage stats, so we thought it 
would be a good idea to extend it to include pending deletion info as well, 
keeping everything in one place
   
   StorageReport is not a catch-all for implementing metrics collection in 
Recon; it is for reporting information about the disks within datanodes. In 
general, Recon displays current categorical information (keys, containers, 
pipelines, volumes/disks), while metrics and dashboards track numeric 
information (bytes, durations, or event counts) over time. Deletion progress 
falls cleanly into the second category.
   
   > We considered using Prometheus metrics, but based on my understanding, in 
Ozone services these values might reset and become inaccurate after service 
restarts. This can lead to wrong conclusions in the dashboard.
   
   Metrics collection is decoupled from persistence. Yes, we will need to 
persist the counters in datanodes so they do not need to be recomputed on every 
restart, but this is independent of how the metrics are published and consumed 
by other services.
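
As a rough illustration of that separation, here is a minimal sketch in Python (Ozone itself is Java, and none of this is taken from the PR). The class `PersistedCounter`, the function `render_metric`, the JSON state file, and the metric name used below are all hypothetical. The counter reloads its last saved value on startup, and the publishing step only reads the current value, so either side can change without touching the other.

```python
import json
import os
import tempfile


class PersistedCounter:
    """Hypothetical counter whose value survives restarts via a small state file."""

    def __init__(self, path):
        self.path = path
        self.value = self._load()

    def _load(self):
        # On startup, resume from the last persisted value if one exists.
        try:
            with open(self.path) as f:
                return int(json.load(f)["value"])
        except FileNotFoundError:
            return 0

    def add(self, delta):
        # delta may be negative, e.g. when pending bytes are reclaimed.
        self.value += delta
        self._save()

    def _save(self):
        # Write to a temp file, then rename, so a crash cannot leave a partial file.
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"value": self.value}, f)
        os.replace(tmp, self.path)


def render_metric(name, counter):
    """Publishing step: format the current value as one Prometheus text line."""
    return f"{name} {counter.value}"
```

Creating a second `PersistedCounter` on the same path stands in for a service restart: it starts from the saved value rather than zero, and `render_metric("pending_deletion_bytes", counter)` publishes whatever the counter currently holds, with no knowledge of how it was stored.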
   
   > Also I think to avoid dependency on Prometheus service running in this 
case (most customers do not have it set up when they hit issues related to 
deletion or anything related to space reclamation).
   
   If any Ozone users want dashboards, they will need Prometheus and Grafana. 
We do not have the bandwidth to build and maintain our own dashboarding setup 
when quality ones already exist.
   
   This should be a reasonably small change if it is done the way Ozone is 
designed to handle it:
   - Add metrics, with persistence underneath if necessary (this PR)
   - Add the metrics to a dashboard, like 
[this](https://github.com/apache/ozone/blob/master/hadoop-ozone/dist/src/main/compose/common/grafana/dashboards/Ozone%20-%20DeleteKey%20Metrics.json)
 (follow-up PR)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

