k5342 commented on pull request #3088:
URL: https://github.com/apache/ozone/pull/3088#issuecomment-1042491770


   > if we have a metric system pulling this data every N seconds, would it be 
better to cache this info rather than do a SQL lookup everytime?
   
   Yes, I can put a cache inside `getMetrics()` but MetricsSystem already has a 
similar feature [^1] so individual cache may be out of the way of MetricsSystem.
   
   MetricsSystem defaults to expose metrics from sources every 10 seconds by 
`publishMetrics()` but I think `PrometheusServlet#doGet()` uses 
`publishMetricsNow()` [^2] and this disables the feature. Why is this used 
here? Could we switch to `publishMetrics()` instead?
   
   If we switch to `publishMetrics()`:
   - Pros) Simple. We don’t need to worry about individual cache on each 
metrics.
   - Cons) The metric resolution would become 10 seconds (that is configurable. 
I guess), which may let us miss important events for users who use short scrape 
interval (but, how many users? – Prometheus defaults to scrape in 1 min). This 
change is given to PrometheusServlet, which affects every component e.g. 
OM/SCM/DN.
   
   [^1]: 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L360-L397
   [^2]: 
https://github.com/apache/ozone/blob/129325829ebdbea5359db2081d2f9f3850d5bb1b/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/server/http/PrometheusServlet.java#L61


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to