[ 
https://issues.apache.org/jira/browse/SLING-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508795#comment-17508795
 ] 

Joerg Hoh commented on SLING-11192:
-----------------------------------

These metrics are all defined in 
[GaugeSupport|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/stats/GaugeSupport.java]
 and delegate to 
[StatisticsImpl|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/stats/StatisticsImpl.java],
 where many methods are synchronized.

[~stefanegli] do you know why this is the case? I don't see a reason why so 
many of these read operations are synchronized. 


But even then, almost no calculation is done; only variables are read. So there 
is definitely no heavy calculation.
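
To illustrate what unsynchronized reads could look like, here is a minimal sketch. It is not the actual StatisticsImpl code: the class name JobCounters and its methods are hypothetical, and it assumes the counters can be kept in java.util.concurrent.atomic.AtomicLong fields.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical class for illustration only; not the actual StatisticsImpl.
class JobCounters {
    private final AtomicLong activeJobs = new AtomicLong();
    private final AtomicLong cancelledJobs = new AtomicLong();

    // Writers update each counter atomically, without taking a monitor lock.
    void jobStarted() {
        activeJobs.incrementAndGet();
    }

    void jobCancelled() {
        activeJobs.decrementAndGet();
        cancelledJobs.incrementAndGet();
    }

    // Readers (for example a gauge) only read a value and never wait for a writer.
    long getNumberOfActiveJobs() {
        return activeJobs.get();
    }

    long getNumberOfCancelledJobs() {
        return cancelledJobs.get();
    }
}
```

With this approach a reader may briefly see the two counters in a slightly inconsistent state relative to each other, since they are updated independently. For metrics reporting that is probably acceptable, but that is an assumption that would need to be checked against how the statistics are used elsewhere.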

> Calculating metrics takes too long
> ----------------------------------
>
>                 Key: SLING-11192
>                 URL: https://issues.apache.org/jira/browse/SLING-11192
>             Project: Sling
>          Issue Type: Improvement
>          Components: Event
>    Affects Versions: Event 4.2.24
>            Reporter: Joerg Hoh
>            Priority: Major
>
> We use the Prometheus exporter to export Sling Metrics / Dropwizard metrics, 
> and we often see messages like this:
> {noformat}
> 10.03.2022 08:50:15.333 [...] *WARN* [qtp568481508-1779] 
> io.prometheus.client.dropwizard.DropwizardExports Gauge has been blacklisted 
> for 300000 ms due timeout:  Generated from Dropwizard metric import 
> (metric=sling_event.jobs.cancelled.count, 
> type=org.apache.sling.event.impl.jobs.stats.GaugeSupport$2) 
> {noformat}
> This means that calculating the metric took too long. We should make sure 
> that the calculation is done asynchronously and that only pre-computed values 
> are returned.
> For at least these values the handling needs to be improved:
> * sling_event.jobs.active.count
> * sling_event.jobs.averageProcessingTime
> * sling_event.jobs.averageWaitingTime
> * sling_event.jobs.cancelled.count



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
