In general, I don't think you can get an accurate answer to that question from a histogram.
You can work out which *bucket* the lowest and highest request durations sat in, which means you could give the lower and upper bounds of the minimum, and the lower and upper bounds of the maximum. Just compare the bucket counters at the start and end of the time range, and find the lowest boundary (le) which has changed, and the highest boundary which has changed. But this still doesn't tell you what the *actual* value was. I don't think there's any point in trying to make an estimate of the actual value; these values are, by definition, outliers, so even if your data points fitted a nice distribution, these ones would be at the ends of the curve and subject to high error.

Your LLM answer is essentially what it says in the documentation <https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile> for histogram_quantile:

*You can use histogram_quantile(0, v instant-vector) to get the estimated minimum value stored in a histogram.*
*You can use histogram_quantile(1, v instant-vector) to get the estimated maximum value stored in a histogram.*

I thought it was worth testing. Here is a metric from my home prometheus server, running 2.53.4:

*go_gc_pauses_seconds_bucket* =>

go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 12193
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 15369
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 27038
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 27085
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 27086
go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="+Inf"} 27086

*go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m* =>

{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 5
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 5
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 10
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 10
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
{instance="localhost:9090", job="prometheus", le="+Inf"} 10

*rate(go_gc_pauses_seconds_bucket[10m])* =>

{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 0.007407407407407408
{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 0.014814814814814815
{instance="localhost:9090", job="prometheus", le="+Inf"} 0.014814814814814815
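To make the bucket-comparison idea concrete, here is a rough Python sketch of that calculation. It is purely an illustration (min_max_bounds is just a made-up helper name, not any Prometheus API); the input is the set of 10-minute bucket increases from the offset query above:

# Given the increase of each cumulative bucket counter over the window,
# find the lowest and highest buckets that gained observations, and hence
# the bounds that the true minimum and maximum must lie within.

def min_max_bounds(bucket_increases):
    """bucket_increases: list of (le, cumulative_increase), sorted by le."""
    prev_le, prev_count = 0.0, 0.0
    changed = []                       # (lower, upper) bound of each bucket that grew
    for le, count in bucket_increases:
        if count > prev_count:         # this bucket received new observations
            changed.append((prev_le, le))
        prev_le, prev_count = le, count
    if not changed:
        return None                    # nothing was observed in the window
    return changed[0], changed[-1]     # bounds for the minimum, bounds for the maximum

increases = [                          # the 10-minute increases shown above
    (6.399999999999999e-08, 0),
    (6.399999999999999e-07, 0),
    (7.167999999999999e-06, 5),
    (8.191999999999999e-05, 5),
    (0.0009175039999999999, 10),
    (0.010485759999999998, 10),
    (0.11744051199999998, 10),
    (float("inf"), 10),
]

(min_lo, min_hi), (max_lo, max_hi) = min_max_bounds(increases)
print(f"minimum is somewhere in ({min_lo}, {min_hi}]")   # roughly (6.4e-07, 7.168e-06]
print(f"maximum is somewhere in ({max_lo}, {max_hi}]")   # roughly (8.192e-05, 0.000917504]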
Those exponential bucket boundaries in scientific notation aren't very readable, but you can see that:

* the lowest response time must have been somewhere between 6.399999999999999e-07 and 7.167999999999999e-06
* the highest response time must have been somewhere between 8.191999999999999e-05 and 0.0009175039999999999

Here are the answers from the formula the LLM suggested:

*histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))* =>
{instance="localhost:9090", job="prometheus"} *NaN*

*histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))* =>
{instance="localhost:9090", job="prometheus"} *0.0009175039999999999*

The lower boundary of "NaN" is not useful at all (possibly this is a bug?), but I found I could get a value by specifying a very low, but non-zero, quantile:

*histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))* =>
{instance="localhost:9090", job="prometheus"} *6.40000013056e-07*

Those values *do* sit between the boundaries given:

>>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
True
>>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
True

In fact, the "minimum" answer is very close to the lower edge of the relevant bucket, and the "maximum" is the upper edge of the relevant bucket. Therefore, these are not the *actual* minimum and maximum request times. In effect, they are saying "the minimum request time was *more than* 6.399999999999999e-07, and the maximum request time was *no more than* 0.0009175039999999999". But that's as good as you can get with a histogram.
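For what it's worth, here is my understanding of why histogram_quantile(0, ...) comes back as NaN while histogram_quantile(1, ...) lands exactly on a bucket's upper edge. This is a much-simplified Python sketch of the linear interpolation Prometheus does within classic buckets, not the real implementation (which also fixes up non-monotonic buckets and handles more edge cases), so treat it as an illustration only:

def bucket_quantile(q, buckets):
    """buckets: list of (le, cumulative_count), sorted by le; last le is +Inf."""
    total = buckets[-1][1]
    rank = q * total
    # first bucket whose cumulative count reaches the rank
    b = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if b == len(buckets) - 1:
        return buckets[-2][0]          # rank falls in the +Inf bucket
    lower = 0.0 if b == 0 else buckets[b - 1][0]
    upper = buckets[b][0]
    count_below = 0.0 if b == 0 else buckets[b - 1][1]
    in_bucket = buckets[b][1] - count_below
    if in_bucket == 0:
        return float("nan")            # Go's 0/0 is NaN; this is where q=0 ends up
    # linear interpolation within the bucket
    return lower + (upper - lower) * (rank - count_below) / in_bucket

rates = [                              # rate(go_gc_pauses_seconds_bucket[10m]) from above
    (6.399999999999999e-08, 0.0),
    (6.399999999999999e-07, 0.0),
    (7.167999999999999e-06, 0.007407407407407408),
    (8.191999999999999e-05, 0.007407407407407408),
    (0.0009175039999999999, 0.014814814814814815),
    (0.010485759999999998, 0.014814814814814815),
    (0.11744051199999998, 0.014814814814814815),
    (float("inf"), 0.014814814814814815),
]

print(bucket_quantile(0, rates))       # nan: rank 0 lands in an empty bucket
print(bucket_quantile(1, rates))       # ~0.000917504: upper edge of the highest non-empty bucket
print(bucket_quantile(1e-9, rates))    # ~6.40000013056e-07: a whisker above the lower edge

So the q≈0 and q=1 answers are just the edges of the lowest and highest buckets that saw any observations, which is consistent with the bounds worked out above.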
On Wednesday, 18 June 2025 at 18:17:15 UTC+1 tejaswini vadlamudi wrote:
> Including answer from Gen-AI:
>
> | Description                    | PromQL Query                                                                         | Notes                                                                         |
> |--------------------------------|--------------------------------------------------------------------------------------|-------------------------------------------------------------------------------|
> | Minimum request duration (1m)  | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Fast but may be noisy or return NaN if low traffic. Good for near-real-time.  |
> | Maximum request duration (1m)  | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Same as above, for longest duration estimate.                                 |
> | Minimum request duration (5m)  | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | More stable, smoother estimate over a slightly longer window.                 |
> | Maximum request duration (5m)  | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | Recommended when traffic is bursty or histogram series are sparse.            |
>
> Please confirm if the above answer is reliable or not.
>
> On Wednesday, June 18, 2025 at 3:23:54 PM UTC+2 tejaswini vadlamudi wrote:
>
>> Hi,
>>
>> I'm using Prometheus to monitor request durations via a histogram metric,
>> e.g., http_request_duration_seconds_bucket. I would like to query:
>>
>> - The minimum time taken by a request
>> - The maximum time taken by a request
>>
>> …over a given time range (say, the last 1h or 24h).
>>
>> I understand that histogram buckets give cumulative counts of requests
>> below certain durations, but I'm not sure how to extract the actual min or
>> max values of request durations during a time window.
>>
>> Is this possible directly via PromQL? Or is there a recommended
>> workaround (e.g., recording rules, external processing, or using
>> histogram_quantile() in a specific way)?
>>
>> Thanks in advance for any guidance!
>>
>> Br,
>> Teja