Remember that histograms don't store values. All they do is increment a 
counter by 1; the value is only used to select which bucket to increment.  
This means that the amount of storage used by a histogram is very small - a 
fixed number of buckets with one counter each. It doesn't matter if you are 
processing 1 sample per second or 10,000 samples per second.
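
As a concrete (hypothetical) illustration, suppose a histogram 
http_request_duration_seconds has bucket boundaries 0.1, 0.5, 1 and +Inf. 
A single observation of 0.3 seconds just increments every cumulative bucket 
whose upper bound is >= 0.3, plus _count (and adds to _sum); the value 0.3 
itself is never stored and cannot be recovered afterwards:

http_request_duration_seconds_bucket{le="0.1"} 0
http_request_duration_seconds_bucket{le="0.5"} 1
http_request_duration_seconds_bucket{le="1"} 1
http_request_duration_seconds_bucket{le="+Inf"} 1
http_request_duration_seconds_sum 0.3
http_request_duration_seconds_count 1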

If you wanted to retrieve the *exact* lowest or highest value, over *any* 
arbitrary time period that you query, you would have to store every single 
value into a database. Prometheus is not an event logging system, and it 
will never work this way. A columnar datastore like ClickHouse can do that 
quite well, but if the number of samples is large, you will still have a 
very large amount of data to store.

More realistically, you could find the minimum or maximum value seen over a 
fixed time period (say one minute), and at the end of that minute, export 
the min/max value seen. That's cheap and quick. Indeed, you could do it 
over a relatively short time period (e.g. 1 second), and use Prometheus's 
min_over_time/max_over_time functions if you want to query a longer period, 
i.e. to find the min of the mins, or the max of the maxes. You need to make 
sure that every distinct min/max value ends up in the database, though: 
either use remote_write to push them, or scrape your exporter at least 
twice as fast as the min/max values change.
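
For example (the metric names here are hypothetical), if your exporter 
publishes gauges myapp_request_duration_min_seconds and 
myapp_request_duration_max_seconds, recomputed once per minute, then 
something like this would roll them up over a longer window:

min_over_time(myapp_request_duration_min_seconds[1h])
max_over_time(myapp_request_duration_max_seconds[1h])

This only works if every one-minute value actually makes it into the TSDB, 
as described above.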

In my experience, people are often not so interested in the single minimum 
or maximum value, but in the quantiles, such as the 1st percentile ("the 
fastest 1% of queries were answered in less than X seconds") or the 99th 
percentile ("the slowest 1% of queries were answered in more than Y 
seconds"). Prometheus can help you using a data type called a "summary":
https://prometheus.io/docs/concepts/metric_types/#summary
https://prometheus.io/docs/practices/histograms/#quantiles

A summary can give you very good estimates of the percentiles over a 
sliding time window (of a size you have to choose in advance), and uses a 
relatively small amount of storage, like a histogram. It has an advantage 
over a histogram when you don't know in advance what the highest and lowest 
values are likely to be (i.e. you don't have to choose your bucket 
boundaries correctly up front).
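
As a rough sketch (the metric name and objectives below are assumptions, 
not taken from your setup): if an application exports a summary 
http_request_duration_seconds configured with 0.01 and 0.99 objectives, 
the quantiles appear as ordinary series carrying a "quantile" label, which 
you can select directly:

http_request_duration_seconds{quantile="0.01"}
http_request_duration_seconds{quantile="0.99"}

One caveat from the docs linked above: these client-side quantiles cannot 
be meaningfully aggregated across instances, unlike histogram buckets.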

On Monday, 23 June 2025 at 08:15:42 UTC+1 tejaswini vadlamudi wrote:

> Thanks Brian, for the clear heads-up and explanation!
>
> It looks to me like there is no way to obtain the exact maximum and 
> minimum duration values from Prometheus histograms :-(
>
> However, for performing exploratory data analysis on the application 
> software, I need these summary statistics, such as minimum and maximum 
> values. Legacy monitoring systems have always had this support, so the 
> expectation is that the new technology fits the use case and preserves 
> backward compatibility. 
>
> Please share what can be done to obtain this information.
>
> I'm thinking out loud, please correct/add wherever possible:
>
> 1. Does changing from Prometheus to OTEL instrumentation provide this 
> feature (exact max and min duration time)?
> 2. Can metrics derived from distributed traces (instrumented with 
> OTEL/Jaeger) be used to obtain minimum and maximum request durations?
> 3. Is it possible to obtain the max and min duration times from Prometheus 
> with any hack?
>       a. For Classic Histograms?
>       b. For Native Histograms?
> 4. A new PR/contribution on Prometheus to offer this support?
>
> Thanks,
> Teja
>
> On Thursday, June 19, 2025 at 6:38:59 PM UTC+2 Brian Candler wrote:
>
>> In general, I don't think you can get an accurate answer to that question 
>> from a histogram.
>>
>> You can work out which *bucket* the lowest and highest request durations 
>> sat in, which means you could give the lower and upper bounds of the 
>> minimum, and the lower and upper bounds of the maximum. Just compare the 
>> bucket counters at the start and end of the time range, and find the lowest 
>> boundary (le) which has changed, and the highest boundary which has 
>> changed. But this still doesn't tell you what the *actual* value was.  
>>
>> I don't think there's any point in trying to make an estimate of the 
>> actual value; these values are, by definition, outliers, so even if your 
>> data points fitted a nice distribution, these ones would be at the ends of 
>> the curve and subject to high error.
>>
>> Your LLM answer is essentially what it says in the documentation 
>> <https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile>
>>  
>> for histogram_quantile:
>>
>> *You can use histogram_quantile(0, v instant-vector) to get the estimated 
>> minimum value stored in a histogram.*
>>
>> *You can use histogram_quantile(1, v instant-vector) to get the estimated 
>> maximum value stored in a histogram.*
>> I thought it was worth testing. Here is a metric from my home prometheus 
>> server, running 2.53.4:
>>
>> *go_gc_pauses_seconds_bucket*
>> =>
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="6.399999999999999e-08"} 0
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="6.399999999999999e-07"} 0
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="7.167999999999999e-06"} 12193
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="8.191999999999999e-05"} 15369
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="0.0009175039999999999"} 27038
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="0.010485759999999998"} 27085
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="0.11744051199999998"} 27086
>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", 
>> le="+Inf"} 27086
>>
>> *go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m*
>> =>
>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 
>> 0
>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 
>> 0
>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 
>> 5
>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 
>> 5
>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 
>> 10
>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 
>> 10
>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
>> {instance="localhost:9090", job="prometheus", le="+Inf"} 10
>>
>> *rate(go_gc_pauses_seconds_bucket[10m])*
>> =>
>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 
>> 0
>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 
>> 0
>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 
>> 0.007407407407407408
>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 
>> 0.007407407407407408
>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 
>> 0.014814814814814815
>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 
>> 0.014814814814814815
>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 
>> 0.014814814814814815
>> {instance="localhost:9090", job="prometheus", le="+Inf"} 
>> 0.014814814814814815
>>
>> Those exponential bucket boundaries in scientific notation aren't very 
>> readable, but you can see that:
>> * the lowest response time must have been somewhere 
>> between 6.399999999999999e-07 and 7.167999999999999e-06
>> * the highest response time must have been somewhere between 
>> 8.191999999999999e-05 and 0.0009175039999999999
>>  
>> Here are the answers from the formula the LLM suggested:
>>
>>
>> *histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))*
>> =>
>> {instance="localhost:9090", job="prometheus"} *NaN*
>>
>> *histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))*
>> =>
>> {instance="localhost:9090", job="prometheus"} *0.0009175039999999999*
>>
>> The lower boundary of "NaN" is not useful at all (possibly this is a 
>> bug?), but I found I could get a value by specifying a very low, but 
>> non-zero, quantile:
>>
>>
>> *histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))*
>> =>
>> {instance="localhost:9090", job="prometheus"} *6.40000013056e-07*
>>
>> Those values *do* sit between the boundaries given:
>>
>> >>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
>> True
>> >>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
>> True
>>
>> In fact, the "minimum" answer is very close to the lower edge of the 
>> relevant bucket, and the "maximum" is the upper edge of the relevant bucket.
>>
>> Therefore, these are not the *actual* minimum and maximum request times. 
>> In effect, they are saying "the minimum request time was *more than* 
>> 6.399999999999999e-07, 
>> and the maximum request time was *no more than* 0.0009175039999999999".  
>> But that's as good as you can get with a histogram.
>>
>> On Wednesday, 18 June 2025 at 18:17:15 UTC+1 tejaswini vadlamudi wrote:
>>
>>> Including an answer from Gen-AI:
>>>
>>> | Description | PromQL Query | Notes |
>>> |---|---|---|
>>> | Minimum request duration (1m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Fast but may be noisy or return NaN if low traffic. Good for near-real-time. |
>>> | Maximum request duration (1m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Same as above, for longest duration estimate. |
>>> | Minimum request duration (5m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | More stable, smoother estimate over a slightly longer window. |
>>> | Maximum request duration (5m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | Recommended when traffic is bursty or histogram series are sparse. |
>>>
>>> Please confirm if the above answer is reliable or not. 
>>> On Wednesday, June 18, 2025 at 3:23:54 PM UTC+2 tejaswini vadlamudi 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I’m using Prometheus to monitor request durations via a histogram 
>>>> metric, e.g., http_request_duration_seconds_bucket. I would like to 
>>>> query:
>>>>
>>>>    - The minimum time taken by a request
>>>>    - The maximum time taken by a request
>>>>
>>>> …over a given time range (say, the last 1h or 24h).
>>>>
>>>> I understand that histogram buckets give cumulative counts of requests 
>>>> below certain durations, but I’m not sure how to extract the actual min or 
>>>> max values of request durations during a time window.
>>>>
>>>> Is this possible directly via PromQL? Or is there a recommended 
>>>> workaround (e.g., recording rules, external processing, or using 
>>>> histogram_quantile() in a specific way)?
>>>>
>>>> Thanks in advance for any guidance!
>>>>
>>>> Br,
>>>> Teja
>>>>
>>>
