You can break your query down into parts to find out what's happening. In the PromQL web interface, you can enter a range query like requests_total[10m15s] and you'll see the raw, timestamped data points in that time window. (You must be in the "Table" view rather than the "Graph" view.) Similarly, you can run subqueries like rate(requests_total[5m15s])[10m:5m]. And finally, you can average those values by hand and compare the result to running avg_over_time on that expression. Of course, set the query evaluation time to a fixed point in time, so they all align.

What I'd expect to see is:

- 41 data points in the first range query: let's call them x0 to x40 (from oldest to newest)
- a rate calculated between x0 and x20, which I'd expect equals (x20-x0)/300 in the absence of counter resets
- a rate calculated between x20 and x40, which I'd expect equals (x40-x20)/300

If you average these two rates you should get

((x20-x0)/300 + (x40-x20)/300)/2 = (x40-x0)/600

which is the rate you're looking for. By doing each of the steps by hand, you should be able to work out where your assumptions are falling down.

On Friday, 30 June 2023 at 00:28:34 UTC+1 Daniel Sabsay wrote:

> Thanks for the reply.
>
> > The underlying requirement you have is, I think, given a collection of
> > recorded rate[5m] values, can you turn this into a rate[1d]?
>
> Yes, that’s correct.
>
> Regarding the 5m vs. 4m: I tried to adjust for that by increasing the rate
> window by one scrape interval (in this case 15s). The result is still quite
> a ways off:
>
> avg_over_time(rate(requests_total[315s])[10m:5m])*(10*60) => 1820
> (expected 2153.84)
>
> I also tried this, to include more “slices” in the subquery. Still off:
>
> avg_over_time(rate(requests_total[75s])[10m:1m])*(10*60) => 2045.45
> (expected 2153.84)
>
> I suspect there is some mathematical property I can’t articulate yet that
> means this won’t ever be 100% accurate, even if we adjust for what PromQL
> is doing. I interpreted your last comment as suggesting that the short time
> period and choice of rate windows was the problem.
> While those things do affect the result (see above), I can’t find a
> combination that produces the expected result. Could you elaborate on what
> you meant?
>
> On Thursday, June 29, 2023 at 12:27:43 AM UTC-7 Brian Candler wrote:
>
>> Not in the presence of counter resets, no.
>>
>> > My full question, reason for the question, and experiment is here:
>> > https://github.com/dsabsay/prom_rates_and_counters/blob/main/README.md
>>
>> "In other words, can one get the equivalent of increase(some_metric[1d])
>> by using the output of a recording rule like rate(some_metric[5m])?"
>>
>> I think that increase(...) doesn't work in the way you think it does.
>>
>> increase(...) and rate(...) are the same thing, only differing by a
>> factor of the time window. That is: increase(some_metric[1d]) is *exactly*
>> the same as rate(some_metric[1d])*86400.
>>
>> To find the exact difference between a metric now and 1d ago, you can use
>> some_metric - some_metric offset 1d
>>
>> However, that does not work across counter resets, for obvious reasons.
>>
>> The underlying requirement you have is, I think, given a collection of
>> recorded rate[5m] values, can you turn this into a rate[1d]? I think the
>> avg_over_time() of those rates is the best you can do. If these are 5m
>> rates, then you'd want 5-minute steps: avg_over_time( [ xx : 5m] )
>>
>> But you are testing this over very short time periods (10m) and therefore
>> it's not going to be exact. In particular, rate([5m]) takes the rate
>> between the first and last data points in a 5-minute window. This means
>> that if you are scraping at 1-minute intervals, you're actually
>> calculating a rate over a 4-minute period.
>>
>> On Thursday, 29 June 2023 at 07:22:43 UTC+1 Daniel Sabsay wrote:
>>
>>> Is it possible to accurately calculate original counts from pre-recorded
>>> rates?
>>> My experiments suggest the answer is no.
>>> But I’m curious to get other perspectives to see if I’ve overlooked
>>> something or if there is a more effective way to approach this.
>>>
>>> My full question, reason for the question, and experiment is here:
>>> https://github.com/dsabsay/prom_rates_and_counters/blob/main/README.md
>>>
>>> Thanks!

--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/eb76d93e-c943-4b92-b74c-65df57f8bf38n%40googlegroups.com.
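The by-hand arithmetic from the first reply above can be sketched in plain Python (this is an illustrative sketch, not PromQL or Prometheus's actual implementation; the counter values are made up, and any counter without resets behaves the same way):

```python
# 41 samples x0..x40 taken 15s apart, as in the requests_total[10m15s]
# range query described above. Values are invented for illustration.
samples = [3.5 * i for i in range(41)]  # x0..x40, counter with no resets

rate_a = (samples[20] - samples[0]) / 300   # rate over the older 5m window
rate_b = (samples[40] - samples[20]) / 300  # rate over the newer 5m window

avg_of_rates = (rate_a + rate_b) / 2
rate_10m = (samples[40] - samples[0]) / 600  # the 10m rate being sought

# The identity from the thread:
# ((x20-x0)/300 + (x40-x20)/300)/2 == (x40-x0)/600
assert abs(avg_of_rates - rate_10m) < 1e-12
```

The assertion holds because the x20 terms cancel when the two rates are averaged, which is why averaging back-to-back equal-width rates recovers the wider rate exactly (in the absence of counter resets and misaligned windows).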
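The "5m window only covers 4m of data" effect mentioned in the thread can also be checked numerically. This sketch uses the simple first-to-last model of rate() described above with invented numbers; real Prometheus extrapolates toward the window boundaries, which softens this effect but is not modelled here:

```python
# A counter increasing by exactly 2 units/second, scraped every 60s.
# Suppose the scrapes inside one 300s (5m) window land at offsets
# 30, 90, 150, 210, 270 seconds from the window start: five samples
# spanning only 240s (4m) of data.
true_rate = 2.0
offsets = [30, 90, 150, 210, 270]
values = [true_rate * t for t in offsets]

# First-to-last over the nominal window length, as described in the thread:
naive_rate = (values[-1] - values[0]) / 300

# 240s of observed change divided by a 300s window: a 20% underestimate.
assert naive_rate == true_rate * 240 / 300
```

This is the kind of systematic shortfall that widening the window by one scrape interval (e.g. [315s]) tries to compensate for, as discussed above.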