Brian Candler wrote:
> 
> > Hello Brian! I don't quite understand why "(foo == N)[2d:1m]" or even 
> > "(foo == N)[2d:]" is allowed while "(foo == N)[2d]" is not?
> 
> 
> foo[2d] is a range vector.  It gives you all the individual timestamped 
> data points belonging to all timeseries for metric "foo" within a time 
> period from evaluation time T to T-2d.
> 
> However, a range selector such as [2d] can *only* be applied to a plain
> metric selector, not to an expression.  "foo == N" is an expression which
> generates an instant vector at some evaluation time T.
> 
> The reason for this limitation becomes clear when you consider expressions 
> which calculate across multiple timeseries, such as
>     sum(foo)
> or
>    foo / bar
> 
> Metrics "foo" and "bar" compromise multiple timeseries, identified by 
> different label sets.  However within each timeseries, the data points have 
> their own unique timestamps: the data points in foo{bar="a"} were not 
> necessarily scraped at the same time as foo{bar="b"}.

You probably meant "comprise"?

> 
> Therefore, the only possible way to do arithmetic across timeseries is to 
> pick some arbitrary evaluation time T, take the value of those timeseries 
> at that same point T, and give the result timestamped with T.  A subquery 
> lets you repeat that across a time window: it scans across the window at 
> intervals of some step S, repeating the calculation at those times.
> 
> What is the value of a timeseries at time T, given that it may not have a 
> data point at exactly T? It's the value of the most recent data point on 
> *or before* time T, looking back no more than the staleness window (by 
> default 5 minutes).
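
To make the lookup rule above concrete, here is a minimal Python sketch of "the value of a timeseries at time T". It is only a simulation under stated assumptions, not Prometheus code: the helper value_at, the constant LOOKBACK_SECONDS and the two sample series are hypothetical, timestamps are plain seconds, and staleness markers and exact window-boundary handling are ignored.

```python
from bisect import bisect_right

LOOKBACK_SECONDS = 300  # assumption: the default 5-minute staleness window


def value_at(samples, t, lookback=LOOKBACK_SECONDS):
    """Return the value of the most recent sample on or before t.

    samples is a list of (timestamp, value) pairs sorted by timestamp.
    Returns None when there is no sample on or before t, or when the
    most recent one is older than the lookback window.
    """
    timestamps = [ts for ts, _ in samples]
    i = bisect_right(timestamps, t)  # number of samples with ts <= t
    if i == 0:
        return None
    ts, value = samples[i - 1]
    if t - ts > lookback:
        return None
    return value


# Two hypothetical series of the same metric, scraped at different offsets.
foo_a = [(0, 1.0), (15, 2.0), (30, 3.0)]
foo_b = [(7, 10.0), (22, 20.0), (37, 30.0)]

# Arithmetic across series only works at one shared evaluation time T.
t = 25
total = value_at(foo_a, t) + value_at(foo_b, t)
print(total)  # 22.0
```

Neither series has a sample at exactly T=25, so each contributes its latest earlier sample (from 15 and 22 respectively), and the result is stamped with T.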

Thank you, this was very educational albeit a bit difficult to grasp.

> > > What this does is evaluate the expression foo == N at the current time T,
> > > at time T-1m, at time T-2m etc. In the results, this won't give you the
> > > *exact* time that the data point occurred: it will give you a timestamp of
> > > T-Nm, which will be up to 1 minute after the timestamp of the point
> > > itself. (The value of a timeseries at time T is the value of the most
> > > recent data point on or before time T).
> >
> > Sounds fine with me if it does not skip/hide peaks but shows the time 
> > nearest to the peak. Does it?
> 
> 
> It won't be the time "nearest" the peak, but the first sampling time 
> *after* the peak.  That is, if you have a 15 second step in your subquery, 
> and a 15 second sampling interval, then the timestamp could be up to 14.99 
> seconds after the event.
> 
> You can prove this to yourself by comparing the timestamps of
> 
> foo[2d]
> foo[2d:15s]
> 
> Look for the corresponding peaks / data points.
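
The timestamp shift described above can be sketched in Python. This is a simulation under assumptions, not Prometheus code: value_at and subquery are hypothetical helpers, timestamps are integer seconds, evaluation times are taken to be multiples of the step, and the raw samples are invented for illustration.

```python
def value_at(samples, t, lookback=300):
    """Value of the most recent (timestamp, value) sample on or before t,
    or None if there is none within the assumed 5-minute lookback."""
    best = None
    for ts, value in samples:  # samples are sorted by timestamp
        if t - lookback <= ts <= t:
            best = value
    return best


def subquery(samples, end, window, step):
    """Evaluate the series at multiples of step within [end - window, end]."""
    first = -(-(end - window) // step) * step  # round the start up to a step boundary
    points = []
    for t in range(first, end + 1, step):
        v = value_at(samples, t)
        if v is not None:
            points.append((t, v))
    return points


# Hypothetical raw samples, like foo[60s]: scraped every 15 s, peak at t=34.
raw = [(4, 0.0), (19, 0.0), (34, 5.0), (49, 0.0)]

# Like foo[60s:15s] evaluated at t=60.
resampled = subquery(raw, end=60, window=60, step=15)

raw_peak = max(raw, key=lambda p: p[1])
resampled_peak = max(resampled, key=lambda p: p[1])
print(raw_peak)                           # (34, 5.0)
print(resampled_peak)                     # (45, 5.0)
print(resampled_peak[0] - raw_peak[0])    # 11
```

The peak value survives, but it is reported at the first evaluation time after the real sample, here 11 seconds late.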

I've noticed that with a larger resampling interval, like "foo[2d:1h]",
I lose all my peaks. That is understandable now, but the question of
how best to find peaks remains.
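
A small Python simulation shows why a coarse step drops short peaks. Everything here is an assumption made for illustration: the helpers value_at, sampled_max and windowed_max are hypothetical, the series is synthetic (one sample every 15 s for two hours, with a peak lasting about 30 s), and timestamps are integer seconds. The last helper takes the maximum of the raw samples inside each window, which is roughly the role that max_over_time over a range vector plays in PromQL.

```python
def value_at(samples, t, lookback=300):
    """Value of the most recent sample on or before t, within the lookback."""
    best = None
    for ts, value in samples:  # samples are sorted by timestamp
        if t - lookback <= ts <= t:
            best = value
    return best


# Synthetic series: a sample every 15 s for 2 hours, peak of 9.0 near t=1000.
samples = [(ts, 9.0 if 1000 <= ts <= 1030 else 0.0) for ts in range(0, 7201, 15)]


def sampled_max(samples, end, step):
    """Highest value seen when the series is only evaluated every `step` seconds."""
    values = [value_at(samples, t) for t in range(0, end + 1, step)]
    return max(v for v in values if v is not None)


def windowed_max(samples, end, step):
    """Maximum of the raw samples in each window (t - step, t]."""
    return [
        (t, max(v for ts, v in samples if t - step < ts <= t))
        for t in range(step, end + 1, step)
    ]


print(sampled_max(samples, 7200, 3600))   # 0.0, the peak falls between steps
print(sampled_max(samples, 7200, 15))     # 9.0
print(windowed_max(samples, 7200, 3600))  # [(3600, 9.0), (7200, 0.0)]
```

With a one-hour step only three evaluation times are looked at, and none of them lands on the peak. Aggregating the raw samples within each hour keeps the peak value, at the cost of only knowing which hour it occurred in.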


-- 
Victor Sudakov VAS4-RIPE
http://vas.tomsk.ru/
2:5005/49@fidonet
