On Tuesday, 4 January 2022 at 08:07:14 UTC Victor Sudakov wrote:
> Hello Brian! I don't quite understand why "(foo == N)[2d:1m]" or even
> "(foo == N)[2d:]" is allowed while "(foo == N)[2d]" is not?
foo[2d] is a range vector. It gives you all the individual timestamped
data points belonging to all timeseries for metric "foo" within a time
period from evaluation time T to T-2d.
However, a range selector such as [2d] can *only* be applied to a plain
metric selector (a metric name, optionally with label matchers), not to an
expression. "foo == N" is an expression which generates an instant vector
at some evaluation time T.
The reason for this limitation becomes clear when you consider expressions
which calculate across multiple timeseries, such as
sum(foo)
or
foo / bar
Metrics "foo" and "bar" comprise multiple timeseries, identified by
different label sets. However, within each timeseries, the data points have
their own unique timestamps: the data points in foo{bar="a"} were not
necessarily scraped at the same time as those in foo{bar="b"}.
Therefore, the only possible way to do arithmetic across timeseries is to
pick some arbitrary evaluation time T, take the value of those timeseries
at that same point T, and give the result timestamped with T. A subquery
lets you repeat that across a time window: it scans across the window at
intervals of some step S, repeating the calculation at those times.
What is the value of a timeseries at time T, given that it may not have a
data point at exactly T? It's the value of the most recent data point on
*or before* time T, looking back no more than the staleness window (by
default 5 minutes).
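
If it helps, here is a rough Python model of the two ideas above. It is
only a sketch, not how Prometheus is implemented: the names value_at,
sum_at and subquery are made up for illustration, samples are plain
(timestamp, value) pairs in seconds, the exact behaviour at the edge of
the 5 minute window is an assumption, and the steps simply count back
from T.

```python
from bisect import bisect_right

LOOKBACK = 300  # seconds; models the default 5 minute window


def value_at(samples, t, lookback=LOOKBACK):
    """Value of the newest sample at or before t, or None if absent or too old.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in samples]
    i = bisect_right(timestamps, t)  # index of the first sample strictly after t
    if i == 0:
        return None  # nothing at or before t
    ts, value = samples[i - 1]
    if t - ts > lookback:  # assumption: older than the window means no value
        return None
    return value


def sum_at(series, t):
    """Like sum(foo) at one evaluation time t: add up each series' value at t."""
    present = [v for v in (value_at(s, t) for s in series) if v is not None]
    return sum(present) if present else None


def subquery(series, end, window, step):
    """Like sum(foo)[window:step]: repeat sum_at at end, end-step, ... (simplified)."""
    times = sorted(end - k * step for k in range(window // step + 1))
    return [(t, sum_at(series, t)) for t in times]


# Two series of the same metric, scraped at different moments.
foo_a = [(0, 1), (15, 2), (30, 3)]
foo_b = [(7, 10), (22, 20)]

result = subquery([foo_a, foo_b], end=30, window=30, step=10)
print(result)  # [(0, 1), (10, 11), (20, 12), (30, 23)]
```

Note that none of the output timestamps (0, 10, 20, 30) needs to match a
scrape time in either series: each result is stamped with the evaluation
time, not with the time of the underlying data points.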
>
> >
> > What this does is evaluate the expression foo == N at the current
> > time T, at time T-1m, at time T-2m etc. In the results, this won't
> > give you the *exact* time that the data point occurred: it will give
> > you a timestamp of T-Nm, which will be up to 1 minute after the
> > timestamp of the point itself. (The value of a timeseries at time T
> > is the value of the most recent data point on or before time T).
>
> Sounds fine with me if it does not skip/hide peaks but shows the time
> nearest to the peak. Does it?
It won't be the time "nearest" the peak, but the first subquery evaluation
time at or *after* the peak. That is, if you have a 15 second step in your
subquery, and a 15 second scrape interval, then the timestamp could be up
to 14.99 seconds after the event.
You can prove this to yourself by comparing the timestamps of
foo[2d]
foo[2d:15s]
Look for the corresponding peaks / data points.
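
To make the difference concrete without a running server, here is a toy
Python comparison of the two forms. Everything in it is assumed for
illustration: the sample data, the function names range_vector, subquery
and peak_time, and the simplification that steps start at time 0. It is
not Prometheus code.

```python
# Hypothetical series scraped every 15 s, 4 s after each step boundary.
samples = [(4, 0), (19, 0), (34, 9), (49, 0)]  # (timestamp, value); peak at 34


def range_vector(samples, start, end):
    """Like foo[window]: the raw samples, keeping their own scrape timestamps."""
    return [(ts, v) for ts, v in samples if start <= ts <= end]


def subquery(samples, start, end, step, lookback=300):
    """Like foo[window:step]: one point per step, stamped with the step time."""
    points = []
    for t in range(start, end + 1, step):
        # Values of samples at or before t that are still inside the lookback.
        visible = [v for ts, v in samples if t - lookback < ts <= t]
        if visible:
            points.append((t, visible[-1]))  # newest visible sample wins
    return points


def peak_time(points):
    """Timestamp attached to the largest value."""
    return max(points, key=lambda p: p[1])[0]


raw = range_vector(samples, 0, 60)
stepped = subquery(samples, 0, 60, 15)

print(peak_time(raw), peak_time(stepped))  # 34 45
```

The peak is not lost, but in the stepped output it is reported at 45, the
first step at or after the real sample at 34, so it appears 11 seconds
late here and always less than one step late in this model.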
--
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/ed91d953-31cc-45bc-9ed2-4e8f57aa1c69n%40googlegroups.com.