On Friday, 9 December 2022 at 07:31:32 UTC [email protected] wrote:
> expression:
> windows_mscluster_resourcegroup_state {name!~"Available Storage"} != 0 or
> on() vector(0)
>
> The alert goes off non-stop.
>
Yes, that's correct.
PromQL expressions don't work like normal boolean expressions. They return
the presence or absence of values, not a true or false value. The presence
of *any* value will trigger an alert, and vector(0) generates a value all
of the time.
For example, suppose you have 5 timeseries for the metric
"node_filesystem_avail_bytes".
The PromQL expression "node_filesystem_avail_bytes" returns an instant
vector containing 5 values.
The PromQL expression "node_filesystem_avail_bytes < 10000000" returns an
instant vector containing between 0 and 5 values; you have filtered down to
just those timeseries whose values are less than the threshold.
If you use this as an alerting expression, then if the instant vector is
not empty, i.e. if 1 or more machines have a value less than the threshold,
then an alert is generated.
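
To make the "filter, then alert if anything is left" idea concrete, here is a toy Python model. It is only a sketch, not Prometheus code: the instance names and values are invented, and "labels", "less_than" and "alert_fires" are hypothetical helpers.

```python
def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())

# Toy instant vector: label set -> sample value (5 made-up series).
avail = {
    labels(instance="host1"): 250_000_000.0,
    labels(instance="host2"): 4_000_000.0,
    labels(instance="host3"): 80_000_000.0,
    labels(instance="host4"): 9_999_999.0,
    labels(instance="host5"): 10_000_000.0,
}

def less_than(vector, threshold):
    """Model `vector < threshold`: keep the matching series, drop the rest."""
    return {ls: v for ls, v in vector.items() if v < threshold}

def alert_fires(vector):
    """An alerting expression fires when it returns any series at all."""
    return len(vector) > 0

low = less_than(avail, 10_000_000)
print(len(low), alert_fires(low))            # 2 True
print(alert_fires(less_than(avail, 1_000)))  # False
```

Note that the comparison never produces "true" or "false"; it only makes the vector smaller, possibly empty.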
> How can I set the metric to send an alert when the value is different
> from 0 and is null?
>
There is no concept of "null" in PromQL. (Well, you can store a floating
point value of "NaN" in a timeseries, but that's not what we're discussing
here).
Either a timeseries is present, or it is not.
Hence I'm not really sure what you're trying to alert on. What do your
metrics look like?
Let me guess: they look something like this:
windows_mscluster_resourcegroup_state{instance="foo",name="Available Storage"} 123
windows_mscluster_resourcegroup_state{instance="foo",name="Broken Storage"} 0
windows_mscluster_resourcegroup_state{instance="bar",name="Available Storage"} 0
windows_mscluster_resourcegroup_state{instance="bar",name="Broken Storage"} 4
Now, this alerting expression:
windows_mscluster_resourcegroup_state {name!~"Available Storage"} != 0
will only alert on the last one of these: it first keeps the timeseries
whose "name" label is not "Available Storage", then keeps those whose value
is not 0, and only the fourth metric shown matches both conditions.
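
The same two-step filtering can be sketched in Python against the four guessed metrics above. This is a toy model only; "labels", "name_not_matching" and "not_equal" are hypothetical helpers, and the sample values are the guessed ones, not real data.

```python
import re

def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())

# The four guessed series: label set -> sample value.
state = {
    labels(instance="foo", name="Available Storage"): 123.0,
    labels(instance="foo", name="Broken Storage"): 0.0,
    labels(instance="bar", name="Available Storage"): 0.0,
    labels(instance="bar", name="Broken Storage"): 4.0,
}

def name_not_matching(vector, pattern):
    """Model the matcher name!~"pattern"; PromQL regexes are fully anchored."""
    rx = re.compile(pattern)
    return {ls: v for ls, v in vector.items()
            if not rx.fullmatch(dict(ls).get("name", ""))}

def not_equal(vector, scalar):
    """Model `vector != scalar`: keep only series whose value differs."""
    return {ls: v for ls, v in vector.items() if v != scalar}

result = not_equal(name_not_matching(state, "Available Storage"), 0)
for ls, v in result.items():
    print(sorted(ls), v)
```

Only the instance="bar", name="Broken Storage" series survives both filters, so that is the only one which would alert.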
Similarly, "or" works differently to what you might expect.
foo or bar
will return a union of:
- all timeseries with metric name "foo", PLUS:
- all those timeseries with metric name "bar" which *don't* have exactly
the same label set (ignoring the metric name itself) as some timeseries on
the LHS (foo)
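Since vector(0) has no labels, but the expression you gave on your LHS has
labels, a plain "or vector(0)" will *always* include vector(0) in the result
set. With "on()" as you wrote it, vector(0) is dropped whenever the LHS
returns something, but it is still returned whenever the LHS is empty. Either
way the result is never empty, and therefore it will always generate alerts.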
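
Here is a rough Python model of "or", with and without "on()", to show why the result can never be empty once vector(0) is on the right-hand side. It is a sketch under assumptions: label sets exclude the metric name, "vector_or" is a hypothetical helper, and the "Broken Storage" series is the guessed one from above.

```python
def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())

def vector_or(lhs, rhs, on=None):
    """Model `lhs or rhs` (on=None) and `lhs or on(<labels>) rhs`.

    Keeps every LHS series, plus each RHS series whose matching key
    does not occur among the LHS matching keys.
    """
    def key(ls):
        if on is None:
            return ls  # match on the full label set
        return frozenset((k, v) for k, v in ls if k in on)

    lhs_keys = {key(ls) for ls in lhs}
    result = dict(lhs)
    for ls, v in rhs.items():
        if key(ls) not in lhs_keys:
            result[ls] = v
    return result

vector_0 = {labels(): 0.0}  # vector(0): one series, no labels, value 0
broken = {labels(instance="bar", name="Broken Storage"): 4.0}
healthy = {}  # the `!= 0` filter matched nothing

for name, lhs in (("broken", broken), ("healthy", healthy)):
    plain = vector_or(lhs, vector_0)
    with_on = vector_or(lhs, vector_0, on=())
    print(name, len(plain), len(with_on))
# broken 2 1
# healthy 1 1
```

In the model, "on()" only changes *which* series come back: the real ones when something is wrong, vector(0) when nothing is. Every count is at least 1, so an alert built on this expression fires in all cases.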
The question is, what sort of "missing" values do you want to look for?
For example, are you trying to alert on instance "baz", which doesn't
generate *any* values for windows_mscluster_resourcegroup_state ? If so,
you either need to alert explicitly on this absence, or you need to
cross-reference to some other timeseries which refers to "baz" (such a
timeseries is often "up"). Otherwise, the PromQL expression for
windows_mscluster_resourcegroup_state has no way of knowing that you
*expect* a value for baz, but there isn't one.
So one possibility is:
absent(windows_mscluster_resourcegroup_state{instance="baz",name="Available Storage"})
which will alert explicitly if there is no timeseries with that metric name
and those particular labels. But you've hard-coded the existence of a
machine called "baz" into your alerting rules.
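
As a toy illustration of what absent() does with equality matchers, here is a small Python sketch. It is a simplified model, not the real implementation: "absent" below is a hypothetical helper that only handles label="value" matchers, and the series are the guessed ones from earlier.

```python
def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())

state = {
    labels(instance="foo", name="Available Storage"): 123.0,
    labels(instance="bar", name="Available Storage"): 0.0,
}

def absent(vector, **matchers):
    """Model absent(metric{k="v",...}) for equality matchers only.

    Returns an empty vector if any series matches; otherwise a single
    series with value 1, labelled with the matchers themselves.
    """
    wanted = set(matchers.items())
    if any(wanted.issubset(ls) for ls in vector):
        return {}
    return {labels(**matchers): 1.0}

print(len(absent(state, instance="baz", name="Available Storage")))  # 1
print(len(absent(state, instance="foo", name="Available Storage")))  # 0
```

The first call returns one series, so it would alert; the second returns nothing, so it would not.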
Or are you trying to alert on any node which is being scraped by scrape job
"windows_exporter" but is not returning
windows_mscluster_resourcegroup_state with a particular label? The "up"
metric tells you whether something is being scraped, so the expression
might be along the lines of "... or on (instance) up"
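
One way to picture the cross-reference against "up" is sketched below in Python. Note the assumption: this models the "unless on (instance)" operator rather than "or", which is a swapped-in choice for illustration and not necessarily the expression you will end up with. "unless_on" is a hypothetical helper and the instances are made up.

```python
def labels(**kv):
    """Build a hashable label set from keyword arguments."""
    return frozenset(kv.items())

def unless_on(lhs, rhs, on):
    """Model `lhs unless on(<labels>) rhs`.

    Drops every LHS series that has a partner in RHS with the same
    values for the labels listed in `on`.
    """
    def key(ls):
        return frozenset((k, v) for k, v in ls if k in on)

    rhs_keys = {key(ls) for ls in rhs}
    return {ls: v for ls, v in lhs.items() if key(ls) not in rhs_keys}

# Three scraped targets (made up), only two of which return the metric.
up = {
    labels(instance="foo", job="windows_exporter"): 1.0,
    labels(instance="bar", job="windows_exporter"): 1.0,
    labels(instance="baz", job="windows_exporter"): 1.0,
}
available = {
    labels(instance="foo", name="Available Storage"): 123.0,
    labels(instance="bar", name="Available Storage"): 0.0,
}

missing = unless_on(up, available, on=("instance",))
print(sorted(dict(ls)["instance"] for ls in missing))  # ['baz']
```

Because "up" carries the list of instances you expect, nothing about "baz" needs to be hard-coded in the rule.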
If you show the *actual* metrics you are scraping (including the full label
sets), and an example of an *actual* condition you are trying to catch,
then we can help you write the expression.
For more hints:
https://www.robustperception.io/absent-alerting-for-jobs/
https://www.robustperception.io/existential-issues-with-metrics/
https://www.robustperception.io/staleness-and-promql/
https://www.robustperception.io/functions-to-avoid/