On Friday, 9 December 2022 at 07:31:32 UTC [email protected] wrote:

> expression:
> windows_mscluster_resourcegroup_state {name!~"Available Storage"} != 0 or on() vector(0)
>
> The alert goes off non-stop.
>

Yes, that's correct.

PromQL expressions don't work like normal boolean expressions.  They return 
the presence or absence of values, not a true or false value.  The presence 
of *any* value will trigger an alert, and vector(0) generates a value all 
of the time.

For example, suppose you have 5 timeseries for the metric 
"node_filesystem_avail_bytes".

The PromQL expression "node_filesystem_avail_bytes" returns an instant 
vector containing 5 values.

The PromQL expression "node_filesystem_avail_bytes < 10000000" returns an 
instant vector containing between 0 and 5 values; you have filtered down to 
just those timeseries whose values are less than the threshold.

If you use this as an alerting expression, then if the instant vector is 
not empty, i.e. if 1 or more machines have a value less than the threshold, 
then an alert is generated.
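
To make the filtering behaviour concrete, here is a small sketch contrasting the plain comparison with the "bool" modifier. The two expressions are meant to be run separately, and the threshold is just the illustrative one from above.

```promql
# Filtering comparison: returns only the timeseries whose value is below
# the threshold (between 0 and 5 results in the example above).
# An empty result means no alert.
node_filesystem_avail_bytes < 10000000

# With the "bool" modifier the comparison no longer filters: every input
# timeseries is returned, with value 1 (true) or 0 (false).
# Used on its own as an alerting expression, this would fire for all 5
# timeseries, because a value of 0 is still a value.
node_filesystem_avail_bytes < bool 10000000
```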

 

>  How can I set the metric to send an alert when the value is different 
> from 0 and is null?
>

There is no concept of "null" in PromQL.  (Well, you can store a floating 
point value of "NaN" in a timeseries, but that's not what we're discussing 
here).

Either a timeseries is present, or it is not.
 
Hence I'm not really sure what you're trying to alert on.  What do your 
metrics look like?

Let me guess: they look something like this.

windows_mscluster_resourcegroup_state{instance="foo",name="Available Storage"} 123
windows_mscluster_resourcegroup_state{instance="foo",name="Broken Storage"} 0
windows_mscluster_resourcegroup_state{instance="bar",name="Available Storage"} 0
windows_mscluster_resourcegroup_state{instance="bar",name="Broken Storage"} 4

Now, this alerting expression:

windows_mscluster_resourcegroup_state {name!~"Available Storage"} != 0

will only alert on the last one of these: it first filters to timeseries 
whose "name" label is not "Available Storage", then filters to values which 
are not 0, and only the fourth metric shown matches both conditions.

Similarly, "or" works differently to what you might expect.

foo or bar

will return a union of:
- all timeseries with metric name "foo", PLUS:
- all those timeseries with metric name "bar" whose label sets (ignoring 
the metric name) *don't* exactly match the label set of any timeseries on 
the LHS (foo)

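Your expression uses "or on() vector(0)".  With on(), no labels are 
compared, so vector(0) is dropped whenever the LHS returns at least one 
timeseries, and is returned whenever the LHS is empty.  Either way the 
result set is never empty, and therefore the expression will *always* 
generate alerts.  (Without the on(), vector(0) has no labels but your LHS 
does, so vector(0) would be included in the result every time.)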
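
As a rough illustration against the guessed sample metrics above (these are not your real data), the two cases look like this. The expressions are meant to be run separately.

```promql
# Case 1: the LHS matches something (instance="bar", name="Broken Storage",
# value 4). The result contains that timeseries, so the alert fires.
# Case 2: the LHS matches nothing. The result is the single label-less
# element from vector(0), so the alert fires anyway.
windows_mscluster_resourcegroup_state{name!~"Available Storage"} != 0 or on() vector(0)

# Dropping the "or on() vector(0)" part leaves an expression which is empty
# when every resource group state is 0, so it alerts only in case 1.
windows_mscluster_resourcegroup_state{name!~"Available Storage"} != 0
```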

The question is, what sort of "missing" values do you want to look for?

For example, are you trying to alert on instance "baz", which doesn't 
generate *any* values for windows_mscluster_resourcegroup_state ?  If so, 
you either need to alert explicitly on this absence, or you need to 
cross-reference to some other timeseries which refers to "baz" (such a 
timeseries is often "up").  Otherwise, the PromQL expression for 
windows_mscluster_resourcegroup_state has no way of knowing that you 
*expect* a value for baz, but there isn't one.

So one possibility is:

absent(windows_mscluster_resourcegroup_state{instance="baz",name="Available Storage"})

which will alert explicitly if there is no timeseries with that metric name 
and those particular labels.  But you've hard-coded the existence of a 
machine called "baz" into your alerting rules.

Or are you trying to alert on any node which is being scraped by scrape job 
"windows_exporter" but is not returning 
windows_mscluster_resourcegroup_state with a particular label?  The "up" 
metric tells you whether something is being scraped, so the expression 
might be along the lines of "... or on (instance) up"
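
One way to sketch that cross-reference is with "unless" rather than "or". This is only a guess at what you want: the job label value "windows_exporter" and the name label "Available Storage" are assumptions taken from the examples above, not from your real metrics.

```promql
# Start from every target in the (assumed) job that is being scraped
# successfully, then remove those instances which do return the metric
# with the expected label. Whatever is left is an instance that is up
# but is missing the metric.
up{job="windows_exporter"} == 1
  unless on (instance)
windows_mscluster_resourcegroup_state{name="Available Storage"}
```

Unlike the absent() approach, this doesn't hard-code any instance names, since the list of expected instances comes from "up".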

If you show the *actual* metrics you are scraping (including the full label 
sets), and an example of an *actual* condition you are trying to catch, 
then we can help you write the expression.

For more hints:

https://www.robustperception.io/absent-alerting-for-jobs/
https://www.robustperception.io/existential-issues-with-metrics/
https://www.robustperception.io/staleness-and-promql/
https://www.robustperception.io/functions-to-avoid/
