Hi all.
A few months ago we introduced target-down rules to keep track of targets
that were missing. The rules are relatively simple, for example:
alert: target_down_slower_scraping_jobs
expr: up{job=~"monitoring-scripts-5m|monitoring-scripts-hourly"} == 0
for: 13m
labels:
  severity: average
annotations:
  # annotations here
A few days ago we also introduced absence rules, for both targets and
metrics. That works, but with a side effect we had not considered: a
metric-absent alert will of course also fire when the corresponding target
is down. Looking into it, I found this blog post
<https://www.robustperception.io/absent-alerting-for-scraped-metrics> proposing
the unless binary operator, but I'm not sure I've understood its usage
and its implications.
As I understand it, unless returns the series on the left-hand side for
which there is no matching series on the right-hand side.
If I write something like
expr: up{job="node"} == 1 unless absent(check_success{check="xxxxx",stack="yyy",environment="zzz"})
I would just return the upness when everything is fine with the node.
Isn't that wrong? I mean, that would result in an alert because the node is
up, which is not what we want. Even changing that to 0 would not solve the
problem, since we would still return the absence. Maybe changing to zero and
inverting the two operands? But then wouldn't I get duplicate alerts for the
upness?
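To make the "change to zero and invert" idea concrete, this is the sketch I have in mind. The alert name is made up and the for: value is simply copied from our target-down rule, so treat both as assumptions. As far as I can tell from the docs, absent() returns a single series carrying only the labels from its equality matchers (check, stack, environment), so it shares no labels with up and the two sides would never match without on ().

```yaml
# Sketch only: "check_success_absent" is a hypothetical alert name.
- alert: check_success_absent
  # on () with an empty label list makes "unless" ignore labels entirely,
  # so the absence is dropped whenever at least one node target is down.
  expr: |
    absent(check_success{check="xxxxx", stack="yyy", environment="zzz"})
      unless on ()
    (up{job="node"} == 0)
  for: 13m
  labels:
    severity: average
```

If I read this correctly, the expression returns only the absent() series and never the up series, so it would not duplicate the target-down alert. The drawback is that a single down target in the job would hide the absence for all of them.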
Is there a way to make sure absence rules take down targets into account? Or
should I approach the issue in some other way that I'm not considering
now?
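The other variant I can think of drops absent() and puts the metric itself on the right-hand side of unless. This is only a sketch: the alert name is made up, and it assumes check_success is scraped from the node targets, so that it carries the same job and instance labels as up.

```yaml
# Sketch only: hypothetical alert name; assumes check_success has the same
# job and instance labels as up{job="node"}.
- alert: check_success_missing_on_up_target
  # Keep every target that is up, then drop those that still expose
  # check_success. What remains are up targets without the metric.
  expr: |
    up{job="node"} == 1
      unless on (job, instance)
    check_success{check="xxxxx", stack="yyy", environment="zzz"}
  for: 13m
  labels:
    severity: average
```

Here a down target never appears on the left-hand side, so it would be left to the target-down rule alone, and the result is per target rather than one global absence.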
Thanks in advance,
F.