Is anyone able to share some worked examples based on Brian's blog post
(https://www.robustperception.io/using-time-series-as-alert-thresholds),
specifically for setting differing disk space thresholds and
alerting on them?
I'm struggling to apply the article; perhaps I am finding the concept a
little abstract.
I have tried to construct the example shown below, but it doesn't work:
the recording rules fail with the error "vector contains metrics with the
same labelset after applying rule labels".
I have three example systems. One should be monitored using the "default"
threshold in the alert rule definition (instance: pi4-1.home:9100); the
other two should have their thresholds set via recording rules (instances:
pi4-2.home:9100 and pi4-3.home:9100). I want to set thresholds per
instance. Am I correct in thinking that I need one rule per instance, since
I am setting the override on the instance label?
Alert:
- alert: HostOutOfDiskSpace
  expr: |
    # Alert on per instance thresholds, with a default
    (node_filesystem_avail_bytes{mountpoint="/"} * 100) /
    node_filesystem_size_bytes{mountpoint="/"}
    < on (instance) group_left()
    (
      node_filesystem_threshold
      or on(instance)
      count by (instance)(node_filesystem_avail_bytes{mountpoint="/"}
      * 100) / node_filesystem_size_bytes{mountpoint="/"} * 0 + 70
    )
  for: 5s
  labels:
    severity: critical
    notification: slack
  annotations:
    summary: "{{ $labels.alertname }} on {{ $labels.instance }}"
    description: "Disk is almost full {{ humanize $value }}% on {{ $labels.mountpoint }}"
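One likely problem in the expression above is the default branch. The result of `count by (instance)(...)` carries only the `instance` label, so dividing it by `node_filesystem_size_bytes` (which also has labels such as `device` and `fstype`) should find no matching series, and the branch would then return nothing. A minimal sketch of the same alert with a default branch that avoids the division, assuming the default of 70 is kept and that `node_filesystem_threshold` carries only an `instance` label:

```yaml
- alert: HostOutOfDiskSpace
  expr: |
    # Percentage of space still available on the root filesystem
      node_filesystem_avail_bytes{mountpoint="/"} * 100
        / node_filesystem_size_bytes{mountpoint="/"}
    < on (instance) group_left()
      (
        # Per-instance override, where one has been recorded
        node_filesystem_threshold
      or on (instance)
        # Default of 70 for every instance without an override
        count by (instance) (node_filesystem_size_bytes{mountpoint="/"}) * 0 + 70
      )
  for: 5s
  # labels and annotations unchanged from the rule above
```

The `count by (instance) (...) * 0 + 70` part only serves to produce one series per instance with the value 70; `or on (instance)` then keeps that value only for instances that have no `node_filesystem_threshold` series.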
Recording rules:
groups:
  - name: example
    rules:
      - record: node_filesystem_threshold
        expr: (node_filesystem_avail_bytes{mountpoint="/"} * 100) /
          node_filesystem_size_bytes{mountpoint="/"} < 90
        labels:
          instance: pi4-2.home:9100
      - record: node_filesystem_threshold
        expr: (node_filesystem_avail_bytes{mountpoint="/"} * 100) /
          node_filesystem_size_bytes{mountpoint="/"} < 90
        labels:
          instance: pi4-3.home:9100
This gives the error below:
[image: recording_rule_capture.PNG]
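The error is consistent with each rule expression returning one series per host: once the `instance` label is overwritten with a single value, several samples can end up with identical label sets. A minimal sketch of the recording rules with the threshold itself as the expression, assuming 90 is the intended override for both hosts and using `vector(90)` to produce a single label-less sample (a plain literal may work too, but that is an assumption here):

```yaml
groups:
  - name: example
    rules:
      # One rule per overridden instance; the expression is only the threshold value.
      - record: node_filesystem_threshold
        expr: vector(90)
        labels:
          instance: pi4-2.home:9100
      - record: node_filesystem_threshold
        expr: vector(90)
        labels:
          instance: pi4-3.home:9100
```

With this shape, each rule yields exactly one `node_filesystem_threshold` series whose only label (besides the metric name) is `instance`, so there is still one rule per overridden instance, but the rules no longer depend on the filesystem metrics.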
If anyone is able to help me build out/correct this example I would be most
grateful.
Thanks.