I have multiple datacenters, each running its own Prometheus instance. During
maintenance I'd like to inhibit all alerts in datacenter A, and only those
alerts in datacenter B that are related to A being down. The scenario is
avoiding alerts about broken DB replication between datacenters.
I'm thinking about creating these rules in each DC:
- alert: MaintenanceMode
  expr: maintenance_mode == 1
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: This is a maintenance alert for {{ $labels.instance }}.
- alert: SatelliteMaintenanceMode
  expr: maintenance_mode == 2
  for: 1m
  labels:
    severity: warning
  annotations:
    summary: This is a satellite maintenance alert for {{ $labels.instance }}.
Then, in the Alertmanager inhibit_rules section, I would try:
- source_match:
    alertname: MaintenanceMode
  target_match_re:
    severity: 'warning|critical'
- source_match:
    alertname: SatelliteMaintenanceMode
  target_match_re:
    alertname: 'MySQLReplicationNotRunning'
    channel_name: .*
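Note that neither rule above has an equal clause, so if both datacenters report to one shared Alertmanager, a MaintenanceMode alert from A would also inhibit warning and critical alerts from B. A minimal sketch of scoping the rules per datacenter, assuming each Prometheus attaches an external label named datacenter (the label name is an assumption, not part of the current setup):

```yaml
inhibit_rules:
  - source_match:
      alertname: MaintenanceMode
    target_match_re:
      severity: 'warning|critical'
    # Inhibit only targets whose datacenter label has the same value
    # as the source alert's. "datacenter" is an assumed external label.
    equal: ['datacenter']
  - source_match:
      alertname: SatelliteMaintenanceMode
    target_match_re:
      alertname: 'MySQLReplicationNotRunning'
    equal: ['datacenter']
```

With a separate Alertmanager per datacenter this scoping would probably be unnecessary, since each one only ever sees its own alerts.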
The metric would be set exclusively, so that the MaintenanceMode alert fires
in only one datacenter at a time. My concern is: should both rules use the
same metric to trigger the inhibition, or should I create separate metrics?
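To make the second option concrete, here is a rough sketch of the same two rules driven by separate metrics. The name satellite_maintenance_mode is hypothetical and used only for illustration; annotations are omitted for brevity:

```yaml
# Sketch only: one metric per maintenance mode instead of one metric
# with two values. satellite_maintenance_mode is a hypothetical name.
- alert: MaintenanceMode
  expr: maintenance_mode == 1
  for: 1m
  labels:
    severity: warning
- alert: SatelliteMaintenanceMode
  expr: satellite_maintenance_mode == 1
  for: 1m
  labels:
    severity: warning
```

With a single metric, the two modes are mutually exclusive per time series, because a series holds only one value at a time. With two metrics, both alerts could fire together unless whatever sets the metrics prevents it.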