Hi,
We are using blackbox exporter at a remote location to monitor gateway
routers, hypervisors and virtual machines (router -> hypervisor -> virtual
machines). We would like alerts to be suppressed along that dependency chain,
as in the examples below.
*Example 1:*
If a gateway router is down and its alert is firing, Alertmanager should stop
alerting on the hypervisor hosts and servers behind it.
*Example 2:*
If a hypervisor is down, Alertmanager should not alert on its virtual machines.
In Prometheus, we scrape the routers in one job, the hypervisors in another
job, and the virtual machines in a third job.
*Example:*
job_name: 'blackbox_icmp-routers'
job_name: 'blackbox_icmp-hypervisors'
job_name: 'blackbox_icmp-virtualmachines'
The Prometheus alerting rules are defined per job:
- name: RouterDown
  rules:
  - alert: R-InstanceDown
    expr: probe_success{job="blackbox_icmp-routers"} == 0
    for: 1m
- name: HypervisorDown
  rules:
  - alert: H-InstanceDown
    expr: probe_success{job="blackbox_icmp-hypervisors"} == 0
    for: 1m
- name: VirtualMachinesDown
  rules:
  - alert: V-InstanceDown
    expr: probe_success{job="blackbox_icmp-virtualmachines"} == 0
    for: 1m
The Alertmanager config is as follows:
route:
  group_by: ['alertname']
  receiver: ms-teams
  repeat_interval: 5m
receivers:
- name: ms-teams
  webhook_configs:
  - url: 'http://monitoring:2000/alertmanager'
    send_resolved: false
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']
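To make the goal concrete, here is an untested sketch of the kind of inhibit
rules we have in mind. It assumes two labels that do not exist in our setup
today: a hypothetical `site` label on every target, and a hypothetical
`hypervisor` label set to the same value on each hypervisor target and on the
virtual machines it hosts. We are not sure this is the right approach.

```yaml
inhibit_rules:
# Router down: mute hypervisor and VM alerts at the same site.
# 'site' is a hypothetical label that every target would need to carry.
- source_match:
    alertname: 'R-InstanceDown'
  target_match_re:
    alertname: '(H|V)-InstanceDown'
  equal: ['site']
# Hypervisor down: mute alerts for the VMs running on it.
# 'hypervisor' is a hypothetical label shared by a hypervisor and its VMs.
- source_match:
    alertname: 'H-InstanceDown'
  target_match:
    alertname: 'V-InstanceDown'
  equal: ['hypervisor']
```

As far as we understand, the labels listed in `equal` must have the same value
on the source and target alerts for the inhibition to apply, which is why the
extra labels would be needed to tie each machine to its parent.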
Any help is much appreciated.
Thanks
Sandosh