Prometheus (version 2.11.0)
Prometheus rules:
“””
- alert: ServiceQualityDecline
  expr: (min(collectd_link_e2e_score) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) -
    min(collectd_link_e2e_score{} offset 5m) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName))
    /min(collectd_link_e2e_score offset 5m) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) > 0.6
  for: 2m
  labels:
    severity: Emergency
  annotations:
    summary: "{{ $labels.neId }}: service quality has declined more than 60%."
    description: "{{ $labels.neId }}: E2E score of {{ $labels.link }} is declined."
- alert: ServiceQualityDecline
  expr: (min(collectd_link_e2e_score) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) -
    min(collectd_link_e2e_score{} offset 5m) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName))
    /min(collectd_link_e2e_score offset 5m) by
    (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) > 0.3
  for: 2m
  labels:
    severity: Emergency
  annotations:
    summary: "{{ $labels.neId }}: service quality has declined more than 30%."
    description: "{{ $labels.neId }}: E2E score of {{ $labels.link }} is declined."
“””
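For context on the receiving side, here is a minimal sketch of an Alertmanager webhook receiver. The receiver name and URL are hypothetical placeholders, not the configuration actually in use here. The only point it illustrates is that `send_resolved` (which defaults to true for webhook receivers) controls whether resolved notifications are sent to the webhook at all.

```yaml
# Hypothetical Alertmanager fragment; receiver name and URL are placeholders.
route:
  receiver: example-webhook
receivers:
  - name: example-webhook
    webhook_configs:
      - url: http://127.0.0.1:5001/alerts  # assumed endpoint
        send_resolved: true                # notify the webhook when an alert resolves
```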
On Wednesday, December 23, 2020 at 5:47:41 PM UTC+8, <赵坏蛋> wrote:
> Most rules fire and resolve normally, but for some alerts only the firing
> notification is received and the resolved notification never arrives.
> I have confirmed that the alerts are resolved in both Prometheus and
> Alertmanager, yet the webhook did not receive the resolved notification
> from Alertmanager.
>
> Please help confirm whether this is a configuration problem or a bug.
> Thanks!
>