Alertmanager config:
"""
resolve_timeout: 2m
routes:
- match:
    env: st
  receiver: 'aiwan_alert'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 24h
  continue: true
receivers:
- name: 'aiwan_alert'
  webhook_configs:
  - url: 'http://xxxxxxxx:8001/api/v1/mgr/alerts'
    send_resolved: true
"""
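
To narrow this down on the receiving side, something like the following could be used to log which alerts in each webhook POST are marked as resolved. This is only a sketch: it assumes the documented Alertmanager webhook JSON layout (a top-level "alerts" list whose entries carry "status" and "labels"), and the function name summarize_webhook and the sample payload are hypothetical, made up for illustration.

```python
import json


def summarize_webhook(body: str) -> dict:
    """Group the alert names in one Alertmanager webhook payload by status.

    Assumes the documented webhook layout: a top-level "alerts" list whose
    entries each have a "status" ("firing" or "resolved") and "labels".
    """
    payload = json.loads(body)
    summary = {"firing": [], "resolved": []}
    for alert in payload.get("alerts", []):
        status = alert.get("status", "unknown")
        name = alert.get("labels", {}).get("alertname", "<no alertname>")
        # setdefault keeps any unexpected status value visible instead of dropping it
        summary.setdefault(status, []).append(name)
    return summary


if __name__ == "__main__":
    # Hypothetical sample payload, not captured from a real system.
    sample = json.dumps({
        "status": "resolved",
        "alerts": [
            {"status": "resolved",
             "labels": {"alertname": "ServiceQualityDecline", "env": "st"}},
        ],
    })
    print(summarize_webhook(sample))
```

Calling this on the raw request body in the webhook handler and logging the result would show whether a resolved notification ever reaches the endpoint for the affected alerts.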
On Wednesday, December 23, 2020 at 5:53:02 PM UTC+8, 赵坏蛋 wrote:
> Prometheus (version 2.11.0)
> Prometheus rules:
> """
> - alert: ServiceQualityDecline
>   expr: (min(collectd_link_e2e_score) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) -
>     min(collectd_link_e2e_score{} offset 5m) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName))
>     / min(collectd_link_e2e_score offset 5m) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) > 0.6
>   for: 2m
>   labels:
>     severity: Emergency
>   annotations:
>     summary: "{{ $labels.neId }}: service quality has declined more than 60%."
>     description: "{{ $labels.neId }}: E2E score of {{ $labels.link }} is declined."
> - alert: ServiceQualityDecline
>   expr: (min(collectd_link_e2e_score) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) -
>     min(collectd_link_e2e_score{} offset 5m) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName))
>     / min(collectd_link_e2e_score offset 5m) by
>     (hostname,env,bond,companyId,siteName,neId,deviceId,dstNe,companyName) > 0.3
>   for: 2m
>   labels:
>     severity: Critical
>   annotations:
>     summary: "{{ $labels.neId }}: service quality has declined more than 30%."
>     description: "{{ $labels.neId }}: E2E score of {{ $labels.link }} is declined."
> """
>
> On Wednesday, December 23, 2020 at 5:47:41 PM UTC+8, 赵坏蛋 wrote:
>
>> For most rules, both the firing and the resolved notifications arrive
>> normally, but for some alerts the webhook receives only the firing
>> notification and never the resolved one. I have confirmed that these
>> alerts are resolved in both Prometheus and Alertmanager, yet the webhook
>> did not receive the resolved notification from Alertmanager.
>>
>> Please help confirm whether this is a configuration problem or a bug.
>> Thanks!
>>
>