First of all, thanks for your answer.

Scraping the Alertmanager is an interesting idea. However, although it is
rather unlikely, Prometheus might be able to scrape the Alertmanager and
still fail to send alerts to it.

In the meantime, I found another approach online that should be more
reliable:

- alert: PrometheusErrorSendingAlertsToSomeAlertmanagers
  annotations:
    description: '{{ printf "%.1f" $value }}% errors while sending alerts from Prometheus
      {{$labels.instance}} to Alertmanager {{$labels.alertmanager}}.'
    summary: Prometheus has encountered more than 1% errors sending alerts to a specific Alertmanager.
  expr: |
    (
      rate(prometheus_notifications_errors_total{job="prometheus"}[5m])
    /
      rate(prometheus_notifications_sent_total{job="prometheus"}[5m])
    )
    * 100
    > 1  # This is a percentage.
  for: 15m
  labels:
    severity: critical
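
To make the arithmetic behind that expression concrete, here is a rough
Python sketch of what the rule computes for a single Alertmanager. The
function names and the sample numbers are made up for illustration, and it
approximates rate() as a plain counter increase divided by the window
length, which ignores counter resets and extrapolation. It also does not
model the "for: 15m" clause, which requires the condition to hold
continuously for 15 minutes before the alert fires.

```python
import math


def error_percentage(errors_delta: float, sent_delta: float,
                     window_seconds: float = 300.0) -> float:
    """Approximate rate(errors[5m]) / rate(sent[5m]) * 100.

    errors_delta and sent_delta are the increases of the two counters over
    the window (hypothetical inputs, not read from a real Prometheus).
    """
    errors_rate = errors_delta / window_seconds
    sent_rate = sent_delta / window_seconds
    if sent_rate == 0:
        # Floating-point division as PromQL does it: 0/0 is NaN, x/0 is +Inf.
        return math.nan if errors_rate == 0 else math.inf
    return errors_rate / sent_rate * 100


def exceeds_threshold(percentage: float, threshold: float = 1.0) -> bool:
    """Mirror the '> 1' comparison; NaN never satisfies it."""
    return percentage > threshold


if __name__ == "__main__":
    # 3 failed out of 200 sent in 5 minutes: about 1.5%, above the threshold.
    print(exceeds_threshold(error_percentage(3, 200)))
    # 1 failed out of 200 sent: about 0.5%, below the threshold.
    print(exceeds_threshold(error_percentage(1, 200)))
```

Note that when nothing was sent during the window the ratio is NaN, so the
comparison does not match and the rule stays silent. That case would need
a separate check.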
