On 25/11/2020 11:46, [email protected] wrote:
The alert definition doesn't seem to be the problem here, because it happens for different alerts at random. Below is the alert for an exporter being down, for which it has happened three times today.

  - alert: ExporterDown
    expr: up == 0
    for: 10m
    labels:
      severity: "CRITICAL"
    annotations:
      summary: "Exporter down on *{{ $labels.instance }}*"
      description: "Not able to fetch application metrics from *{{ $labels.instance }}*"

- the ALERTS metric shows what is pending or firing over time
>> But the problem is that one of my ExporterDown alerts has been active for the past 10 days; there is no genuine reason for the alert to go to a resolved state.
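
To confirm how long it has really been firing, you can graph the ALERTS series mentioned above in the Prometheus expression browser, for example:

  ALERTS{alertname="ExporterDown", alertstate="firing"}

Any gap in that series would show when Prometheus itself stopped considering the alert active.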

What do you have evaluation_interval set to in Prometheus, and resolve_timeout in Alertmanager?
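
For reference, both live in the global section of their respective config files; the values below are just the documented defaults, shown as a sketch rather than a recommendation:

  # prometheus.yml
  global:
    evaluation_interval: 1m

  # alertmanager.yml
  global:
    resolve_timeout: 5m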

Is the alert definitely being resolved, i.e. are you getting a resolved email/notification, or could it just be a notification for a long-running alert? You should get another email/notification every now and then based on repeat_interval.
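
repeat_interval is set on the route in the Alertmanager config, something like the sketch below (the receiver name here is made up for illustration; 4h is the default):

  route:
    receiver: team-email
    repeat_interval: 4h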

