On 25/11/2020 11:46, [email protected] wrote:
The alert definition doesn't seem to be the problem here, because it
happens randomly for different alerts. Below is the alert for an
exporter being down, which has fired three times today.
- alert: ExporterDown
  expr: up == 0
  for: 10m
  labels:
    severity: "CRITICAL"
  annotations:
    summary: "Exporter down on *{{ $labels.instance }}*"
    description: "Not able to fetch application metrics from *{{ $labels.instance }}*"
- the ALERTS metric shows which alerts are pending or firing over time
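For example, graphing a query like the following in the Prometheus UI
would show the state history of this particular alert (the label values
here are just illustrative):

    ALERTS{alertname="ExporterDown", alertstate="firing"}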
>> But the problem is that one of my ExporterDown alerts has been active
for the past 10 days; there is no genuine reason for the alert to go to
a resolved state.
What do you have evaluation_interval set to in Prometheus, and
resolve_timeout in Alertmanager?
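For reference, those settings live in prometheus.yml and
alertmanager.yml respectively; the values below are just the documented
defaults, not a recommendation:

    # prometheus.yml
    global:
      evaluation_interval: 1m   # how often alerting/recording rules are evaluated

    # alertmanager.yml
    global:
      resolve_timeout: 5m       # how long Alertmanager waits before declaring
                                # an alert resolved if it stops receiving
                                # updates for it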
Is the alert definitely being resolved, as in you are getting a resolved
email/notification, or could it just be a repeat notification for a
long-running alert? You should get another email/notification every now
and then based on repeat_interval.
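repeat_interval is set per route in the Alertmanager config; a minimal
sketch, with a hypothetical receiver name and the default 4h value:

    # alertmanager.yml
    route:
      receiver: "team-email"    # hypothetical receiver name
      repeat_interval: 4h       # re-send the notification for a
                                # still-firing alert every 4 hours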