Hi Stuart.

On Wed, 25 Nov, 2020, 6:56 pm Stuart Clark, <[email protected]>
wrote:

> On 25/11/2020 11:46, [email protected] wrote:
> > The alert definition doesn't seem to be the problem here, because this
> > happens for different alerts at random. Below is the rule for the exporter
> > being down, for which it has happened three times today.
> >
> >   - alert: ExporterDown
> >     expr: up == 0
> >     for: 10m
> >     labels:
> >       severity: "CRITICAL"
> >     annotations:
> >       summary: "Exporter down on *{{ $labels.instance }}*"
> >       description: "Not able to fetch application metrics from *{{ $labels.instance }}*"
> >
> > - the ALERTS metric shows what is pending or firing over time
> > >> But the problem is that one of my ExporterDown alerts has been
> > active for the past 10 days; there is no genuine reason for the alert
> > to go to a resolved state.
> >
> What do you have evaluation_interval set to in Prometheus, and
> resolve_timeout in Alertmanager?
>
>> My evaluation interval is 1m, whereas my scrape timeout and scrape
interval are both 25s. The resolve timeout in Alertmanager is 5m.

>
> Is the alert definitely being resolved, as in you are getting a resolved
> email/notification, or could it just be an email/notification for a long
> running alert? - you should get another email/notification every now and
> then based on repeat_interval.
>
>> Yes, I suspected that too at first, but I am logging each and every
alert notification and found that I am indeed getting a resolved
notification for that alert, followed by a firing notification the very
next second.


-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/CAFGi5vBO-T%3DxnZH5FSJBAKTLJp-%2BMDm4fWoHyc_HbwPh4UU3-g%40mail.gmail.com.
