Hi,

We have an environment with poor connectivity between the Prometheus 
server and the Alertmanager (i.e. the Prometheus server is in a remote 
location). 

When an alert fires from the rules installed on the local Prometheus server 
(as best practices advise), we receive a notification and all is well. 
However, when we lose connectivity for more than 5 minutes (the time 
configured in Prometheus), the alert is "resolved", only to begin firing 
again when connectivity returns.
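
To make the behaviour concrete, here is a minimal sketch of what we think is 
happening. It is a toy model, not Alertmanager's actual code: it assumes an 
alert counts as firing only while it has been re-sent within the last 5 
minutes. The names (Alert, state, DiskFull) and the timestamps are made up 
for illustration.

```python
from dataclasses import dataclass

RESOLVE_AFTER_S = 5 * 60  # assumed: the 5 minute window from our setup


@dataclass
class Alert:
    name: str
    last_seen_s: float  # time the alert was last received from Prometheus


def state(alert: Alert, now_s: float, resolve_after_s: float = RESOLVE_AFTER_S) -> str:
    """Return "firing" while the alert was refreshed recently, else "resolved"."""
    return "firing" if now_s - alert.last_seen_s < resolve_after_s else "resolved"


alert = Alert(name="DiskFull", last_seen_s=0.0)  # hypothetical alert
timeline = []
# Prometheus re-sends at t=0 and t=60, the link is down until t=600,
# and the underlying condition never actually clears.
for now_s, received in [(0, True), (60, True), (240, False), (420, False), (600, True)]:
    if received:
        alert.last_seen_s = now_s
    timeline.append((now_s, state(alert, now_s)))

for now_s, s in timeline:
    print(f"t={now_s:>3}s  {s}")
```

In this model the alert flips to "resolved" at t=420 and back to "firing" at 
t=600, even though nothing changed on the Prometheus side, which matches 
the flapping we see.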

Does anyone know of a way to prevent this? It sounds like there should 
perhaps be an option for an "explicit" resolution from Prometheus to 
Alertmanager, i.e. a request from Prometheus to Alertmanager that 
specifically triggers a resolution, with the alert otherwise assumed to 
still be firing. Or should this be covered by an inhibition rule based on 
whether the Prometheus server is responding?

Thanks in advance :)
