I don't know about the original poster, but in my case, I have 2 
instances of Prometheus using the same config file, with an external 
label set to "prom" on each. Both are pointed at 2 Alertmanagers which 
are clustered, and if I create a silence in one I can see it appear in 
the other. However, I am still getting duplicate alerts: the first one 
fires and the second fires 15-20s later. My group_wait in Alertmanager 
is 45s and the group_interval is 1m. The scrape and evaluation intervals 
are both 5s. What could be causing this behaviour?
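
For reference, this is roughly the shape of the relevant config on both 
instances (the label name and the Alertmanager targets below are 
illustrative placeholders, not my real values):

    # prometheus.yml -- identical on both replicas
    global:
      external_labels:
        cluster: prom          # same value on both instances
    alerting:
      alertmanagers:
        - static_configs:
            - targets:         # both members of the Alertmanager cluster
                - alertmanager-1:9093
                - alertmanager-2:9093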
On Tuesday, March 23, 2021 at 3:38:00 PM UTC-7 [email protected] wrote:

> Do you have a replica label (in external_labels in Prometheus) that 
> distinguishes the two replicas so that the alerts no longer look the same 
> to Alertmanager?
>
> In that case, you would have to drop that first, see "Removing HA Replica 
> Labels from Alerts" under 
> https://training.promlabs.com/training/relabeling/writing-relabeling-rules/keeping-and-dropping-labels
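>
> For example, a minimal version of such a rule in the Prometheus config 
> (assuming the HA replica label is called "replica"):
>
>     alerting:
>       alert_relabel_configs:
>         - regex: replica
>           action: labeldrop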
>
> If that is not the problem, maybe your AM instances aren't talking to each 
> other correctly. If you create a silence in one of the AM replicas, does it 
> appear in the other? There should also be log messages about the peer 
> discovery, as well as on the /status page of Alertmanager.
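>
> For example, with amtool (the URLs below are placeholders for your two 
> replicas):
>
>     amtool --alertmanager.url=http://replica-1:9093 silence add \
>       alertname=WatchdogTest --comment="HA dedup test" --duration=5m
>     amtool --alertmanager.url=http://replica-2:9093 silence query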
>
> On Tue, Mar 23, 2021 at 11:11 PM Nolan Crooks <[email protected]> 
> wrote:
>
>> I am also having this issue. I am running identical instances of 
>> Prometheus on the same version though.
>> On Friday, November 6, 2020 at 8:02:31 AM UTC-8 [email protected] 
>> wrote:
>>
>>> Hi. I have an HA Prometheus setup with 2 instances (x.x.x.x and y.y.y.y) 
>>> scraping exactly the same targets. Alertmanager is also running on the 
>>> respective machines, and a mesh is created between them. But I am 
>>> observing that all the alerts are getting duplicated and I am receiving 
>>> every alert twice. Alertmanager version: 0.21.0.
>>> /usr/local/bin/alertmanager --config.file 
>>> /etc/alertmanager/alertmanager.yml --storage.path /mnt/vol2/alertmanager 
>>> --data.retention=120h --log.level=debug --web.listen-address=x.x.x.x:9093 
>>> --cluster.listen-address=x.x.x.x:9094 --cluster.peer=y.y.y.y:9094
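>>>
>>> and on the second machine the mirrored command:
>>>
>>> /usr/local/bin/alertmanager --config.file 
>>> /etc/alertmanager/alertmanager.yml --storage.path /mnt/vol2/alertmanager 
>>> --data.retention=120h --log.level=debug --web.listen-address=y.y.y.y:9093 
>>> --cluster.listen-address=y.y.y.y:9094 --cluster.peer=x.x.x.x:9094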
>>>
>>> Oh, one thing that just popped into my head: for this temporary testing 
>>> period I am running different versions of Prometheus on the two 
>>> instances, 2.12.0 on one and 2.20.1 on the other. Could this also be 
>>> causing the duplicates?
>>>
>>> Thanks in advance!
>>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/c8245b99-376b-44c2-9a6f-433912a294den%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/c8245b99-376b-44c2-9a6f-433912a294den%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> -- 
> Julius Volz
> PromLabs - promlabs.com
>
