[prometheus-users] Spreading single alertmanager cluster nodes over multiple geographical regions

Al Thu, 27 Feb 2025 05:44:13 -0800

The alertmanager documentation states that each Prometheus instance should
send the alerts to each AM instance in a cluster:
https://github.com/prometheus/alertmanager/blob/main/README.md#high-availability
but from what I can see, there's no explicit mention of distributing nodes
over a large geographical region (WAN instead of LAN).



Brian Candler also mentions in this post that we shouldn't attempt any
gossip or other network communications across regions:
https://groups.google.com/g/prometheus-users/c/vyHn-727Vp0

Unfortunately I can't seem to find any documentation clearly stating that
an alertmanager cluster spread over multiple regions (for example, 2 nodes
in a DC in North America and 2 other nodes in a DC in Europe) will not work
due to specific reasons.  If a relatively high speed network exists
between both regions and it's acceptable to potentially have a slightly
higher latency, wouldn't it be feasible to have a cluster distributed this
way?  Considering the eventually consistent nature of gossip, why isn't
this type of AM cluster more common?  I understand that the added latency
could potentially lead to duplicate alerts being sent to the destination
receiver, but given that receivers such as VictorOps would have the incident
triggered with the same ID, these should be essentially unaffected based on
my understanding?
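
To illustrate the reasoning above, here is a minimal sketch in Python. It
assumes a hypothetical receiver that keys incidents on an ID; the Receiver
class, its notify() method and the incident ID are made up for illustration
and are not the VictorOps or Alertmanager API.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical receiver that keys incidents on an ID (not a real API)."""
    incidents: dict = field(default_factory=dict)
    deliveries: int = 0

    def notify(self, incident_id: str, message: str) -> None:
        # Every delivery is counted, but an already-known ID only updates
        # the existing incident instead of opening a second one.
        self.deliveries += 1
        self.incidents[incident_id] = message


receiver = Receiver()

# Assumption: two Alertmanager nodes in different regions both notify for
# the same alert because gossip about the first notification arrived late.
receiver.notify("HighLatency/api", "fired, sent by a node in North America")
receiver.notify("HighLatency/api", "fired, sent by a node in Europe")

print(receiver.deliveries)      # number of notifications delivered
print(len(receiver.incidents))  # number of distinct incidents opened
```

If the receiver really behaves this way, the duplicate delivery would cost
an extra request but would not open a second incident.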

The main purpose of this kind of configuration would be to address the
following:
- to have a single cluster in which silences can be managed
- to ensure global redundancy if one region should become unavailable


I would appreciate any feedback or advice on this topic.


Thank you

