On Thursday, 2 December 2021 at 10:35:55 UTC Dj Fox wrote:

> @Brian
> Can you confirm that one of the main reasons an Alertmanager in a cluster 
> needs to handle the same set of alerts as the others (and hence be plugged 
> into all the same Prometheus servers) is the way deduplication works?
>

Kind of, although I think it's more for redundancy.  If Prometheus only 
sent to one Alertmanager and that one failed, you wouldn't get your 
alert.  So it sends to all of them, and those which are up should quickly 
converge on a consistent view.
 

> - It would be cool to be able to tell each alertmanager: this is my 
> "alert-family". The deduplication mechanism would then only occur among the 
> members sharing the same "alert-family" value. That way, we are not forced 
> to connect 20 Proms to every alertmanager anymore, and all of them can 
> still gossip the valuable "silences".
>

That's basically having multiple alertmanager clusters.
 

> - And/or being able to give to amtool a list of clusters so that it can 
> also handle several clusters.
>

Well, amtool is scriptable, and tools like karma will do this for you.
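
To give an idea of the scripted route, here is a rough sketch that asks 
several Alertmanager clusters for their alerts over the v2 HTTP API 
(GET /api/v2/alerts) and merges the results by fingerprint.  The cluster 
names and URLs are placeholders I made up, and I'm assuming the same label 
set gets the same fingerprint on every cluster, so treat it as a starting 
point rather than a finished tool.

```python
import json
from urllib.request import urlopen

# Hypothetical base URLs, one per Alertmanager cluster (placeholders).
CLUSTERS = {
    "eu": "http://alertmanager-eu.example.com:9093",
    "us": "http://alertmanager-us.example.com:9093",
}


def fetch_alerts(base_url, timeout=5):
    """Fetch the alert list from one cluster via the v2 HTTP API."""
    with urlopen(f"{base_url}/api/v2/alerts", timeout=timeout) as resp:
        return json.load(resp)


def merge_alerts(alerts_by_cluster):
    """Merge alerts from several clusters, keyed by fingerprint.

    Each merged entry keeps the labels and records which clusters
    reported the alert.
    """
    merged = {}
    for cluster, alerts in alerts_by_cluster.items():
        for alert in alerts:
            entry = merged.setdefault(
                alert["fingerprint"],
                {"labels": alert["labels"], "clusters": []},
            )
            entry["clusters"].append(cluster)
    return merged


def main():
    """Query every configured cluster and print one line per alert."""
    alerts_by_cluster = {
        name: fetch_alerts(url) for name, url in CLUSTERS.items()
    }
    for fingerprint, entry in sorted(merge_alerts(alerts_by_cluster).items()):
        name = entry["labels"].get("alertname", "?")
        print(fingerprint, name, ",".join(entry["clusters"]))
```

Calling main() prints the fingerprint, the alertname and the clusters that 
hold each alert.  The merge step is kept separate from the HTTP call so it 
can be checked against hand-written data without a running Alertmanager.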

Alternatively, you could follow MR's suggested approach and build a single 
global Alertmanager cluster, say three nodes in total, and point all of 
your Prometheus servers at those three.  I think that's reasonable: if a 
region becomes so isolated that it can't talk to that central Alertmanager 
cluster, then it's probably so isolated that it couldn't send out an alert 
via email either.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/9c1c58f4-4662-4862-95d9-8a0987977d5dn%40googlegroups.com.
