Any thoughts on this, anyone?

On Friday, 5 August 2022 at 11:38:06 UTC+1 Ionel Sirbu wrote:
> Hello all,
>
> We've recently configured our alertmanagers to be HA as per the specs:
>
> - 3 instances, using a kubernetes statefulset;
> - both TCP & UDP opened for the HA cluster port:
>
>     ports:
>     - containerPort: 8001
>       name: service
>       protocol: TCP
>     - containerPort: 8002
>       name: ha-tcp
>       protocol: TCP
>     - containerPort: 8002
>       name: ha-udp
>       protocol: UDP
>
> - all 3 instances point to instance 0 for clustering (I assumed there
>   wouldn't be a problem with instance 0 pointing to itself):
>
>     spec:
>       containers:
>       - args:
>         // ...
>         - --cluster.peer=testprom-am-0.testprom-am.default.svc.cluster.local:8002
>         image: quay.io/prometheus/alertmanager:v0.23.0
>
> - prometheus points to the 3 alertmanager instances:
>
>     alertmanagers:
>     - static_configs:
>       - targets:
>         - testprom-am-0.testprom-am.default.svc.cluster.local:8001
>         - testprom-am-1.testprom-am.default.svc.cluster.local:8001
>         - testprom-am-2.testprom-am.default.svc.cluster.local:8001
>
> However, against all that, we keep getting errors like this rather often
> (e.g. 124 within 30 minutes):
>
>     level=debug ts=2022-08-04T12:03:19.284Z caller=cluster.go:329 component=cluster memberlist="2022/08/04 12:03:19 [DEBUG] memberlist: Failed ping: 01G9M3WYRFHA0DCCWRVERYJX2A (timeout reached)\n"
>
> Is that something to worry about? Is there anything more that needs to be
> configured with regards to HA?
>
> With the exception of a particular case, alerts seem to work just fine.
> It's when we do a rolling upgrade to the kubernetes cluster that previous
> alerts fire again all of a sudden. Any idea what could be causing that?
>
> Many thanks,
> Ionel

