Hi, I am using the amtool client in a Job inside my cluster.
An alert fired and we received the notification in our Slack channel. I then used the CLI (in code that runs inside the Job's Docker image) to create a silence with an `alertname` matcher, and the call returned no error. However, the Alertmanager UI showed no silence, and I received a resolved notification 5 minutes after the firing notification. About 10 minutes later the alert fired and resolved again (again 5 minutes apart).

Why wasn't the silence created? This is not the first time it has happened. Could it be some kind of race condition? Am I right that we can't silence alerts that are not in the firing state? (The alert was firing when I tried to create the silence, though.)

The alert rule:

    name: Orchestrator GRPC Failures for ExternalProcessor Service
    expr: sum(increase(grpc_server_handled_total{grpc_code!~"OK|Canceled",grpc_service="envoy.service.ext_proc.v3.ExternalProcessor"}[5m])) > 0
    for: 5m
    labels:
      severity: WARNING
    annotations:
      dashboard_url: p-R7Hw1Iz
      runbook_url: extension-orchestrator-dashboard
      summary: Failed gRPC calls detected in the Envoy External Processor within the last 5 minutes.
The code for creating the silence:

    // silenceStart, silenceDuration, creatorType and silenceComment are
    // defined elsewhere in the package.
    func postSilence(amCli amclient.Client, matchers []*models.Matcher) error {
        startsAt := strfmt.DateTime(silenceStart)
        endsAt := strfmt.DateTime(silenceStart.Add(silenceDuration))
        createdBy := creatorType
        comment := silenceComment

        silenceParams := silence.NewPostSilencesParams().WithSilence(
            &models.PostableSilence{
                Silence: models.Silence{
                    Matchers:  matchers,
                    StartsAt:  &startsAt,
                    EndsAt:    &endsAt,
                    CreatedBy: &createdBy,
                    Comment:   &comment,
                },
            },
        )

        err := amCli.PostSilence(silenceParams)
        if err != nil {
            return fmt.Errorf("failed on post silence: %w", err)
        }

        log.Print("Silence posted successfully")
        return nil
    }

Thanks in advance,
Saar Zur
SAP Labs