Unfortunately this won't work for us, for several reasons:

* Our environment is highly dynamic; we take servers in and out of production frequently.
* We would then need an aggregator like thanos-query if we want to see our data in one place.
* We also want to avoid, as much as we can, time series that keep appearing and disappearing.
On Fri, Feb 19, 2021 at 10:48 PM Stuart Clark <[email protected]> wrote:

> On 19/02/2021 19:43, Badreddin Aboubakr wrote:
> >
> > Hello,
> >
> > We use Prometheus to monitor our infrastructure (hypervisors, gateways,
> > storage servers, etc.). Scrape targets are sourced from a Postgres
> > database, which contains additional information about the “in production”
> > state of the target. In the beginning we had a metadata metric which
> > indicated the state of the server as an `enum` metric.
> >
> > By joining the state metric in each alerting rule and then dropping the
> > alerts that have a specific state, we were able to suppress unneeded
> > alerts.
> >
> > With the growing number of alerting rules and states, joining on these
> > metrics in every alerting rule became so expensive that we wrote some
> > recording rules which keep evaluating the enum metric and produce an enum
> > metric with lower cardinality: production (where alerts shall pass to
> > their receivers) and everything else (which will be dropped at the
> > Alertmanager step).
> >
> > So again we join on these metrics and drop the alerts which are
> > non-production.
> >
> > This is not going to scale either, but it was a temporary solution while
> > our alerting rules keep growing.
> >
> > So we discussed some solutions:
> >
> > * We can set silences and remove them on state change using the
> >   Alertmanager API. This approach is too dynamic, however (I don’t know
> >   if the Alertmanager API was designed for this purpose; maybe it is).
> >   Will that scale with the number of silences and hosts?
> >
> > * We can develop a kind of proxy, deployed between Prometheus and
> >   Alertmanager, which drops alerts for hosts in a non-production state.
> >   This approach is dangerous: if the proxy fails, no alerts will reach
> >   Alertmanager.
> >
> > * We can put the proxy on the notification path instead. This makes it a
> >   bit more complicated, as the proxy has to understand receivers, etc.
> >
> > PS: We still want to scrape and monitor the servers which are not in the
> > production state.
> >
> > We would be really thankful for any suggestions or ideas.
>
> Couldn't you run two sets of Prometheus servers to monitor the production
> infrastructure separately from the non-production? Then just don't have
> alerting rules on, or Alertmanagers connected to, the non-production
> servers.
>
> --
> Stuart Clark

--
Badreddin Aboubakr
GAPS (IONOS Cloud)

