Take a look also at the following projects:

* Promxy <https://github.com/jacksontj/promxy> - it allows evaluating alerting rules across multiple Prometheus instances. See these docs <https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy> for details.
* VictoriaMetrics <https://github.com/VictoriaMetrics/VictoriaMetrics> + vmalert <https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert>. Multiple Prometheus instances may write data into a centralized VictoriaMetrics via the remote_write API, then vmalert may be used for alerting on top of all the collected metrics in VictoriaMetrics.
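A rough sketch of the second option, in case it helps. It assumes a single-node VictoriaMetrics reachable as "victoriametrics" on its default port 8428 and an Alertmanager reachable as "alertmanager" on port 9093. Both hostnames and the example alert are placeholders, not taken from your setup:

```yaml
# prometheus.yml on each Prometheus shard.
# The hostname "victoriametrics" is an assumption; adjust to your environment.
remote_write:
  - url: http://victoriametrics:8428/api/v1/write

---
# rules.yml loaded by vmalert (same format as Prometheus rule files).
# The alert below is a hypothetical example, not a recommendation.
groups:
  - name: example
    rules:
      - alert: TargetDown
        expr: up == 0
        for: 5m
        labels:
          severity: warning

# vmalert could then be started roughly like this (hostnames are assumptions):
#   vmalert -rule=rules.yml \
#     -datasource.url=http://victoriametrics:8428 \
#     -notifier.url=http://alertmanager:9093
```

With this layout the rule expressions are evaluated against the data from all shards in one place, so no remote read between Prometheus instances is needed for alerting.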
On Wed, May 27, 2020 at 7:46 PM Rajesh Reddy Nachireddi <[email protected]> wrote:

> Hi Ben,
>
> Does the latest version of Cortex/Thanos support alerting with multiple
> shards of Prometheus? Thanos Ruler wasn't ready for production to evaluate
> expressions across the Prometheus instances. Do we have any document or
> blog about this?
>
> Thanks,
> Rajesh
>
> On Tue, May 26, 2020 at 11:37 AM Ben Kochie <[email protected]> wrote:
>
>> This is probably a case where you would want to look into Thanos or
>> Cortex to provide a larger aggregation layer on top of multiple Prometheus
>> servers.
>>
>> On Sun, May 17, 2020 at 11:53 AM Rajesh Reddy Nachireddi <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Basically, we have a large networking setup with 10k devices. We are
>>> hitting 1M metrics every second from just 20% of the devices, so we have
>>> 5 Prometheus instances and one global Prometheus which uses remote read
>>> to handle alert rule evaluation, and Thanos Querier for visualisation in
>>> Grafana.
>>>
>>> We have segregated the devices by device IP range, one range per
>>> Prometheus instance.
>>>
>>> So we have one aggregator which uses remote read from all the
>>> individual Prometheus instances.
>>>
>>> 1. Will the remote read cause an issue with loading the large time
>>> series over the wire every 1 min?
>>> 2. Is it CPU or memory intensive?
>>>
>>> What is the best design strategy to handle this scale and alerting
>>> across the devices or metrics?
>>>
>>> Regards,
>>>
>>> Rajesh
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/CAEyhnp%2BfG8YvciR4-30D%2BzsDzg_kF%2BKkJUavdbyGCxoz-97q_A%40mail.gmail.com.

--
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

