On 26 Aug 23:40, [email protected] wrote:
> Hi Team,
>
> we predominantly use multiple consul_sd_configs and have a service regex to
> scrape metrics across 1000's of target nodes
>
> Issues:
>
> Sometimes we observe that consul dns malfunctions and due to which
> prometheus drops connectivity to consul server and while this happens all
> the metric ingestion were getting dropped and getting into a non-healthy
> state
>
>
>    - How do we mitigate this and whether do we have any metric to quantify
>    and alert when there is a disconnect during consul service discovery ?
You can use the prometheus_sd_consul_rpc_failures_total metric.

>
>
> Any sort of pointers/help would be highly appreciated here.
>
> Regards,
> Dinesh
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-users/70a767fc-88a7-4fb2-af01-5b95484cd389n%40googlegroups.com.

--
Julien Pivotto
@roidelapluie

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/20200827083320.GA92802%40oxygen.
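
For the alerting half of the question, the prometheus_sd_consul_rpc_failures_total counter mentioned in the reply could be wired into a rule file roughly like the one below. This is only a sketch: the group name, alert name, 5m window, 10m hold time, and severity label are placeholders I made up, not anything recommended in this thread, so tune them to your environment.

```yaml
# Hypothetical rule file (e.g. consul_sd_alerts.yml), loaded via rule_files
# in prometheus.yml. Names and thresholds are illustrative assumptions.
groups:
  - name: consul-service-discovery
    rules:
      - alert: ConsulSDRPCFailures
        # Fires if Prometheus recorded any failed Consul RPC calls
        # during the last 5 minutes.
        expr: increase(prometheus_sd_consul_rpc_failures_total[5m]) > 0
        # Require the condition to hold for 10 minutes to ignore brief blips.
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Consul service discovery RPC calls are failing on {{ $labels.instance }}"
```

The counter is exposed by the Prometheus server itself, so the rule only works if Prometheus scrapes its own /metrics endpoint.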

