Hi Team, we predominantly use multiple consul_sd_configs with a service regex to scrape metrics across thousands of target nodes.
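
For context, a setup of this shape looks roughly like the sketch below. This is only an illustration: the job name, Consul server addresses, and the regex are placeholders, not our real values.

```yaml
scrape_configs:
  - job_name: consul-services                  # hypothetical job name
    consul_sd_configs:
      # Several Consul servers are listed; both addresses are placeholders.
      - server: consul-a.example.com:8500
      - server: consul-b.example.com:8500
    relabel_configs:
      # Keep only targets whose Consul service name matches the regex.
      - source_labels: [__meta_consul_service]
        regex: ".*-metrics"                    # placeholder service regex
        action: keep
```
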
Issue: Sometimes we observe that Consul DNS malfunctions, and as a result Prometheus loses connectivity to the Consul server. While this happens, all metric ingestion gets dropped and goes into an unhealthy state.

How can we mitigate this, and is there a metric we can use to quantify and alert on a disconnect during Consul service discovery?

Any pointers or help would be highly appreciated.

Regards,
Dinesh

