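You won't necessarily see all the failures on that graph. With a 5-second scrape interval, a 6-hour window contains 4,320 scrapes (21,600 seconds / 5), which is more than the number of points the graph fetches. Many of the samples will be skipped over, so a short failure can fall between two plotted points and never appear.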
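To make the arithmetic concrete, here is a rough Python sketch of what happens when a graph picks one sample per step, compared with taking the minimum over each step. It is a simplification, not how Prometheus evaluates range queries internally. The point limit of 1,000 and the positions of the failed scrapes are made-up values for illustration; the real number of points depends on your graphing tool and panel width.

```python
import math

SCRAPE_INTERVAL_S = 5          # from the thread: 5-second scrape interval
WINDOW_S = 6 * 60 * 60         # 6-hour graph window
MAX_POINTS = 1000              # assumption: hypothetical limit on points fetched


def simulate(failed_scrapes):
    """Compare one-sample-per-step plotting with a per-step minimum.

    failed_scrapes is a set of scrape indexes where probe_success was 0.
    """
    total = WINDOW_S // SCRAPE_INTERVAL_S       # number of scrapes in the window
    step = math.ceil(total / MAX_POINTS)        # scrapes covered by each plotted point
    samples = [0 if i in failed_scrapes else 1 for i in range(total)]

    # Plain graph: only one scrape per step is shown, the rest are skipped.
    sampled = samples[::step]

    # Windowed minimum: each plotted point is 0 if any scrape in its step failed.
    windowed_min = [min(samples[i:i + step]) for i in range(0, total, step)]
    return total, step, sampled, windowed_min


if __name__ == "__main__":
    # Hypothetical failures at scrape indexes 7 and 2001.
    total, step, sampled, windowed_min = simulate({7, 2001})
    print("scrapes in window:", total)
    print("scrapes per plotted point:", step)
    print("failures visible when sampling:", sampled.count(0))
    print("failures visible with windowed minimum:", windowed_min.count(0))
```

With these assumed numbers, neither failure lands on a plotted point, so the plain graph looks healthy, while the per-step minimum shows both.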
I suggest you graph this instead:

    min_over_time(probe_success[5m])

Each point is then 0 if any probe in the preceding 5 minutes failed, so failures stay visible however far you zoom out. (Otherwise, you'd need to zoom in much closer and then scroll left and right.)

Once you've sorted that, it becomes easier to compare the two Prometheus servers.

Note: are these two servers talking to the *same* blackbox exporter, i.e. making remote connections over the network? Or does each Prometheus server have its own blackbox exporter? If they are separate blackbox exporter instances, then there's likely some difference between the two, or in the environments in which they are running.

