I've recently started monitoring a large fleet of hardware devices using a 
combination of blackbox, snmp, node, and json exporters.
I started out using the *up* metric, but I noticed that with blackbox 
ping, *up* is *always* 1 even when the device is offline, presumably 
because it only reflects whether the scrape of the exporter itself 
succeeded.  So I plan to switch to *probe_success* instead.  But I'm 
wondering about the implications of this when mixed with other exporters.  
For example, json-exporter does not offer a *probe_success* metric; 
instead it returns *up*=0 when the target times out.
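
To make the idea concrete, here is a rough sketch of the fallback I have in 
mind, modelled in Python rather than PromQL: use *probe_success* where it 
exists and fall back to *up* everywhere else.  The job names, targets, 
sample values, and the function name unified_health are all made up for 
illustration.

```python
# Hypothetical samples keyed by (job, instance); values mimic Prometheus gauges.
up = {
    ("blackbox_icmp", "10.0.0.5"): 1.0,        # scrape of the exporter succeeded
    ("json", "http://10.0.0.7/status"): 0.0,   # target timed out
    ("node", "10.0.0.9:9100"): 1.0,            # node exporter reachable
}
probe_success = {
    ("blackbox_icmp", "10.0.0.5"): 0.0,        # the ping itself failed
}


def unified_health(up, probe_success):
    """Prefer probe_success where a series exists, otherwise fall back to up."""
    health = dict(up)              # start from up for every target
    health.update(probe_success)   # override with probe_success where present
    return health


if __name__ == "__main__":
    for (job, instance), value in sorted(unified_health(up, probe_success).items()):
        print(f"{job:15s} {instance:28s} {value}")
```

I assume a PromQL expression along the lines of `probe_success or up` would 
behave similarly, since the two series carry the same job and instance 
labels, but I have not verified that against my setup.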

My goal is to build a Grafana dashboard and alerts that monitor a 
combination of blackbox and other exporters.  For context, when certain 
devices crash, they remain pingable, but they report their failed state via 
a REST API.  So I'm pointing json-exporter at an HTTP endpoint on those 
devices.  I'm struggling to come up with a unified way of monitoring all 
these different device types.
