Thanks Brian.

>
> Can you give some more specific examples?  What metric are you joining
> with - perhaps node_uname_info?
>
      - alert: HighCpuLoadCrit
        expr: (node_load15 > (2 * count without (cpu, mode)
              (node_cpu_seconds_total{mode="system"}))) * on(instance)
              group_left(nodename) node_uname_info
 

> Note that the "up" metric will still exist (with a value of 0) when a
> scrape fails - this means:
> (a) you can join on it, and
>
 
    The "up" metric will exist, but if node_exporter itself is down, it won't
expose node_uname_info at that time, right? So I won't get the "nodename"
label from node_uname_info.
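
    Something like this is what I have in mind for the up == 0 rule that
avoids the join and relies only on the instance label (the job name "node",
the duration and the severity are just assumptions on my side):

      - alert: NodeExporterDown
        # "up" is generated by Prometheus itself, so it still exists when the scrape fails
        expr: up{job="node"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "node_exporter on {{ $labels.instance }} is down or unreachable"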

> (b) you can alert on this condition, i.e. scrape failed / node_exporter is
> down.  This is a different condition than "blackbox_exporter says
> host/service is down, but node_exporter is still being scraped".  Hence the
> alerting rule for (up == 0) can be written to avoid the join.  There is
> actually a benefit here: you'll only get one alert when the host goes down,
> instead of lots.

I am using up == 0 only, and I am using it as an inhibition rule as well, but
(up == 0) itself won't give me the hostname. My main aim is to get the
hostname for every alert. But when the server is actually down, node_exporter
will also be down, and again I won't get the nodename label.
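
Roughly, the inhibition I mean is a sketch like this in alertmanager.yml (the
alert name NodeExporterDown and the severity value are assumptions; they would
have to match whatever labels the real alerts carry):

      inhibit_rules:
        - source_match:
            alertname: 'NodeExporterDown'   # the up == 0 alert (name assumed)
          target_match:
            severity: 'warning'             # assumed label on the other per-node alerts
          equal: ['instance']               # only silence alerts for the same instance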
 
   Please correct me if I am wrong anywhere.
