Hi,

On 8/20/20 12:47 PM, 'azha...@googlemail.com' via Prometheus Users wrote:
> I have 2 alerts 
> 
> - The first being to fire if CPU is more than 70% (WMI)
> 
> - The second  to report whether an instance is down
> 
> 100 - (avg by(instance) (rate(wmi_cpu_time_total{mode="idle"}[2m])) * 100) > 70
> 
> up == 0
> 
> 
> After generating a CPU spike, I can confirm that my client CPU is indeed at 100%
> 
> @echo off 
> :loop 
> goto loop 
> 
> However, I get the second alert (up == 0) firing and reporting the
> instance as down despite it not being down. The strange thing is that
> this behavior is intermittent: occasionally I do get the CPU alert
> firing instead of the instance down alert.
> 
> I'm just wondering why, when the CPU is clearly maxed out at 100%, the
> instance is reported as down, and why sometimes this isn't the case.

So you are getting the Instance Down alert instead of the High CPU alert?

The up metric is special. It is generated by Prometheus itself and
always exists for anything which is a scrape target.

The fact that your CPU alert does not fire while up == 0 probably
indicates that Prometheus fails to receive metrics from your
wmi_exporter. We can only speculate why that is. Maybe the load is so
high that the scrape times out?
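
To illustrate why the CPU alert can stay silent: when a scrape fails,
Prometheus still records up = 0 for the target, but no new
wmi_cpu_time_total samples arrive. If several scrapes in a row fail, the
rate() in your CPU expression can end up with too few samples to return
a value, so only the down alert has something to fire on. A rough sketch
of the two rules with a "for" clause, so that a single slow scrape does
not fire the down alert right away (the group name, alert names and
durations are my assumptions, not taken from your setup):

```yaml
groups:
  - name: windows-host-alerts   # hypothetical group name
    rules:
      - alert: HighCpu          # hypothetical alert name
        # Expression copied from the original post.
        expr: 100 - (avg by(instance) (rate(wmi_cpu_time_total{mode="idle"}[2m])) * 100) > 70
        for: 5m                 # assumed duration, tune to taste
      - alert: InstanceDown     # hypothetical alert name
        expr: up == 0
        for: 2m                 # assumed; tolerates a few failed scrapes
```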

You can check the Prometheus Web UI Targets page to see the last scrape
error for your target. If it is indeed a timeout ("deadline exceeded")
you could try increasing the scrape_timeout option to make Prometheus
wait longer for the exporter to reply.
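
Something along these lines might work as a starting point. It is only a
sketch: the job name, target address and the values are assumptions on
my side (9182 should be the exporter's default port, but please check
your setup). Note that scrape_timeout must not be larger than
scrape_interval, otherwise Prometheus rejects the config:

```yaml
scrape_configs:
  - job_name: "windows"         # hypothetical job name
    scrape_interval: 30s        # assumed value
    scrape_timeout: 25s         # assumed value; must be <= scrape_interval
    static_configs:
      - targets: ["windows-host.example:9182"]   # hypothetical target address
```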

Side note: If I remember correctly, the wmi_exporter has been renamed to
windows_exporter (along with the metrics). This might mean that you are
running an older version. Maybe updating helps if the newer version is
more performant (I don't know, just guessing).


Kind regards,
Christian
