On Wed, Mar 11, 2020, at 13:00, Yagyansh S. Kumar wrote:
> Hi. I have configured an alert for CPU load on my servers, and my current 
> threshold is 8 for warning and 10 for critical.
> I want to make this threshold dynamic, i.e. I want the critical alert when the 
> CPU load becomes greater than the number of CPU cores of the machine.
> E.g. for a server with 8 CPU cores, I want a critical alert when CPU load > 8, 
> and for a machine with 16 CPU cores, I want a critical alert when CPU load > 
> 16.

 count without (cpu, mode) (node_cpu_seconds_total{mode="system"})

is one way of counting the number of CPUs on a server, so:

 (node_load5 / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})) > 1

would alert you when the (5 minute) load average is greater than the number of 
CPUs.
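
As a rough sketch, the expression above could be wrapped in an alerting rule along these lines. The group name, alert name, "for" duration, and annotation text are all placeholders chosen for illustration, not anything taken from the original setup:

```yaml
groups:
  - name: cpu-load                 # hypothetical group name
    rules:
      - alert: LoadAboveCpuCount   # hypothetical alert name
        # Load average divided by the per-instance CPU count; both sides
        # keep the same labels (e.g. instance, job), so they match one-to-one.
        expr: |
          node_load5
            / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
          > 1
        for: 5m                    # assumed hold time; tune to taste
        labels:
          severity: critical
        annotations:
          summary: "5-minute load average exceeds the CPU count on {{ $labels.instance }}"
```

Because the threshold is a ratio, the same rule covers the 8-core and the 16-core machines without per-host values.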

I find this alert to be noisy - your experience may differ.
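
If it does turn out noisy, one possible (untested) way to quieten it is to use the 15-minute load average and leave some headroom above the CPU count. The 1.5 factor below is an arbitrary assumption, not a recommendation:

```promql
# 15-minute load average more than 1.5x the number of CPUs
node_load15
  / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
> 1.5
```

A longer "for" duration on the alerting rule would have a similar damping effect.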

-- 
Harald

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/055cec3f-0ee4-4154-959f-b0869d2d97c5%40www.fastmail.com.
