On Wed, Mar 11, 2020, at 13:00, Yagyansh S. Kumar wrote:
> Hi. I have configured an alert for CPU load on my servers, and my current
> threshold is 8 for warning and 10 for critical.
> I want to make this threshold dynamic, i.e. I want the critical alert when the
> CPU load becomes greater than the number of CPU cores of the machine.
> E.g. for a server with 8 CPU cores, I want a critical alert when CPU load > 8,
> and for a machine with 16 CPU cores, I want a critical alert when CPU load >
> 16.
count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
is one way of counting the number of CPUs on a server, so:
(node_load5 / count without (cpu, mode)
(node_cpu_seconds_total{mode="system"})) > 1
would alert you when the (5 minute) load average is greater than the number of
CPUs.
I find this alert to be noisy - your experience may differ.
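
As a rough sketch, the expression above could be wrapped in an alerting rule
along the following lines. The group name, alert name, severity label, and the
10m "for" duration are placeholders I made up for illustration, not values
from any existing setup. The "for" clause is one possible way to cut down on
the noise, since the alert then fires only if the condition holds continuously
for that long:

    groups:
      - name: cpu-load              # hypothetical group name
        rules:
          - alert: HighCpuLoad      # hypothetical alert name
            # 5 minute load average divided by the number of CPUs.
            # A result above 1 means load exceeds the core count.
            expr: |
              node_load5
                / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
                > 1
            for: 10m                # assumed duration, tune to taste
            labels:
              severity: critical    # assumed label convention
            annotations:
              summary: "Load average above CPU count on {{ $labels.instance }}"

A warning-level rule could presumably reuse the same expression with a lower
ratio in place of 1.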
--
Harald