Maybe I'll refine the threshold even further, but for now this works. Thanks
a lot for the help.
On Wednesday, March 11, 2020 at 10:47:14 PM UTC+5:30, Harald Koch wrote:
>
>
>
> On Wed, Mar 11, 2020, at 13:00, Yagyansh S. Kumar wrote:
>
> Hi. I have configured an alert for CPU load on my servers, and my current
> threshold is 8 for warning and 10 for critical.
> I want to make this threshold dynamic, i.e., I want the critical alert when
> the CPU load becomes greater than the number of CPU cores of the machine.
> E.g., for a server with 8 CPU cores, I want a critical alert when CPU load >
> 8, and for a machine with 16 CPU cores, I want a critical alert when CPU
> load > 16.
>
>
> count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
>
> is one way of counting the number of CPUs on a server, so:
>
> (node_load5 / count without (cpu, mode)
> (node_cpu_seconds_total{mode="system"})) > 1
>
> would alert you when the (5 minute) load average is greater than the
> number of CPUs.
>
> I find this alert to be noisy - your experience may differ.
>
> --
> Harald
>
>
>
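
A minimal sketch of how the expression above might be wrapped in a Prometheus
alerting rule. The group name, alert name, `for: 10m` duration, severity label,
and annotation text are illustrative assumptions, not something from this
thread. The `for` clause is one common way to cut down on the noise mentioned
above, since the alert then fires only if the condition holds continuously for
that long.

```yaml
groups:
  - name: cpu-load                      # hypothetical group name
    rules:
      - alert: HostLoadAboveCpuCount    # hypothetical alert name
        # 5-minute load average divided by the number of CPUs per instance;
        # a ratio above 1 means the load exceeds the CPU count.
        expr: |
          node_load5
            / count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
            > 1
        for: 10m                        # assumed duration; tune to taste
        labels:
          severity: critical            # assumed label convention
        annotations:
          summary: "Load average exceeds CPU count on {{ $labels.instance }}"
```

A rules file like this can be validated with `promtool check rules <file>`
before loading it.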