On 11 Mar 10:59, Yagyansh S. Kumar wrote:
> I mean it doesn't work in giving me the actual Load value.
> The expression you mentioned will be perfect in defining the threshold
> but the value that this expression will give will be (Actual Load Value -
> Number of CPU Cores).
> How do I still get the actual Load value to print in the alert with
> threshold still being the Number of Cores?

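The expression I gave you will do that. A comparison operator without the
bool modifier only filters the left-hand side, so $value is the actual
node_load5 value:

node_load5 > count without (cpu, mode) (node_cpu_seconds_total{mode="system"})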
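
As a minimal sketch, the expression could sit in an alerting rule like the
one below, with {{ $value }} printing the load in the annotation. The group
name, alert name, "for" duration, severity label and annotation text are
assumptions for illustration, not taken from this thread.

```yaml
groups:
  - name: node-load              # hypothetical group name
    rules:
      - alert: HighCpuLoad       # hypothetical alert name
        # The right-hand side counts one series per CPU, i.e. the core count
        # per instance. The comparison keeps only node_load5 samples above
        # that count, and each kept sample carries the node_load5 value.
        expr: node_load5 > count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
        for: 5m                  # assumed duration, tune to taste
        labels:
          severity: critical
        annotations:
          summary: "5m load on {{ $labels.instance }} is {{ $value }}"
```
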
>
> On Wednesday, March 11, 2020 at 11:23:09 PM UTC+5:30, Yagyansh S. Kumar
> wrote:
> >
> > Thanks for the response Julien, but I have already tried the query that
> > you have mentioned, but it doesn't work.
> >
> > On Wednesday, March 11, 2020 at 11:21:35 PM UTC+5:30, Julien Pivotto wrote:
> >>
> >> On 11 Mar 10:49, Yagyansh S. Kumar wrote:
> >> > I have one more small query.
> >> > If I use this expression to do my alerting, the value that I will get when
> >> > I use $value in my alert will be Load per CPU Core.
> >> > How to get the actual CPU Load value itself while using the expression
> >> > mentioned by you.
> >> >
> >> > On Wednesday, March 11, 2020 at 11:07:21 PM UTC+5:30, Yagyansh S. Kumar
> >> > wrote:
> >> > >
> >> > > Maybe I'll refine the threshold even further but for now this works.
> >> > > Thanks a lot for help.
> >> > >
> >> > > On Wednesday, March 11, 2020 at 10:47:14 PM UTC+5:30, Harald Koch wrote:
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Wed, Mar 11, 2020, at 13:00, Yagyansh S. Kumar wrote:
> >> > >>
> >> > >> Hi. I have configured alert for CPU Load for my servers and my current
> >> > >> threshold is 8 for warning and 10 for critical.
> >> > >> I want to make this threshold dynamic i.e I want the critical alert when
> >> > >> the CPU Load becomes greater than the number of CPU Cores of the machine.
> >> > >> Eg. For a server with 8 CPU cores, I want a critical alert when CPU load > 8
> >> > >> and for a machine with 16 CPU cores, I want a critical alert when CPU
> >> > >> Load > 16.
> >> > >>
> >> > >>
> >> > >> count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
> >> > >>
> >> > >> is one way of counting the number of CPUs on a server, so:
> >> > >>
> >>
> >> Hi,
> >>
> >> you can use
> >>
> >> node_load5 > count without (cpu, mode) (node_cpu_seconds_total{mode="system"})
> >>
> >> regards,
> >>
> >> > >> (node_load5 / count without (cpu, mode)
> >> > >> (node_cpu_seconds_total{mode="system"})) > 1
> >> > >>
> >> > >> would alert you when the (5 minute) load average is greater than the
> >> > >> number of CPUs.
> >> > >>
> >> > >> I find this alert to be noisy - your experience may differ.
> >> > >>
> >> > >> --
> >> > >> Harald
> >> > >>
> >> > >>
> >> > >>
> >> >
> >> > --
> >> > You received this message because you are subscribed to the Google
> >> Groups "Prometheus Users" group.
> >> > To unsubscribe from this group and stop receiving emails from it, send
> >> an email to [email protected].
> >> > To view this discussion on the web visit
> >> https://groups.google.com/d/msgid/prometheus-users/10f18943-1285-4354-9f90-d67d92cca7a9%40googlegroups.com.
> >>
> >>
> >>
> >>
> >> --
> >> (o- Julien Pivotto
> >> //\ Open-Source Consultant
> >> V_/_ Inuits - https://www.inuits.eu
> >>
> >
>
--
(o- Julien Pivotto
//\ Open-Source Consultant
V_/_ Inuits - https://www.inuits.eu

