(Looks like my previous ask of this question got spamblocked because I 
included a screenshot.  c'est la vie.)

I have alerts for when a metric's value passes above or below a threshold.  
I can ask for the minimum or maximum over a time range, and I can ask for a 
prediction based on the slope of a graph.

I have some resources that I know will fail soon after their metrics 
fluctuate wildly over a short period of time.  They may never exceed 85% 
during their fluctuations, or they may exceed it briefly but not long 
enough to cause concern if the line were smooth.  E.g., if the samples 
over time were [30, 30, 31, 70, 5, 69, 6, 71, 5, 69, null, null, null], 
I want to detect the problem before the metric goes absent (because the 
resource crashed).

Setting the threshold at ">69" doesn't work because the value drops below 
the threshold on the next scrape, closing the alert; besides, a steady 69 
would be healthy.
Setting the threshold on "avg_over_time(metric[interval])" doesn't work 
because the average of an oscillating metric is well within the healthy 
range.
I thought of alerting on "max_over_time - min_over_time > 50", but that 
would also trigger on a smooth climb -- a false positive.
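To make those failure modes concrete, here is a quick sketch in plain Python (not PromQL; the smooth-climb series is made up for comparison) showing that the average of the oscillating series looks healthy, while a max-minus-min check fires on both series:

```python
import statistics

# The oscillating series from above (trailing nulls dropped), plus a
# hypothetical smooth climb with a similar overall range.
oscillating = [30, 30, 31, 70, 5, 69, 6, 71, 5, 69]
smooth_climb = [20, 26, 32, 38, 44, 50, 56, 62, 68, 74]

# avg_over_time analogue: the oscillating series averages 38.6, well
# inside the healthy range, so an average-based threshold never fires.
print(statistics.mean(oscillating))           # 38.6

# max_over_time - min_over_time analogue: fires on the oscillation
# (71 - 5 = 66 > 50) ...
print(max(oscillating) - min(oscillating))    # 66

# ... but also on the smooth climb (74 - 20 = 54 > 50): a false positive.
print(max(smooth_climb) - min(smooth_climb))  # 54
```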

What question should I ask Prometheus to detect a metric that vibrates 
too much?

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/dfdfc12b-767c-458a-b238-08d87cd3e7d1%40googlegroups.com.
