Hello Christian,
On Thursday, July 2, 2020 at 7:32:30 AM UTC-5, Christian Hoffmann wrote:
>
> Hi,
>
> On 7/2/20 2:17 AM, LabTest Diagnostics wrote:
> > I've written some alerts for memory usage (for windows nodes) that look
> > like this:
> >
> >     expr: 100*(windows_os_physical_memory_free_bytes)/(windows_cs_physical_memory_bytes)<70
> >
> > Currently, any server that exceeds 70% of available mem should give us
> > an alert. This doesn't work for me as there are some nodes that
> > consistently clock over 80% of the memory.
> >
> > Is there a way to specify the threshold levels for alerts on a
> > per-instance basis?
>
> Yes, you can use time series as thresholds:
> https://www.robustperception.io/using-time-series-as-alert-thresholds
>
> Kind regards
> Christian
>
For my use case, does the following look right? I need more clarity on what
"something" stands for in the alert block.
groups:
- name: MemoryAlert
  rules:
  - record: Memory_Usage_Too_High
    expr: 100*(windows_os_physical_memory_free_bytes)/(windows_cs_physical_memory_bytes)<90
    labels:
      instance: Server1, Server2
  - alert: MemoryUsageTooHigh
    expr: |
      # Alert based on per-team thresholds.
        something # what is this?
      > on (instance) group_left
        (
            Memory_Usage_Too_High
          or on (instance)
            # For all other instances/servers, memory usage shouldn't exceed 70%.
            count by (instance)(something) * 0 + 70
        )
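
Going by the pattern in the linked article, "something" is presumably the
value being measured (here, memory usage in percent), and the recorded series
would hold only the threshold number, one rule per instance, rather than a
comparison. A minimal sketch under those assumptions follows. The rule names
instance:windows_memory_usage:percent and
instance:windows_memory_usage_too_high:threshold are made up for illustration,
and Server1/Server2 stand in for whatever the instance label actually contains
(often host:port). It also assumes the intent is to alert on used memory, so
the free-bytes ratio is inverted.

```yaml
groups:
- name: MemoryAlert
  rules:
  # Hypothetical rule name: the measured value, used memory in percent.
  - record: instance:windows_memory_usage:percent
    expr: 100 * (1 - windows_os_physical_memory_free_bytes / windows_cs_physical_memory_bytes)

  # Hypothetical rule name: one threshold series per instance that needs an override.
  # A label holds a single value, so "Server1, Server2" would not match either server.
  - record: instance:windows_memory_usage_too_high:threshold
    expr: 90
    labels:
      instance: Server1   # assumed label value
  - record: instance:windows_memory_usage_too_high:threshold
    expr: 90
    labels:
      instance: Server2   # assumed label value

  - alert: MemoryUsageTooHigh
    expr: |
      # Alert on per-instance thresholds, falling back to 70% for all other instances.
        instance:windows_memory_usage:percent
      > on (instance) group_left
        (
            instance:windows_memory_usage_too_high:threshold
          or on (instance)
            count by (instance) (instance:windows_memory_usage:percent) * 0 + 70
        )
```

In this reading, the count by (instance) (...) * 0 + 70 branch produces a
series with the value 70 for every instance that reports memory usage, and
the or keeps it only where no explicit threshold series exists.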
Thank you!