During compaction 
<https://prometheus.io/docs/prometheus/latest/storage/#compaction>, 
Prometheus had a memory spike from 5GB to 20GB according to the metric 
*process_resident_memory_bytes*. Since the Prometheus container has a 
memory limit of 21GB (the VM has 23GB of RAM), it was almost OOM-killed.
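
To put numbers on "almost", here is a rough sketch of the headroom 
arithmetic using the figures above. It assumes "GB" means GiB (2**30 
bytes) and that the quoted values are exact, which they are not; the 
function name is only illustrative.

```python
# Headroom arithmetic for the figures quoted above.
# Assumption: "GB" is treated as GiB (2**30 bytes); the real values are approximate.
GIB = 2**30


def headroom(peak_bytes, limit_bytes):
    """Return (bytes left before the limit, fraction of the limit in use)."""
    return limit_bytes - peak_bytes, peak_bytes / limit_bytes


peak = 20 * GIB             # peak process_resident_memory_bytes during compaction
container_limit = 21 * GIB  # memory limit on the Prometheus container

left, used = headroom(peak, container_limit)
print(f"{left / GIB:.0f} GiB left, {used:.1%} of the limit in use")
```

So at the peak only about 1GB separated the process from the container 
limit.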

Full details about the issue (spike graphs and environment details) can be 
found at https://github.com/prometheus/prometheus/issues/8357

*Questions:*
1. Is *process_resident_memory_bytes* [link 
<https://prometheus.io/docs/prometheus/1.8/storage/#helpful-metrics>] the 
right metric to monitor and alert on for Prometheus memory, or should we 
use the k8s pod metric *container_memory_working_set_bytes* [link 
<https://blog.freshtracks.io/a-deep-dive-into-kubernetes-metrics-part-3-container-resource-metrics-361c5ee46e66>] 
of the Prometheus pod?

2. Can the *process_resident_memory_bytes* metric go above the VM's physical 
RAM, or will the OOM kill hit first? (Is any of the memory counted in this 
metric evictable by the kernel to avoid an OOM kill?)

3. Prometheus compaction may cause huge memory spikes (as described above 
and in issue1 <https://github.com/prometheus/prometheus/issues/4110> 
and issue2 <https://github.com/prometheus/prometheus/issues/6184>). *Is 
there a way to tune Prometheus to avoid such huge spikes during compaction* 
(e.g. by tuning Prometheus settings or increasing the instance RAM)?
    
Thanks 
Shay
