Hi all, 

I've been investigating Prometheus memory utilization over the last couple 
of days.

Based on *pprof* output, I see a lot of memory attributed to the *getOrSet* 
function, but according to the docs it is only used when creating new series, 
so I'm not sure what I can do about it.
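
(I assume the rate of new series creation can be sanity-checked with a query 
like rate(prometheus_tsdb_head_series_created_total[5m]) alongside 
prometheus_tsdb_head_series, but I haven't confirmed that churn is actually 
the issue here.)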


Pprof "top" output: 
https://pastebin.com/bAF3fGpN

Also, to figure out whether I have any metrics I can remove, I ran ./tsdb 
analyze on memory *(output here: https://pastebin.com/twsFiuRk)*

I did find some metrics with higher cardinality than others, but the 
difference was not massive.
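
The only mitigation I've sketched so far is dropping the worst offenders at 
scrape time with metric_relabel_configs, roughly like the snippet below (the 
job, target and metric name are just placeholders, not taken from my actual 
setup):

    scrape_configs:
      - job_name: 'node'                 # placeholder job
        static_configs:
          - targets: ['node1:9100']      # placeholder target
        metric_relabel_configs:
          # drop a hypothetical high-cardinality metric before it is stored
          - source_labels: [__name__]
            regex: 'some_high_cardinality_metric'
            action: drop

But given that no single metric really dominates, I'm not sure this would buy 
us much.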

With ~100 nodes, Prometheus uses around 15 GB of RAM.

We're getting an average of *8257 metrics per node*.

We estimate growing to around 200 nodes, which will send our RAM usage through 
the roof.
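
Rough math: 100 nodes x 8257 metrics/node is ~826k active series for ~15 GB, 
i.e. on the order of 18 KB of RAM per series (assuming each of those metrics 
is a single series), so 200 nodes would put us around 1.65M series and ~30 GB 
if it scales linearly.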

Apart from distributing the load over multiple Prometheus nodes, are there 
any alternatives?

TIA,
Shubham
