We see something similar with recent versions of Prometheus, but something else might be causing it. Which version are you running?
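For reference, the version can be checked without opening the web UI. A minimal sketch, assuming a default install with the server listening on its standard address `localhost:9090` (adjust for your setup):

```shell
# Print the version of the installed binary:
prometheus --version

# Or ask a running server via its own build-info metric:
curl -s 'http://localhost:9090/api/v1/query?query=prometheus_build_info'
```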
On Thursday, 29 October 2020, Tim Schwenke <[email protected]> wrote:

> I think I will just bite the bullet and move my Prometheus stack to a
> dedicated single-node AWS ECS cluster; that way I can use EBS for
> persistent storage.
>
> On Monday, 26 October 2020 at 14:10:49 UTC+1, Tim Schwenke wrote:
>
>> Thanks for your answers, @mierzwa and Ben.
>>
>> @mierzwa, I have checked the Robust Perception blog post you linked, and
>> it really does seem like my Prometheus is using too much memory. I have
>> only around 20k series, but with quite high cardinality over a long
>> period of time. Looking at just a few hours or days, the churn should be
>> very low or zero, and should only increase whenever an API endpoint gets
>> called for the first time, and so on.
>>
>> @Ben, Prometheus is using AWS EFS for storage, so disk size should be
>> effectively unlimited. It currently sits at just 13 GB, while resident
>> memory is around 2 GB and virtual memory around 7 GB. The node Prometheus
>> is running on has 8 GB of RAM. Now that you mention storage: I read quite
>> some time ago that Prometheus does not officially support NFS, and I
>> think the AWS EFS volume is mounted into the node via NFS.
>>
>> ---------------------------------------------------------
>>
>> Here is the info I get from the Prometheus web interface (this links to
>> an HTML document that I saved):
>>
>> https://trallnag.htmlsave.net/
>>
>> On Monday, 26 October 2020 at 12:57:58 UTC+1, [email protected]
>> wrote:
>>
>>> It looks like the head chunks have been growing without compacting. Is
>>> the disk full? What's in the logs? What's in the data directory?
>>>
>>> On Mon, Oct 26, 2020 at 11:45 AM Tim Schwenke <[email protected]>
>>> wrote:
>>>
>>>> Is it expected that Prometheus will take all the memory it gets over
>>>> time? I'm running Prometheus in a very "stable" environment with about
>>>> 200 containers/targets, plus cAdvisor and Node Exporter, across 3 to
>>>> 4 nodes. And now, after a few weeks of uptime, Prometheus has reached
>>>> its memory limits and I get memory-limit-hit events all the time.
>>>>
>>>> Here are screenshots of the relevant dashboards:
>>>>
>>>> https://github.com/trallnag/random-data/issues/1#issuecomment-716463651
>>>>
>>>> Could something be wrong with my setup, or is this all OK? The
>>>> performance of queries in Grafana etc. is not impacted.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Prometheus Users" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/prometheus-users/f89538c2-c07b-424d-a377-614f0098a337n%40googlegroups.com.

--
David J. M. Karlsen - http://www.linkedin.com/in/davidkarlsen

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CAGO7Ob3bHH6G5P2sJKWOQ1XevTbk2v2wJPd7uHLhpXmdyzxSuw%40mail.gmail.com.
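For anyone following this thread: the head size, churn, and process memory discussed above can be inspected directly from Prometheus's own metrics, assuming the server scrapes itself (the default in most setups). These are standard TSDB self-metrics, not anything specific to this environment:

```promql
# Number of series currently held in the in-memory head block:
prometheus_tsdb_head_series

# Rate of new series creation over the last hour -- a rough churn signal:
rate(prometheus_tsdb_head_series_created_total[1h])

# Resident memory of the Prometheus process itself:
process_resident_memory_bytes{job="prometheus"}
```

If your promtool build includes the tsdb subcommand, `promtool tsdb analyze` run against the data directory can also break cardinality down per metric and label, which helps locate the labels responsible for churn.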

