On 22 Nov 12:15, Tim Schwenke wrote:

> @All, I have switched from EFS to EBS (so no more NFS) and the head
> chunks have stopped growing. So it really seems to have something to do
> with that, even though AWS's EFS NFS is supposed to be completely POSIX
> compliant. Weird.

There may be a large range of issues, such as client-side caching, coming
into play with network storage.
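For anyone hitting the same symptom, Prometheus's own TSDB metrics show
directly whether head compaction is keeping up. A minimal sketch using
standard 2.x self-monitoring metric names; nothing here is specific to
this particular setup:

    # Chunks held in the in-memory head block; this should drop back
    # roughly every two hours, when the head is compacted into a
    # persisted block
    prometheus_tsdb_head_chunks

    # Compaction activity and failures over the last day; failures, or
    # no activity at all, would explain a head that only ever grows
    increase(prometheus_tsdb_compactions_total[1d])
    increase(prometheus_tsdb_compactions_failed_total[1d])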
> @[email protected], I see, thanks. Your post is also helpful for getting
> alert rules right.
>
> [email protected] wrote on Monday, 2 November 2020 at 09:47:39 UTC+1:
>
>> Hi,
>>
>> That's expected with applications built with newer Go versions. On
>> Linux they will "reserve" whatever memory is available on the host,
>> but that memory might not actually be allocated. It may be memory
>> that the kernel can take back whenever it needs it, so those cases
>> will not trigger OOMs. To see what is actually happening, look at the
>> working-set metric instead of RSS.
>>
>> See https://www.bwplotka.dev/2019/golang-memory-monitoring/ for
>> details 🤗
>>
>> On Sunday, 1 November 2020 at 23:29:13 UTC+1, [email protected]
>> wrote:
>>
>>> The currently latest version, v2.22.0.
>>>
>>> [email protected] wrote on Thursday, 29 October 2020 at 21:12:00
>>> UTC+1:
>>>
>>>> We see something similar for recent versions of Prometheus, but
>>>> something else might be causing it. What version are you at?
>>>>
>>>> On Thursday, 29 October 2020, Tim Schwenke <[email protected]>
>>>> wrote:
>>>>
>>>>> I think I will just bite the bullet and move my Promstack to a
>>>>> dedicated single-node AWS ECS cluster; this way I can use EBS for
>>>>> persistent storage.
>>>>>
>>>>> Tim Schwenke wrote on Monday, 26 October 2020 at 14:10:49 UTC+1:
>>>>>
>>>>>> Thanks for your answers, @mierzwa and Ben.
>>>>>>
>>>>>> @mierzwa, I have checked the Robust Perception blog post you
>>>>>> linked, and it really does look like my Prometheus is using too
>>>>>> much memory. I have only around 20k series, but with quite high
>>>>>> cardinality over a long period of time. If we just look at a few
>>>>>> hours or days, the churn should be very low / zero, and it only
>>>>>> increases whenever an API endpoint gets called for the first
>>>>>> time, and so on.
>>>>>>
>>>>>> @Ben, this Prometheus is using AWS EFS for storage, so the disk
>>>>>> size should be unlimited. It currently sits at just 13 GB, while
>>>>>> resident size is around 2 GB and virtual memory around 7 GB. The
>>>>>> node Prometheus is running on has 8 GB of RAM. Now that you have
>>>>>> mentioned storage: I read quite some time ago that Prometheus
>>>>>> does not officially support NFS, and I think the AWS EFS volume
>>>>>> is mounted into the node via NFS.
>>>>>>
>>>>>> ---------------------------------------------------------
>>>>>>
>>>>>> Here is the info I get from the Prometheus web interface:
>>>>>>
>>>>>> https://trallnag.htmlsave.net/
>>>>>>
>>>>>> This links to an HTML document that I saved.
>>>>>>
>>>>>> [email protected] wrote on Monday, 26 October 2020 at 12:57:58
>>>>>> UTC+1:
>>>>>>
>>>>>>> It looks like the head chunks have been growing without
>>>>>>> compacting. Is the disk full? What's in the logs? What's in the
>>>>>>> data directory?
>>>>>>>
>>>>>>> On Mon, Oct 26, 2020 at 11:45 AM Tim Schwenke
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Is it expected that Prometheus will take all the memory it gets
>>>>>>>> over time? I'm running Prometheus in a very "stable" environment
>>>>>>>> with about 200 containers / targets, in addition to cAdvisor
>>>>>>>> and Node Exporter, over 3 to 4 nodes. And now, after a few weeks
>>>>>>>> of uptime, Prometheus has reached the memory limits and I get
>>>>>>>> memory limit hit events all the time.
>>>>>>>>
>>>>>>>> Here are screenshots of relevant dashboards:
>>>>>>>>
>>>>>>>> https://github.com/trallnag/random-data/issues/1#issuecomment-716463651
>>>>>>>>
>>>>>>>> Could something be wrong with my setup, or is this all OK? The
>>>>>>>> performance of queries in Grafana etc. is not impacted.
>>>>
>>>> --
>>>> David J. M. Karlsen - http://www.linkedin.com/in/davidkarlsen

--
Julien Pivotto
@roidelapluie
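Following up on the working-set suggestion in the quoted thread: with
cAdvisor already scraping these nodes, queries along the following lines
separate the memory the kernel would actually have to reclaim from the
much larger RSS / virtual numbers. This is only a sketch; the {name=~...}
matcher is a guess at how the Prometheus container is labelled in this
setup, and the job="prometheus" selector assumes Prometheus scrapes
itself under that job name.

    # The working set is roughly what container limits and the OOM
    # killer act on
    container_memory_working_set_bytes{name=~".*prometheus.*"}

    # RSS can look much larger, because it still counts memory the Go
    # runtime has marked as reclaimable but the kernel has not yet
    # taken back
    container_memory_rss{name=~".*prometheus.*"}

    # Prometheus's own Go runtime metrics tell the same story from the
    # inside: heap actually in use versus resident size
    go_memstats_heap_inuse_bytes{job="prometheus"}
    process_resident_memory_bytes{job="prometheus"}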

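On the cardinality and churn point raised by @mierzwa further up: a
handful of ad-hoc queries makes it easy to confirm that 20k series with
low churn really is the whole story. These use standard TSDB metric
names plus the usual "biggest metrics" query; nothing setup-specific is
assumed.

    # Series currently held in the head block
    prometheus_tsdb_head_series

    # Which metric names contribute the most series
    topk(10, count by (__name__)({__name__=~".+"}))

    # Rough churn: how quickly new series are being created
    rate(prometheus_tsdb_head_series_created_total[1h])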

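And since alert rules came up as well: for the "memory limit hit" events
from the original question, an expression along these lines alerts on
real memory pressure rather than on raw RSS. It is a sketch built on the
cAdvisor metrics already being scraped here; depending on the cAdvisor
version, the two metrics may need an on()/ignoring() clause to join
cleanly.

    # Working set as a fraction of the configured container limit; the
    # "> 0" filter drops containers without a limit, for which cAdvisor
    # reports 0 and the division would otherwise return +Inf
    container_memory_working_set_bytes{name!=""}
      / (container_spec_memory_limit_bytes{name!=""} > 0)
      > 0.9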