[prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-21 Thread Chad Sesvold

I was just tweaking the TSDB block durations to match prod.  I was reading 
that we might want to reduce the TSDB block durations to help free up 
memory.

We are seeing OOM in the system logs.  I am watching memory using the 
following command.

watch "ps ax -o pcpu,rss,ppid,pid,stime,args | grep 'prometheus/prometheus' 
| grep -v grep"
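
A couple of related checks (a rough sketch; this assumes the kernel log is 
readable and that the process is simply named "prometheus"):

# Confirm the OOM kills in the kernel log:
dmesg -T | grep -i -E 'out of memory|oom'
# Show resident (RSS) and virtual (VSZ) size side by side; mmap'd TSDB blocks
# inflate VSZ but not RSS:
ps -C prometheus -o pid,rss,vsz,args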

I am not sure what is meant by "node".  I was running the Go profiling tool 
described in https://source.coveo.com/2021/03/03/prometheus-memory/ and the 
output was from the go tool.

go tool pprof -symbolize=remote -inuse_space 
https://monitoring.prod.cloud.coveo.com/debug/pprof/heap
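
If the remote fetch is inconvenient, the heap profile can also be saved and 
inspected offline (a sketch; localhost:9090 stands in for whichever server is 
being profiled):

# Save the heap profile from the server, then open it locally;
# "top" at the interactive prompt lists the largest in-use allocations:
curl -s http://localhost:9090/debug/pprof/heap > heap.pprof
go tool pprof -sample_index=inuse_space heap.pprof
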
On Saturday, October 16, 2021 at 4:36:36 AM UTC-4 Brian Candler wrote:

> Is there a specific reason why you're tweaking the TSDB block durations?  
> That is, did you observe some problem with the defaults?  Otherwise I'd 
> suggest you just run with defaults.
>
> In any case, if the problem you're debugging is discrepancies between prod 
> and non-prod, you should be running with the same flags in both.
>
> On Friday, 15 October 2021 at 20:51:47 UTC+1 tass...@gmail.com wrote:
>
>> I did add some file targets back in non-prod for a total of 900 checks 
>> and prometheus leveled out at about 22 GB.
>>
>
> Not sure what you mean by "900 checks" here.  Do you mean targets? 
> Metrics?  Alerting rules?
>
> And how are you determining the total RAM usage? (If you're getting OOM 
> killer messages then you're definitely hitting the RAM limit.  It's worth 
> mentioning that older versions of go tended not to hand back memory to the 
> OS as aggressively, but they did mark the pages as reclaimable and the OS 
> would reclaim these when under memory pressure.  But recent prometheus 
> binaries should be built with a recent version of go - assuming you're 
> using the official release binaries and not ones you've compiled yourself)
>  
>
>> Prod - TSDB Status - Head Stats
>> Number of Series=2 million
>> Number of Chunks=11 million
>> Number of Label Pairs=59k
>> Current Min Time=2021-10-15T16:00:00.006Z (163431366)
>> Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)
>>
>> Non-Prod - TSDB Status - Head Stats
>> Number of Series=82k
>> Number of Chunks=400k
>> Number of Label Pairs=2k
>> Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
>> Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)
>>
>>
> That suggests the non-prod should be using a lot less RAM - the number of 
> head chunks in particular.
>  
>
>> Prod 
>> Showing nodes accounting for 3939.82MB, 73.70% of 5345.97MB total
>> Dropped 292 nodes (cum <= 26.73MB)
>>
>> Non-prod
>> Showing nodes accounting for 1.43GB, 91.54% of 1.56GB total
>> Dropped 133 nodes (cum <= 0.01GB)
>>
>
> What do you mean by "nodes" here?  And what are "dropped nodes"?
>
> I'm looking at prometheus 2.29.2 here, so maybe there are some new stats 
> in 2.30 that I can't see.
>



Re: [prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-21 Thread Chad Sesvold
So I have been doing a little more testing.  I did find that we had some 
software installed on the non-prod boxes that was causing some issues.  We 
were scraping metrics every 20 seconds.  My guess is that the software was 
slowing down prometheus writes and that I had a race condition of some 
kind: metrics were coming in at a higher rate than they could be written 
to the file system.  Once we disabled the software things seemed to 
stabilize, but only after deleting all of the data.

The weird part is that with the 45 days' worth of data there is still an 
issue starting prometheus with no targets.  I am wondering if prometheus 
was trying to update or convert the data store after going from 2.30.2 to 
2.30.3.  Prod is on version 2.30.0.  Then again, I have rolled back and 
patched so many times that it could have caused issues with the data store. 
On top of that, I am using NFS instead of a local file system.

prometheus, version 2.30.3 (branch: HEAD, revision: f29caccc42557f6a8ec30ea9b3c8c089391bd5df)
build user:   root@5cff4265f0e3
build date:   20211005-16:10:52
go version:   go1.17.1
platform: linux/amd64

There are a couple of quick questions I have before considering this issue 
resolved.  I know that I can run sort_desc(scrape_duration_seconds) to get 
the amount of time it takes to scrape each target, which is helpful for 
determining scrape intervals.  Is there a metric that I can look at to tell 
whether I am having a race condition when prometheus is writing metrics?  I 
am thinking there must be an easy way to rule out race conditions.
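
For reference, a few TSDB write-path metrics that might surface that kind of 
problem are slow WAL fsyncs, failed compactions, and WAL corruptions (a rough 
sketch, assuming the default port 9090; the exact metric set can vary between 
Prometheus versions):

# Read the write-path health metrics straight from the /metrics endpoint
# (no self-scrape job needed):
curl -s http://localhost:9090/metrics | \
  grep -E '^prometheus_tsdb_(wal_fsync_duration_seconds|compactions_failed_total|wal_corruptions_total)'
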
On Saturday, October 16, 2021 at 5:47:32 AM UTC-4 sup...@gmail.com wrote:

> On Fri, Oct 15, 2021 at 9:51 PM Chad Sesvold  wrote:
>
>> At this point I am the only one running queries.  When I have no targets 
>> defined the memory seems to be flat.
>>
>> When I changed the following in non-prod it seemed to stabilize the memory 
>> usage.
>>
>> --storage.tsdb.max-block-duration 15d
>> --storage.tsdb.min-block-duration 1h
>>
>
> These flags will actually make memory use worse. This will generate many 
> more TSDB blocks than normal, which will cause Prometheus to need more 
> memory to manage the indexes. However this is mostly needed for page cache 
> memory. See my next comment.
>  
>
>>
>> I will try copying the binaries and configs from prod to non-prod.  
>>
>> I am planning on looking at Thanos instead of an NFS mount.  That is 
>> going to take some time.
>>
>
> The retention and long-term storage in Prometheus has almost no effect on 
> RSS needed to run Prometheus. Prometheus only needs memory (RSS) to manage 
> the current 2 hours of data. After 2 hours, everything in memory is flushed 
> to disk and mapped-in using a technique called "mmap". This means disk 
> blocks are virtually mapped into memory (VSS). Then the Linux kernel uses 
> page cache to manage what data is loaded. You can have terabytes of data in 
> the TSDB and it only uses a small amount of RSS to manage the mappings.
>
> As Brian said, you need to look at go_memstats_alloc_bytes and 
> process_resident_memory_bytes for Prometheus. This will give you a better 
> idea on what is being used.
>  
>
>>
>> I did add some file targets back in non-prod for a total of 900 checks 
>> and prometheus leveled out at about 22 GB.
>>
>> Prod - TSDB Status - Head Stats
>> Number of Series=2 million
>> Number of Chunks=11 million
>> Number of Label Pairs=59k
>> Current Min Time=2021-10-15T16:00:00.006Z (163431366)
>> Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)
>>
>> Non-Prod - TSDB Status - Head Stats
>> Number of Series=82k
>> Number of Chunks=400k
>> Number of Label Pairs=2k
>> Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
>> Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)
>>
>>



Re: [prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-16 Thread Ben Kochie
On Fri, Oct 15, 2021 at 9:51 PM Chad Sesvold  wrote:

> At this point I am the only one running queries.  When I have no targets
> defined the memory seems to be flat.
>
> When I changed the following in non-prod it seemed to stabilize the memory
> usage.
>
> --storage.tsdb.max-block-duration 15d
> --storage.tsdb.min-block-duration 1h
>

These flags will actually make memory use worse. This will generate many
more TSDB blocks than normal, which will cause Prometheus to need more
memory to manage the indexes. However this is mostly needed for page cache
memory. See my next comment.
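
As a rough way to see that effect, counting block directories on each server 
shows how many blocks Prometheus is managing (a sketch; it assumes the data 
directory is ./data and relies on block directories being ULID-named, which 
currently start with "01"):

# Each ULID-named directory under the data path is one TSDB block; with
# min-block-duration=1h there will be many more of them than with the
# default 2h head blocks that get compacted into larger ranges.
ls -d data/01* | wc -l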


>
> I will try copying the binaries and configs from prod to non-prod.
>
> I am planning on looking at Thanos instead of an NFS mount.  That is going
> to take some time.
>

The retention and long-term storage in Prometheus has almost no effect on
RSS needed to run Prometheus. Prometheus only needs memory (RSS) to manage
the current 2 hours of data. After 2 hours, everything in memory is flushed
to disk and mapped-in using a technique called "mmap". This means disk
blocks are virtually mapped into memory (VSS). Then the Linux kernel uses
page cache to manage what data is loaded. You can have terabytes of data in
the TSDB and it only uses a small amount of RSS to manage the mappings.

As Brian said, you need to look at go_memstats_alloc_bytes and
process_resident_memory_bytes for Prometheus. This will give you a better
idea on what is being used.
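
One way to pull those numbers on each box without a query (a sketch, assuming 
the default port 9090); process_virtual_memory_bytes is included because, 
with the mmap behaviour described above, a large gap between virtual and 
resident size is expected and harmless:

# Go heap actually allocated, process RSS, and process virtual size:
curl -s http://localhost:9090/metrics | \
  grep -E '^(go_memstats_alloc_bytes|process_resident_memory_bytes|process_virtual_memory_bytes) '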


>
> I did add some file targets back in non-prod for a total of 900 checks
> and prometheus leveled out at about 22 GB.
>
> Prod - TSDB Status - Head Stats
> Number of Series=2 million
> Number of Chunks=11 million
> Number of Label Pairs=59k
> Current Min Time=2021-10-15T16:00:00.006Z (163431366)
> Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)
>
> Non-Prod - TSDB Status - Head Stats
> Number of Series=82k
> Number of Chunks=400k
> Number of Label Pairs=2k
> Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
> Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)
>
>



[prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-16 Thread Brian Candler
Is there a specific reason why you're tweaking the TSDB block durations?  
That is, did you observe some problem with the defaults?  Otherwise I'd 
suggest you just run with defaults.

In any case, if the problem you're debugging is discrepancies between prod 
and non-prod, you should be running with the same flags in both.

On Friday, 15 October 2021 at 20:51:47 UTC+1 tass...@gmail.com wrote:

> I did add some file targets back in non-prod for a total of 900 checks 
> and prometheus leveled out at about 22 GB.
>

Not sure what you mean by "900 checks" here.  Do you mean targets? 
Metrics?  Alerting rules?

And how are you determining the total RAM usage? (If you're getting OOM 
killer messages then you're definitely hitting the RAM limit.  It's worth 
mentioning that older versions of go tended not to hand back memory to the 
OS as aggressively, but they did mark the pages as reclaimable and the OS 
would reclaim these when under memory pressure.  But recent prometheus 
binaries should be built with a recent version of go - assuming you're 
using the official release binaries and not ones you've compiled yourself)
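
The Go runtime metrics expose that split directly, if it helps to check (a 
sketch, assuming the default port 9090):

# heap_inuse is memory the Go heap is actively using; heap_idle is held by
# the runtime but currently unused; heap_released is the part already handed
# back to the OS (older Go versions were slower to grow this number):
curl -s http://localhost:9090/metrics | \
  grep -E '^go_memstats_heap_(inuse|idle|released)_bytes '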
 

> Prod - TSDB Status - Head Stats
> Number of Series=2 million
> Number of Chunks=11 million
> Number of Label Pairs=59k
> Current Min Time=2021-10-15T16:00:00.006Z (163431366)
> Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)
>
> Non-Prod - TSDB Status - Head Stats
> Number of Series=82k
> Number of Chunks=400k
> Number of Label Pairs=2k
> Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
> Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)
>
>
That suggests the non-prod should be using a lot less RAM - the number of 
head chunks in particular.
 

> Prod 
> Showing nodes accounting for 3939.82MB, 73.70% of 5345.97MB total
> Dropped 292 nodes (cum <= 26.73MB)
>
> Non-prod
> Showing nodes accounting for 1.43GB, 91.54% of 1.56GB total
> Dropped 133 nodes (cum <= 0.01GB)
>

What do you mean by "nodes" here?  And what are "dropped nodes"?

I'm looking at prometheus 2.29.2 here, so maybe there are some new stats in 
2.30 that I can't see.



[prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-15 Thread Chad Sesvold
At this point I am the only one running queries.  When I have no targets 
defined the memory seems to be flat.

When I changed the following in non-prod it seemed to stabilize the memory 
usage.

--storage.tsdb.max-block-duration 15d
--storage.tsdb.min-block-duration 1h

I will try copying the binaries and configs from prod to non-prod.  

I am planning on looking at Thanos instead of an NFS mount.  That is going 
to take some time.

I did add some file targets back in non-prod for a total of 900 checks 
and prometheus leveled out at about 22 GB.

Prod - TSDB Status - Head Stats
Number of Series=2 million
Number of Chunks=11 million
Number of Label Pairs=59k
Current Min Time=2021-10-15T16:00:00.006Z (163431366)
Current Max Time=2021-10-15T18:51:07.414Z (1634323867414)

Non-Prod - TSDB Status - Head Stats
Number of Series=82k
Number of Chunks=400k
Number of Label Pairs=2k
Current Min Time=2021-10-15T18:05:27.705Z (1634321127705)
Current Max Time=2021-10-15T18:50:58.200Z (1634323858200)

Prod 
Showing nodes accounting for 3939.82MB, 73.70% of 5345.97MB total
Dropped 292 nodes (cum <= 26.73MB)

Non-prod
Showing nodes accounting for 1.43GB, 91.54% of 1.56GB total
Dropped 133 nodes (cum <= 0.01GB)
On Friday, October 15, 2021 at 10:50:24 AM UTC-4 Brian Candler wrote:

> Look at Status > TSDB Status from the web interface of both systems.  In 
> particular, what does the first entry ("Head Stats") show for each system?
>
> Do you have any idea of series churn, i.e. how many new series are being 
> created and deleted per hour?  (Although if you're scraping a subset of the 
> same targets on non-prod, then it shouldn't be any worse)
>
> Prometheus exposes stats about its internal memory usage (go_memstats_*), 
> can you see any difference between the two systems here?
>
> Are you hitting the non-production system with queries?  If so, can you 
> try not querying it for a while?
>
> Otherwise, you can try replicating the production system *exactly* in the 
> non-production one: same binaries, same configuration, same retention.  If 
> it works differently then it's something about the environment.
>
> I observe that NAS is *not* recommended as a storage backend for 
> prometheus.  See 
> https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects 
> (scroll to the yellow "CAUTION" box)
>
> On Friday, 15 October 2021 at 15:02:33 UTC+1 tass...@gmail.com wrote:
>
>> We have been running prometheus for 3 or 4 years now.  In production we 
>> have 6 month retention and in non-production we have a retention of 45 
>> days.  In production we are capturing 1.8 million metrics with 2,300 
>> targets.  In non-prod we are capturing 800K metrics with 2,200 targets. 
>> The configuration is the same between the environments.  Both production 
>> and non-prod servers have 4 CPU and 24 GB of memory.  Production is using 
>> 160% CPU and 5.5 GB of memory.  Non-prod is running out of memory even 
>> after increasing the server memory to 64 GB.  This seemed to happen after 
>> patching non-prod to 2.30.3.  Production is on 2.30.0.  We are using NAS 
>> storage.  Non-prod has 500 GB and production has 4 TB of storage.
>>
>> I have been doing several tests in non-production to isolate the issue, to 
>> see if it is an issue with the number of targets or the storage.  I have 
>> tried reducing the targets and the retention time.  The results seem to be 
>> the same between 2.30.3 and 2.30.0.
>>
>> prometheus-2.30.3
>> 53 MB  - no targets, clean storage
>> 41 GB  - no targets, storage history
>> 5.5 GB - targets, clean storage
>> 42 GB  - targets, storage history
>>
>> prometheus-2.30.0
>> 2 MB   - no targets, clean storage
>> 50 GB  - no targets, storage history
>> 4 GB   - targets, clean storage
>> 47 GB  - targets, storage history
>>
>> With less retention than production, non-prod with no targets is using 
>> 10x the memory of production even on the same hardware.  After adding 
>> targets, even with no history, the memory increases in non-prod until the 
>> OS kills prometheus due to out of memory.  I have increased the server 
>> from 24 GB to 32 GB to 64 GB and prometheus memory never stabilizes.  I 
>> have tried removing targets and it does seem to help.
>>
>> There appears to be some sort of memory leak, but it is never aliens 
>> until it is aliens.  We are scraping most metrics every 15 seconds in 
>> production and have changed non-prod to every 30 seconds with the same 
>> results.  We are using consul for service discovery.  Not sure what else 
>> to look at.  Any suggestions on what to look at next?
>>
>> This is my first time posting, so I figured I would ask the community 
>> rather than submitting a bug on GitHub.
>>
>>
>>


[prometheus-users] Re: Prometheus using a large amount of memory when managing storage.

2021-10-15 Thread Brian Candler
Look at Status > TSDB Status from the web interface of both systems.  In 
particular, what does the first entry ("Head Stats") show for each system?

Do you have any idea of series churn, i.e. how many new series are being 
created and deleted per hour?  (Although if you're scraping a subset of the 
same targets on non-prod, then it shouldn't be any worse)
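
If it's useful, the head's own counters give a rough churn number (a sketch, 
assuming the default port 9090; these are cumulative counters, so compare two 
readings taken an hour apart, or rate() them if the server scrapes itself):

# Cumulative counts of series created in / removed from the head block:
curl -s http://localhost:9090/metrics | \
  grep -E '^prometheus_tsdb_head_series_(created|removed)_total'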

Prometheus exposes stats about its internal memory usage (go_memstats_*), 
can you see any difference between the two systems here?

Are you hitting the non-production system with queries?  If so, can you try 
not querying it for a while?

Otherwise, you can try replicating the production system *exactly* in the 
non-production one: same binaries, same configuration, same retention.  If 
it works differently then it's something about the environment.

I observe that NAS is *not* recommended as a storage backend for 
prometheus.  
See https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects 
(scroll to the yellow "CAUTION" box)

On Friday, 15 October 2021 at 15:02:33 UTC+1 tass...@gmail.com wrote:

> We have been running prometheus for 3 or 4 years now.  In production we 
> have 6 month retention and in non-production we have a retention of 45 
> days.  In production we are capturing 1.8 million metrics with 2,300 
> targets.  In non-prod we are capturing 800K metrics with 2,200 targets. 
> The configuration is the same between the environments.  Both production 
> and non-prod servers have 4 CPU and 24 GB of memory.  Production is using 
> 160% CPU and 5.5 GB of memory.  Non-prod is running out of memory even 
> after increasing the server memory to 64 GB.  This seemed to happen after 
> patching non-prod to 2.30.3.  Production is on 2.30.0.  We are using NAS 
> storage.  Non-prod has 500 GB and production has 4 TB of storage.
>
> I have been doing several tests in non-production to isolate the issue, to 
> see if it is an issue with the number of targets or the storage.  I have 
> tried reducing the targets and the retention time.  The results seem to be 
> the same between 2.30.3 and 2.30.0.
>
> prometheus-2.30.3
> 53 MB  - no targets, clean storage
> 41 GB  - no targets, storage history
> 5.5 GB - targets, clean storage
> 42 GB  - targets, storage history
>
> prometheus-2.30.0
> 2 MB   - no targets, clean storage
> 50 GB  - no targets, storage history
> 4 GB   - targets, clean storage
> 47 GB  - targets, storage history
>
> With less retention than production, non-prod with no targets is using 
> 10x the memory of production even on the same hardware.  After adding 
> targets, even with no history, the memory increases in non-prod until the 
> OS kills prometheus due to out of memory.  I have increased the server 
> from 24 GB to 32 GB to 64 GB and prometheus memory never stabilizes.  I 
> have tried removing targets and it does seem to help.
>
> There appears to be some sort of memory leak, but it is never aliens 
> until it is aliens.  We are scraping most metrics every 15 seconds in 
> production and have changed non-prod to every 30 seconds with the same 
> results.  We are using consul for service discovery.  Not sure what else 
> to look at.  Any suggestions on what to look at next?
>
> This is my first time posting, so I figured I would ask the community 
> rather than submitting a bug on GitHub.
>
>
>
