Hi,

We are seeking to monitor nearly a thousand rapidly expanding Postgres databases
using Prometheus. Currently we have divided the targets across two Prometheus
instances. One instance monitors the `pg_up` metric with instance labels only,
with the remaining Postgres and Operator metrics disabled. However, we have
noticed a significant increase in memory usage as we add more targets.
`go tool pprof` shows that the majority of memory consumption comes from the
`labels.(*Builder).Labels` function.
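
(For reference, a heap profile like the one described above can be pulled and
summarised with something along these lines; the pod name and port below are
placeholders rather than the exact setup in this thread:)

  # Forward the shard's web port, then fetch the in-use heap profile and
  # print the top consumers by in-use space:
  $ kubectl port-forward pod/prom-shard-0 9090:9090 &
  $ go tool pprof -sample_index=inuse_space -top http://localhost:9090/debug/pprof/heap
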
The measurements show memory usage growing much faster than linearly as we add
targets, with a large portion of the memory consumed by labels. For example,
with 2091 time series and 360 label pairs, memory usage has reached 8028 MiB,
of which 4392 MiB is consumed by label memory. We are unsure if this is normal
behaviour for Prometheus. Here are the measurement values:

SMons   Memory Used   PProf Label Memory   Series   Chunks   Label Pairs
  0        45 MiB     -                         0        0         0
  1        64 MiB     -                         9        9        13
  2        67 MiB     0.5 MiB (12%)            15       15        14
  5        80 MiB     6.2 MiB (19%)            33       33        17
 10       103 MiB     10 MiB (25%)             63       63        22
 15       123 MiB     20 MiB (39%)             93       93        27
 20       130 MiB     25 MiB (40%)            123      123        32
 30       189 MiB     30 MiB (42%)            183      183        42
 46       297 MiB     55 MiB (48%)            273      273        57
348      8028 MiB     4392 MiB (82%)         2091     2091       360

These were measured using `kubectl top pods` and
`go tool pprof https://prom-shard/debug/pprof/heap`.

The second instance, which we use for comparison, is currently using
approximately 9981 MiB. Here are its measurement values:

SMons   Memory Used   PProf Label Memory   Series    Chunks    Label Pairs
 77      9981 MiB     728 MiB (17%)        1124830   2252751        47628

Here it makes sense where the memory is being consumed, as there is a large
number of label pairs and time series in the head.

We would appreciate recommendations on the best way to set up Prometheus for
this scenario. Is this expected behaviour for Prometheus?

Thanks,
Omero

On Tuesday, 24 January 2023 at 23:44:34 UTC+13 Victor Hadianto wrote:

>> Also, what version(s) of prometheus are these two instances?
>
> They are both the same:
> prometheus, version 2.37.0 (branch: HEAD, revision:
> b41e0750abf5cc18d8233161560731de05199330)
>
>> The RAM usage of Prometheus depends on a number of factors. There's a
>> calculator embedded in this article, but it's pretty old now:
>> https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion
>
> Thanks for this, I'll read & play around with that calculator for our
> Prometheus instances (we have 9 in various clusters now).
>
> Regards,
> Victor
>
> On Tue, 24 Jan 2023 at 21:03, Brian Candler <b.ca...@pobox.com> wrote:
>
>> Also, what version(s) of prometheus are these two instances? Different
>> versions of Prometheus are compiled using different versions of Go, which
>> in turn have different degrees of aggressiveness in returning unused RAM
>> to the operating system. Also remember Go is a garbage-collected language.
>>
>> The RAM usage of Prometheus depends on a number of factors. There's a
>> calculator embedded in this article, but it's pretty old now:
>> https://www.robustperception.io/how-much-ram-does-prometheus-2-x-need-for-cardinality-and-ingestion
>>
>> On Tuesday, 24 January 2023 at 09:29:47 UTC sup...@gmail.com wrote:
>>
>>> When you say "measured by Kubernetes", what metric specifically?
>>>
>>> There are several misleading metrics. What matters is
>>> `container_memory_rss` or `container_memory_working_set_bytes`. The
>>> `container_memory_usage_bytes` metric is misleading because it includes
>>> page cache values.
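
(As an aside on the metrics mentioned above: if one of these Prometheus
instances already scrapes cAdvisor/kubelet metrics, the working set and RSS of
a given Prometheus pod can be compared with something like the following; the
server URL, pod and container names are placeholders:)

  $ promtool query instant http://prometheus:9090 \
      'container_memory_working_set_bytes{pod="prom-shard-0",container="prometheus"}'
  $ promtool query instant http://prometheus:9090 \
      'container_memory_rss{pod="prom-shard-0",container="prometheus"}'

(As far as I know, `kubectl top pods`, which was used for the numbers earlier
in this thread, reports the working set rather than the page-cache-inflated
`container_memory_usage_bytes`.)
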
>>> On Tue, Jan 24, 2023 at 10:20 AM Victor H <vhad...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are running multiple Prometheus instances in Kubernetes (deployed
>>>> using Prometheus Operator) and hope that someone can help us understand
>>>> why the RAM usage of a few of our instances is unexpectedly high (we
>>>> think it's cardinality but are not sure where to look).
>>>>
>>>> In Prometheus A, we have the following stats:
>>>>
>>>> Number of Series: 56486
>>>> Number of Chunks: 56684
>>>> Number of Label Pairs: 678
>>>>
>>>> tsdb analyze has the following result:
>>>>
>>>> /bin $ ./promtool tsdb analyze /prometheus/
>>>> Block ID: 01GQGMKZAF548DPE2DFZTF1TRW
>>>> Duration: 1h59m59.368s
>>>> Series: 56470
>>>> Label names: 26
>>>> Postings (unique label pairs): 678
>>>> Postings entries (total label pairs): 338705
>>>>
>>>> This instance uses roughly between 4 GB and 5 GB of RAM (measured by
>>>> Kubernetes).
>>>>
>>>> From our reading, each time series should use around 8 kB of RAM, so
>>>> 56k series should be using a mere ~500 MB.
>>>>
>>>> On a different Prometheus instance (let's call it Prometheus Central)
>>>> we have 1.1M series and it's using 9-10 GB, which is roughly what is
>>>> expected.
>>>>
>>>> We're curious about this instance and we believe it's cardinality. We
>>>> have a lot more targets in Prometheus A. I also note that the postings
>>>> entries (total label pairs) figure is 338k, but I'm not sure where to
>>>> look for this.
>>>>
>>>> The top entries from tsdb analyze are right at the bottom of this post.
>>>> The "most common label pairs" entries have alarmingly high counts; I
>>>> wonder if this contributes to the high "total label pairs" and
>>>> consequently the higher than expected RAM usage.
>>>>
>>>> When calculating the expected RAM usage, is "total label pairs" the
>>>> number we need to use rather than "total series"?
>>>>
>>>> Thanks,
>>>> Victor
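
(As a rough cross-check of the ~8 kB per series rule of thumb above, an
instance's bytes-per-head-series ratio can be read from its own metrics,
assuming it scrapes itself; the server URL below is a placeholder:)

  # Number of series in the head, and resident memory per head series:
  $ promtool query instant http://prometheus-a:9090 'prometheus_tsdb_head_series'
  $ promtool query instant http://prometheus-a:9090 \
      'process_resident_memory_bytes / prometheus_tsdb_head_series'
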
>>>> Label pairs most involved in churning:
>>>> 296 activity_type=none
>>>> 258 workflow_type=PodUpdateWorkflow
>>>> 163 __name__=temporal_request_latency_bucket
>>>> 104 workflow_type=GenerateSPVarsWorkflow
>>>> 95 operation=RespondActivityTaskCompleted
>>>> 89 __name__=temporal_activity_execution_latency_bucket
>>>> 89 __name__=temporal_activity_schedule_to_start_latency_bucket
>>>> 65 workflow_type=PodInitWorkflow
>>>> 53 operation=RespondWorkflowTaskCompleted
>>>> 49 __name__=temporal_workflow_endtoend_latency_bucket
>>>> 49 __name__=temporal_workflow_task_schedule_to_start_latency_bucket
>>>> 49 __name__=temporal_workflow_task_execution_latency_bucket
>>>> 49 __name__=temporal_workflow_task_replay_latency_bucket
>>>> 39 activity_type=UpdatePodConnectionsActivity
>>>> 38 le=+Inf
>>>> 38 le=0.02
>>>> 38 le=0.1
>>>> 38 le=0.001
>>>> 38 activity_type=GenerateSPVarsActivity
>>>> 38 le=5
>>>>
>>>> Label names most involved in churning:
>>>> 734 __name__
>>>> 734 job
>>>> 724 instance
>>>> 577 activity_type
>>>> 577 workflow_type
>>>> 541 le
>>>> 177 operation
>>>> 95 datname
>>>> 53 datid
>>>> 31 mode
>>>> 29 namespace
>>>> 21 state
>>>> 12 quantile
>>>> 11 container
>>>> 11 service
>>>> 11 pod
>>>> 11 endpoint
>>>> 10 scrape_job
>>>> 4 alertname
>>>> 4 severity
>>>>
>>>> Most common label pairs:
>>>> 23012 activity_type=none
>>>> 20060 workflow_type=PodUpdateWorkflow
>>>> 12712 __name__=temporal_request_latency_bucket
>>>> 8092 workflow_type=GenerateSPVarsWorkflow
>>>> 7440 operation=RespondActivityTaskCompleted
>>>> 6944 __name__=temporal_activity_execution_latency_bucket
>>>> 6944 __name__=temporal_activity_schedule_to_start_latency_bucket
>>>> 5100 workflow_type=PodInitWorkflow
>>>> 4140 operation=RespondWorkflowTaskCompleted
>>>> 3864 __name__=temporal_workflow_task_replay_latency_bucket
>>>> 3864 __name__=temporal_workflow_endtoend_latency_bucket
>>>> 3864 __name__=temporal_workflow_task_schedule_to_start_latency_bucket
>>>> 3864 __name__=temporal_workflow_task_execution_latency_bucket
>>>> 3080 activity_type=UpdatePodConnectionsActivity
>>>> 3004 le=0.5
>>>> 3004 le=0.01
>>>> 3004 le=0.1
>>>> 3004 le=1
>>>> 3004 le=0.001
>>>> 3004 le=0.002
>>>>
>>>> Label names with highest cumulative label value length:
>>>> 8312 scrape_job
>>>> 4279 workflow_type
>>>> 3994 rule_group
>>>> 2614 __name__
>>>> 2478 instance
>>>> 1564 job
>>>> 434 datname
>>>> 248 activity_type
>>>> 139 mode
>>>> 128 operation
>>>> 109 version
>>>> 97 pod
>>>> 88 state
>>>> 68 service
>>>> 45 le
>>>> 44 namespace
>>>> 43 slice
>>>> 31 container
>>>> 28 quantile
>>>> 18 alertname
>>>>
>>>> Highest cardinality labels:
>>>> 138 instance
>>>> 138 scrape_job
>>>> 84 __name__
>>>> 75 workflow_type
>>>> 71 datname
>>>> 70 job
>>>> 19 rule_group
>>>> 14 le
>>>> 10 activity_type
>>>> 9 mode
>>>> 9 quantile
>>>> 6 state
>>>> 6 operation
>>>> 5 datid
>>>> 4 slice
>>>> 2 container
>>>> 2 pod
>>>> 2 alertname
>>>> 2 version
>>>> 2 service
>>>>
>>>> Highest cardinality metric names:
>>>> 12712 temporal_request_latency_bucket
>>>> 6944 temporal_activity_execution_latency_bucket
>>>> 6944 temporal_activity_schedule_to_start_latency_bucket
>>>> 3864 temporal_workflow_task_schedule_to_start_latency_bucket
>>>> 3864 temporal_workflow_task_replay_latency_bucket
>>>> 3864 temporal_workflow_task_execution_latency_bucket
>>>> 3864 temporal_workflow_endtoend_latency_bucket
>>>> 2448 pg_locks_count
>>>> 1632 pg_stat_activity_count
>>>> 908 temporal_request
>>>> 690 prometheus_target_sync_length_seconds
>>>> 496 temporal_activity_execution_latency_count
>>>> 350 go_gc_duration_seconds
>>>> 340 pg_stat_database_tup_inserted
>>>> 340 pg_stat_database_temp_bytes
>>>> 340 pg_stat_database_xact_commit
>>>> 340 pg_stat_database_xact_rollback
>>>> 340 pg_stat_database_tup_updated
>>>> 340 pg_stat_database_deadlocks
>>>> 340 pg_stat_database_tup_returned
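
(For looking at this kind of cardinality on a live instance rather than in a
compacted block, the head statistics are also exposed directly; the server URL
below is a placeholder, and the ad-hoc query can be expensive on a large
instance:)

  # Built-in head cardinality statistics (also shown in the web UI under
  # Status > TSDB Status):
  $ curl -s http://prometheus-a:9090/api/v1/status/tsdb

  # Ad-hoc query for the metric names with the most series in the head:
  $ promtool query instant http://prometheus-a:9090 \
      'topk(20, count by (__name__) ({__name__=~".+"}))'
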