Seconded. Clark's right. My current situation is much the same (pod memory keeps floating between 12 and 16 GB), and I'm currently working on identifying why I have so many targets (17k at the moment) and writing all the recording rules I can to relieve pressure from dashboards and alerting rules.
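For anyone following along, this is the kind of thing I mean: pre-aggregating an expensive query into a recording rule so dashboards hit the cheap pre-computed series instead of thousands of raw ones. This is only a sketch; the metric, label, and group names below are made up for illustration.

```yaml
# rules.yml (hypothetical example)
groups:
  - name: dashboard_precompute
    interval: 1m
    rules:
      # Dashboards query job:http_requests_total:rate5m (one series per job)
      # instead of rate()-ing every raw http_requests_total series on each load.
      - record: job:http_requests_total:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

The naming convention `level:metric:operations` is the one recommended in the Prometheus docs for recording rules.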
On Wednesday, June 24, 2020 at 10:41:36 UTC-3, Stuart Clark wrote:
>
> On 24/06/2020 14:23, wang dong wrote:
> > Hi Prometheus experts,
> >
> > We have a production cluster: 5 masters, 20 workers, and we run our
> > service in this cluster.
> > We installed Prometheus 2.8.0 with a Helm chart.
> > After one year of running, we have recently been getting OOMs on the
> > Prometheus pod. From the Prometheus stats dashboard, we see a peak
> > RSS of 20 GB when clients access our service.
> > We have kept increasing memory again and again. The container's
> > memory limit is now 32 GB and its CPU limit is 1.
> >
> > I am not sure how much further we will have to increase the
> > resources, but 32 GB is really big for a pod/container.
> >
> > So I wonder whether this is a limit of Prometheus that we have hit,
> > or whether there is a best practice we should follow to keep our
> > service available to our clients. Thanks in advance.
>
> Memory usage is driven by both the targets you scrape and the queries
> you perform.
>
> To reduce the memory used for scraping, increase the scrape interval
> (scrape less often) or reduce the number of targets/metrics being
> ingested.
>
> For query memory reduction, look at the recording rules & API queries:
> if a query has to process a lot of time series or a long duration,
> more memory will be used.
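Both of Stuart's scrape-side suggestions can be expressed in the scrape config. The sketch below assumes a job named "node" and that the dropped metric family is genuinely unused by any dashboard or alert; the job name, target address, and regex are placeholders, not a recommendation to drop that specific metric.

```yaml
# prometheus.yml fragment (hypothetical example)
scrape_configs:
  - job_name: node
    scrape_interval: 60s        # scrape less often than the 15s default
    static_configs:
      - targets: ['node-exporter:9100']
    metric_relabel_configs:
      # Drop an unused metric family before ingestion so it never
      # becomes head-block series in memory.
      - source_labels: [__name__]
        regex: 'go_gc_duration_seconds.*'
        action: drop
```

Note that `metric_relabel_configs` runs after the scrape but before ingestion, so dropped series cost scrape bandwidth but no TSDB memory.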

