On 24/06/2020 14:23, wang dong wrote:
Hi Prometheus experts,
We have a production cluster with 5 masters and 20 workers, and we run
our service in it. Prometheus 2.8.0 is installed via a Helm chart.
After about a year of running, the Prometheus pod has recently been
getting OOM-killed repeatedly. From the Prometheus stats dashboard, we
see a peak RSS of about 20 GB when clients access our service.
We have been increasing the memory limit again and again. The
container's limit is now 32 GB of memory and 1 CPU.
I am not sure how much further we will need to increase the resources,
but 32 GB is already very large for a single pod/container.
So I wonder: is this a limit of Prometheus that we have hit? Or is
there a best practice we should follow to keep our service available
to our clients? Thanks in advance.
Memory usage comes from both the targets you scrape and the queries
you run.
To reduce the memory used for scraping, increase the scrape interval
(i.e. scrape less often) or reduce the number of targets/metrics being
ingested.
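As a minimal sketch of both ideas in prometheus.yml (the job name,
target address, and dropped metric name here are hypothetical - adjust
them to your own setup):

```yaml
scrape_configs:
  - job_name: my-service          # hypothetical job name
    scrape_interval: 60s          # longer interval = fewer samples = less memory
    static_configs:
      - targets: ['my-service:8080']   # hypothetical target
    metric_relabel_configs:
      # Drop a high-cardinality metric family before it is ingested.
      - source_labels: [__name__]
        regex: 'http_request_duration_seconds_bucket'
        action: drop
```

Dropping metrics via metric_relabel_configs happens after the scrape
but before storage, so it reduces ingestion and memory without
touching the exporters themselves.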
For query memory reduction, look at your recording rules and API
queries - a query that has to process a large number of time series or
a long time range will use more memory.
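One common mitigation is to precompute an expensive expression with a
recording rule, so dashboards query one cheap pre-aggregated series
instead of aggregating thousands of raw series on every refresh. A
minimal sketch, assuming a counter named http_requests_total with a
job label (both hypothetical here):

```yaml
groups:
  - name: my-service-aggregations   # hypothetical group name
    rules:
      # Evaluated once per rule interval; dashboards then read the
      # small precomputed series job:http_requests:rate5m.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```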