* Get a bigger server
* Reduce the number of metrics you collect
* Shard your server

Probably some combination of all of these.
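For the sharding option, one common pattern (a sketch, not something described in this thread; the job name is made up) is hashmod relabeling, so each of N identically configured servers keeps a disjoint subset of the targets:

```yaml
scrape_configs:
  - job_name: sharded-targets        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Hash each target's address into one of 2 buckets...
      - source_labels: [__address__]
        modulus: 2
        target_label: __tmp_shard
        action: hashmod
      # ...and keep only bucket 0 on this server; the second
      # shard runs the same config with "regex: 1".
      - source_labels: [__tmp_shard]
        regex: "0"
        action: keep
```

Each shard then has a proportionally smaller WAL to replay on startup.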
On Wed, Jul 8, 2020 at 8:21 PM Viktor Radnai <[email protected]> wrote:

> Hi Ben, Julien and all,
>
> To follow up on my issue from last week, the OOM loop does occur even with
> Prometheus 2.19.2.
>
> This time around the instance has just enough memory to complete WAL
> replay, but it OOMs immediately after that; this could be an improvement or
> just a coincidence. The WAL folder is about 16GB and the OOM occurs at
> around 43GB (due to the Kubernetes worker running out of memory). Anything
> else I could try?
>
> Thanks,
> Vik
>
> On Wed, 1 Jul 2020 at 19:10, Viktor Radnai <[email protected]> wrote:
>
>> Hi Julien,
>>
>> Thanks for clarifying that. In that case I'll see if the issue recurs
>> with 2.19.2 in the next few weeks.
>>
>> Vik
>>
>> On Wed, 1 Jul 2020 at 19:08, Julien Pivotto <[email protected]> wrote:
>>
>>> Once 2.19 is running, it will create the mmapped head chunks, which
>>> will improve that.
>>>
>>> I agree that starting 2.19 with a 2.18 WAL won't make a difference.
>>>
>>> On Wed, 1 Jul 2020 at 19:55, Viktor Radnai <[email protected]> wrote:
>>>
>>>> Hi again Ben,
>>>>
>>>> Unfortunately upgrading to 2.19.2 does not solve the startup issue.
>>>> Prometheus gets OOMKilled before even starting to parse the last 25
>>>> segments, which represent the last 50 minutes' worth of data. Based on
>>>> this, the estimated memory requirement should be somewhere between
>>>> 60-70GB, but the worker node only has 52GB. The other Prometheus pod
>>>> currently consumes 7.7GB.
>>>>
>>>> The left of the graph is 2.18.1, the right is 2.19.2. I inadvertently
>>>> reinstated a previously set 40GB memory limit and updated the replicaset
>>>> to increase it back to 50GB -- this is the reason for the second
>>>> Prometheus restart and the slightly higher plateau for the last two OOMs.
>>>>
>>>> Unless there is a way to move some WAL segments out and then restore
>>>> them later, I'll try to delete the last 50 minutes' worth of segments to
>>>> get the pod to come up.
>>>>
>>>> Thanks,
>>>> Vik
>>>>
>>>> On Wed, 1 Jul 2020 at 16:39, Viktor Radnai <[email protected]> wrote:
>>>>
>>>>> Hi Ben,
>>>>>
>>>>> We are running 2.18.1 -- I will upgrade to 2.19.2 and see if this
>>>>> solves the problem. I currently have one of the two replicas in
>>>>> production crashlooping, so I'll try to roll this out in the next few
>>>>> hours and report back.
>>>>>
>>>>> Thanks,
>>>>> Vik
>>>>>
>>>>> On Wed, 1 Jul 2020 at 16:32, Ben Kochie <[email protected]> wrote:
>>>>>
>>>>>> What version of Prometheus do you have deployed? We've made several
>>>>>> major improvements to WAL handling and startup in the last couple of
>>>>>> releases.
>>>>>>
>>>>>> I would recommend upgrading to 2.19.2 if you haven't.
>>>>>>
>>>>>> On Wed, Jul 1, 2020 at 5:06 PM Viktor Radnai <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> We have a recurring problem with Prometheus repeatedly getting
>>>>>>> OOMKilled on startup while trying to process the write-ahead log. I
>>>>>>> looked through GitHub issues, but as far as I could see there was no
>>>>>>> solution or currently open issue.
>>>>>>>
>>>>>>> We are running on Kubernetes in GKE using the prometheus-operator
>>>>>>> Helm chart, on Google Cloud's preemptible VMs. These VMs get killed at
>>>>>>> most every 24 hours, so our Prometheus pods also get killed and
>>>>>>> automatically migrated by Kubernetes (the data is on a persistent
>>>>>>> volume, of course). To avoid loss of metrics, we run two identically
>>>>>>> configured replicas with their own storage, scraping all the same
>>>>>>> targets.
>>>>>>>
>>>>>>> We monitor numerous GCE VMs that do batch processing, running
>>>>>>> anywhere between a few minutes and several hours. This workload is
>>>>>>> bursty, fluctuating between tens and hundreds of VMs active at any
>>>>>>> time, so sometimes the Prometheus wal folder grows to between 10-15GB
>>>>>>> in size.
>>>>>>> Prometheus usually handles this workload with about half a CPU core
>>>>>>> and 8GB of RAM, and if left to its own devices, the wal folder will
>>>>>>> shrink again when the load decreases.
>>>>>>>
>>>>>>> The problem is that when there is a backlog and Prometheus is
>>>>>>> restarted (due to the preemptible VM going away), it will use several
>>>>>>> times more RAM to recover the wal folder. This often exhausts all the
>>>>>>> available memory on the Kubernetes worker, so Prometheus is killed by
>>>>>>> the OOM killer over and over again, until I log in and delete the wal
>>>>>>> folder, losing several hours of metrics. I have already doubled the
>>>>>>> size of the VMs just to accommodate Prometheus and I am reluctant to
>>>>>>> do this again. Running non-preemptible VMs would triple the cost of
>>>>>>> these instances, and Prometheus might still get restarted when we roll
>>>>>>> out an update -- so this would probably not even solve the issue
>>>>>>> properly.
>>>>>>>
>>>>>>> I don't know if there is something special in our use case, but I
>>>>>>> did come across a blog describing the same high memory usage behaviour
>>>>>>> on startup.
>>>>>>>
>>>>>>> I feel that unless there is a fix I can apply, this would warrant
>>>>>>> either a bug or a feature request -- Prometheus should be able to
>>>>>>> recover without operator intervention or losing metrics. And for a
>>>>>>> process running on Kubernetes, we should be able to set memory
>>>>>>> "request" and "limit" values that are close to actual expected usage,
>>>>>>> rather than 3-4 times the steady-state usage just to accommodate the
>>>>>>> memory requirements of the startup phase.
>>>>>>>
>>>>>>> Please let me know what information I should provide, if any. I have
>>>>>>> some graph screenshots that would be relevant.
>>>>>>>
>>>>>>> Many thanks,
>>>>>>> Vik
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Prometheus Users" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> To view this discussion on the web visit
>>>>>>> https://groups.google.com/d/msgid/prometheus-users/CANx-tGgY3vJ-dzyOjYMAu1dRvdsfO83Ux_Y0g7XAeKzPTmGWLQ%40mail.gmail.com
>>>>>>> .
>>>>>
>>>>> --
>>>>> My other sig is hilarious

--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/CABbyFmr%2BzHBQnjT%3Duw707fJr5F9bA3vM%3DJPMXQ9Y3GGwrdh9Kw%40mail.gmail.com.
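On the "request"/"limit" point raised above: until startup memory is bounded, the pragmatic (if wasteful) pattern is a request near steady state and a limit sized for the replay spike. A hypothetical container-spec fragment, with numbers taken loosely from the figures quoted in this thread:

```yaml
resources:
  requests:
    memory: 10Gi   # a little above the ~8GB steady-state usage
  limits:
    memory: 48Gi   # headroom for the 3-4x WAL-replay spike; tune to the node
```

The gap between the two is exactly the overcommit the original poster objects to, but it lets the scheduler pack nodes on steady-state usage while still surviving replay.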
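On the "move some WAL segments out and restore them later" idea from the thread: the mechanics would roughly be to stop Prometheus, move the highest-numbered segment files aside, start Prometheus with the shorter WAL, and move the files back before a later restart. A sketch of the file shuffle, demonstrated against a throwaway directory with fake segment files (the real WAL path, e.g. /prometheus/wal, and the segment count are hypothetical; there is no guarantee Prometheus cleanly accepts parked segments that are restored after new ones have been written, so treat this as a last resort):

```shell
#!/bin/sh
set -eu

# Demo in a throwaway directory; on a real server WAL would point at the
# wal/ directory on the persistent volume, and Prometheus must be stopped.
BASE=$(mktemp -d)
WAL="$BASE/wal"
PARKED="$BASE/wal-parked"
mkdir -p "$WAL" "$PARKED"

# Fake four numbered WAL segment files for the demo.
for seg in 00000001 00000002 00000003 00000004; do
  : > "$WAL/$seg"
done

# Park the two newest (highest-numbered) segments so replay has less to do.
for seg in $(ls "$WAL" | sort | tail -n 2); do
  mv "$WAL/$seg" "$PARKED/$seg"
done

echo "remaining: $(ls "$WAL" | tr '\n' ' ')"
echo "parked: $(ls "$PARKED" | tr '\n' ' ')"
```

Restoring is the reverse `mv`; unlike deleting the wal folder outright, the parked data is at least recoverable.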

