On Mon, Mar 13, 2023 at 1:21 PM Israel Brewster <ijbrews...@alaska.edu> wrote:
>
> I’m running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit more
> memory constrained than I would like, such that every week or so the various
> processes running on the machine will align badly and the OOM killer will
> kick in, killing off postgresql, as per the following journalctl output:
>
> Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process
> of this unit has been killed by the OOM killer.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with
> result 'oom-kill'.
> Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d
> 17h 48min 24.509s CPU time.
>
> And the service is no longer running.
>
> When this happens, I go in and restart the postgresql service, and everything
> is happy again for the next week or two.
>
> Obviously this is not a good situation. Which leads to two questions:
>
> 1) is there some tweaking I can do in the postgresql config itself to prevent
> the situation from occurring in the first place?
> 2) My first thought was to simply have systemd restart postgresql whenever it
> is killed like this, which is easy enough. Then I looked at the default unit
> file, and found these lines:
>
> # prevent OOM killer from choosing the postmaster (individual backends will
> # reset the score to 0)
> OOMScoreAdjust=-900
> # restarting automatically will prevent "pg_ctlcluster ... stop" from working,
> # so we disable it here. Also, the postmaster will restart by itself on most
> # problems anyway, so it is questionable if one wants to enable external
> # automatic restarts.
> #Restart=on-failure
>
> Which seems to imply that the OOM killer should only be killing off
> individual backends, not the entire cluster to begin with - which should be
> fine. And also that adding the restart=on-failure option is probably not the
> greatest idea. Which makes me wonder what is really going on?
>

Related, we (a FOSS project) used to have a Linux server with a LAMP stack on
GoDaddy. The machine provided a website and wiki. It was very low-end. I think
it had 512 MB or 1 GB RAM and no swap file. And no way to enable a swap file
(part of an upsell). We paid about $2 a month for it.

MySQL was killed several times a week. It corrupted the database on a regular
basis. We had to run the database repair tools daily. We eventually switched
to Ionos for hosting. We got a VM with more memory and a swap file for about
$5 a month. No more OOM kills.

If possible, you might want to add more memory (or a swap file) to the
machine. It will help sidestep the OOM problem.

You can also add vm.overcommit_memory = 2 to stop Linux from oversubscribing
memory. The machine will act like a Solaris box rather than a Linux box (which
takes some getting used to). Also see
https://serverfault.com/questions/606185/how-does-vm-overcommit-memory-work.

Jeff
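
P.S. One caveat on vm.overcommit_memory = 2: as far as I understand the kernel's overcommit accounting, in that mode total committed memory is capped at roughly swap plus vm.overcommit_ratio percent of RAM, and the ratio defaults to 50. So on a box with no swap, only about half of the RAM can be committed unless the ratio is raised. A rough sketch of that arithmetic in Python follows. The function name and parameters are made up for illustration, and it ignores vm.overcommit_kbytes and the kernel's reserve settings, so compare it against CommitLimit in /proc/meminfo on the real machine.

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate the commit limit (in kB) under vm.overcommit_memory = 2.

    Hypothetical helper: mirrors the documented rule
    (RAM - huge pages) * overcommit_ratio / 100 + swap.
    """
    ram_part = (mem_total_kb - hugetlb_kb) * overcommit_ratio // 100
    return ram_part + swap_total_kb


if __name__ == "__main__":
    one_gb_kb = 1024 * 1024

    # 1 GB RAM, no swap, default ratio of 50
    print(commit_limit_kb(one_gb_kb, 0))          # 524288

    # 1 GB RAM plus a 1 GB swap file, default ratio of 50
    print(commit_limit_kb(one_gb_kb, one_gb_kb))  # 1572864
```

In other words, adding a swap file raises the limit directly, which is one more reason to add swap before switching to strict overcommit.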