Hi Frits, Joe,

Thank you both for your insights.

*Current situation:*

$ cat /proc/sys/vm/overcommit_memory
0
$ cat /proc/sys/vm/overcommit_ratio
50
$ cat /proc/sys/vm/swappiness
60

*Workload:* Multi-tenant PostgreSQL

$ uname -r
4.18.0-477.83.1.el8_8.x86_64
$ free -h
              total        used        free      shared  buff/cache   available
Mem:          249Gi       4.3Gi       1.7Gi        22Gi       243Gi       221Gi
Swap:            0B          0B          0B

If we set overcommit_memory = 2, what should we set the overcommit_ratio value to? Can you please suggest? Is there a rule of thumb to go with?

*Our goal:* avoid OOM issues, avoid memory wastage, and also avoid starving the kernel.
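To sanity-check our own arithmetic (please correct us if we have this wrong): with vm.overcommit_memory = 2, the kernel refuses new allocations once total committed address space would exceed CommitLimit, which is roughly swap + RAM * overcommit_ratio / 100 (hugepage reservations are excluded from the RAM term). On this host, with no swap, that would work out to approximately:

    # CommitLimit = swap + (MemTotal - hugepages) * overcommit_ratio / 100
    # With ~249 GiB RAM, no hugepages, and 0 swap:
    #   overcommit_ratio = 50  ->  CommitLimit ~ 124 GiB (half of RAM off-limits)
    #   overcommit_ratio = 80  ->  CommitLimit ~ 199 GiB
    #   overcommit_ratio = 90  ->  CommitLimit ~ 224 GiB
    # The kernel's current limit and current commitments can be checked with:
    grep -E 'CommitLimit|Committed_AS' /proc/meminfo

If that is right, then with the default ratio of 50 and no swap, allocations would start failing while roughly half of RAM is still untouched, which we assume is the argument for raising the ratio toward 80 or 90.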
Thanks!

On Wed, Aug 6, 2025 at 3:47 AM Frits Hoogland <frits.hoogl...@gmail.com> wrote:

> Joe,
>
> Can you name any technical reason why not having swap for a database is an actual bad idea?
>
> Memory is always limited. Swap was invented to overcome the situation where the (incidental) amount of paged-in memory could (regularly) get higher than physical memory would allow, by providing the (clear) workaround of swap to 'cushion' the memory shortage, allowing a "second level" of memory storage on disk.
> Still, this does not make memory unlimited. Swap extends the physical memory available by the amount of swap. You can still run out of memory with swap added, simply by paging in more than physical memory plus swap.
>
> Today, most systems are not memory constrained anymore, or rather: it is possible to get a server with enough physical memory to hold your common total memory need.
> And given the latency-sensitive nature of databases in general, which includes postgres, for any serious deployment you should get a server with enough memory to host your workload, and configure postgres not to overload that memory.
>
> If you do oversubscribe on (physical) memory, you will get pain somewhere; there is no way around that.
> The article in defence of swap is in essence saying that if you happen to oversubscribe on memory, sharing the pain between anonymous and file pages is better.
> I would say you are already in a bad place if that happens, which is especially bad for databases, and databases should allow you to make memory usage predictable.
>
> However, what I have found is that with 4+ kernels (4.18 to be precise; RHEL 8), the kernel can try to favour file pages in certain situations, making anonymous memory get paged out even if swappiness is set to 1 or 0, provided there is a wealth of inactive file memory. It seems to have to do with workingset protection(?) mechanisms, but given the lack of clear statistics I can't be sure about that. What it leads to, in my case, is a constant rate of swapping in and out in certain situations, whilst there is no technical reason for linux to do so, because there is enough available memory.
>
> My point of view has been that vm.overcommit_memory set to 2 is the way to go, because that makes linux enforce a set limit at allocation time, which is a way to guarantee the database never runs out of memory.
> It does guarantee that linux never runs out of memory, absolutely.
> However, this limit is hard, and is applied to a process in both user mode and system mode (kernel level), and thus can refuse to provide memory at times when it's not safe to do so, and thus corrupt execution. I have to be honest, I have not seen this myself, but trustworthy sources have reported this repeatedly, which I am inclined to believe.
> This means postgres execution can corrupt/terminate in unlucky situations, which impacts availability.
>
> *Frits Hoogland*
>
> On 5 Aug 2025, at 20:52, Joe Conway <m...@joeconway.com> wrote:
>
> On 8/5/25 13:01, Priya V wrote:
>
> *Environment:*
> *PostgreSQL Versions:* Mix of 13.13 and 15.12 (upgrades to 15.12 in progress; currently both are actively in use)
>
> PostgreSQL 13 reaches end of life after November 13, 2025
>
> *OS / Kernel:* RHEL 7 & RHEL 8 variants, kernels in the 4.14–4.18 range
>
> RHEL 7 has been EOL for quite a while now. Note that you have to watch out for collation issues/corrupted indexes after OS upgrades, due to collations changing with newer glibc versions.
>
> *Swap:* Currently none
>
> bad idea
>
> *Workload:* Highly mixed — OLTP-style internal apps with unpredictable query patterns and connection counts
> *Goal:* Uniform, safe memory settings across the fleet to avoid kernel or database instability
>
> We're considering:
> *vm.overcommit_memory = 2* for strict accounting
>
> yes
>
> Increasing vm.overcommit_ratio from 50 → 80 or 90 to better reflect actual PostgreSQL usage (e.g., work_mem reservations that aren't fully used)
>
> work_mem does not reserve memory -- it is a maximum that might be used in memory for a particular operation
>
> *Our questions for those running large PostgreSQL fleets:*
>
> 1. What overcommit_ratio do you find safe for PostgreSQL without causing kernel memory crunches?
>
> Read this:
> https://www.cybertec-postgresql.com/en/what-you-should-know-about-linux-memory-overcommit-in-postgresql/
>
> 2. Do you prefer overcommit_memory = 1 or = 2 for production stability?
>
> Use overcommit_memory = 2 for production stability
>
> 3. How much swap (if any) do you keep on large-memory servers where PostgreSQL is the primary workload? Is having swap configured a good idea or not?
>
> You don't necessarily need a large amount of swap, but you definitely should not disable it.
>
> Some background on that:
> https://chrisdown.name/2018/01/02/in-defence-of-swap.html
>
> 4. Any real-world cases where kernel accounting was too strict or too loose for PostgreSQL?
>
> In my experience the biggest issues are when postgres is running in a memory-constrained cgroup. If you want to constrain memory with cgroups, use cgroup v2 (not v1) and use memory.high to constrain it, not memory.max.
>
> 5. What settings should we go with if we are not planning on using swap?
>
> IMHO do not disable swap on Linux, at least not on production, ever.
>
> We'd like to avoid both extremes:
> Too low a ratio → PostgreSQL backends failing allocations even with free RAM
>
> Have you actually seen this or are you theorizing?
>
> Too high a ratio → OOM killer terminating PostgreSQL under load spikes
>
> If overcommit_memory = 2, overcommit_ratio is reasonable (less than 100, maybe 80 or so as you suggested), and swap is not disabled, and you are not running in a memory-constrained cgroup, I would be very surprised if you ever get hit by the OOM killer. And if you do, things are so bad the database was probably dying anyway.
>
> HTH,
>
> --
> Joe Conway
> PostgreSQL Contributors Team
> Amazon Web Services: https://aws.amazon.com
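P.S. Putting the advice above together, this is the direction we are leaning, written out concretely. All values are illustrative sketches based on this thread, not tested recommendations; in particular the swap size and the MemoryHigh figure are placeholders for our environment.

    # /etc/sysctl.d/99-pg-memory.conf -- strict accounting, per Joe's advice
    vm.overcommit_memory = 2
    vm.overcommit_ratio = 80

    # Keep a modest swap area rather than none (size is a placeholder):
    #   fallocate -l 8G /swapfile && chmod 600 /swapfile
    #   mkswap /swapfile && swapon /swapfile

    # If we ever cap memory via cgroups: cgroup v2 with memory.high, e.g. a
    # systemd drop-in for the postgres unit (MemoryHigh= maps to memory.high):
    #   [Service]
    #   MemoryHigh=200G

Does that look like a sane starting point?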