I didn't realize that about overcommit_ratio. It was at 50; I've changed it to 95. I'll see if that clears up the problem going forward.
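In case it's useful to anyone hitting the same thing, the change is just the following (as root; the second line only persists it across reboots — our sysctl.conf had no overcommit_ratio entry yet, so appending is safe):

    # sysctl -w vm.overcommit_ratio=95
    # echo 'vm.overcommit_ratio = 95' >> /etc/sysctl.conf   # persist across reboots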
# cat /proc/meminfo
MemTotal:       30827220 kB
MemFree:          153524 kB
MemAvailable:   17941864 kB
Buffers:            6188 kB
Cached:         24560208 kB
SwapCached:            0 kB
Active:         20971256 kB
Inactive:        8538660 kB
Active(anon):   12460680 kB
Inactive(anon):    36612 kB
Active(file):    8510576 kB
Inactive(file):  8502048 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:             50088 kB
Writeback:           160 kB
AnonPages:       4943740 kB
Mapped:          7571496 kB
Shmem:           7553176 kB
Slab:             886428 kB
SReclaimable:     858936 kB
SUnreclaim:        27492 kB
KernelStack:        4208 kB
PageTables:       188352 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    15413608 kB
Committed_AS:   14690544 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       59012 kB
VmallocChunk:   34359642367 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:    31465472 kB
DirectMap2M:           0 kB

# sysctl -a:
vm.admin_reserve_kbytes = 8192
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.max_map_count = 65530
vm.min_free_kbytes = 22207
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 4096
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 0
vm.user_reserve_kbytes = 131072
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0

On Tue, Oct 21, 2014 at 3:46 PM, Tomas Vondra <t...@fuzzy.cz> wrote:
>
> On 22 October 2014, 0:25, Montana Low wrote:
> > I'm running postgres-9.3 on a 30GB ec2 xen instance w/ linux kernel
> > 3.16.3. I receive numerous "Error: out of memory" messages in the
> > log, which are aborting client requests, even though there appears
> > to be 23GB available in the OS cache.
> >
> > There is no swap on the box. Postgres is behind pgbouncer, to
> > protect it from the 200 real clients; pgbouncer limits connections
> > to 32, and there are rarely more than 20 active connections, even
> > though postgres max_connections is set very high for historic
> > reasons. There is also a 4GB java process running on the box.
> >
> > relevant postgresql.conf:
> >
> > max_connections = 1000       # (change requires restart)
> > shared_buffers = 7GB         # min 128kB
> > work_mem = 40MB              # min 64kB
> > maintenance_work_mem = 1GB   # min 1MB
> > effective_cache_size = 20GB
> >
> > sysctl.conf:
> >
> > vm.swappiness = 0
> > vm.overcommit_memory = 2
>
> This means you have 'no overcommit', so the amount of committable
> memory is limited to overcommit_ratio + swap. The default value of
> overcommit_ratio is 50% of RAM, and as you have no swap, that
> effectively means only 50% of the RAM is available to the system.
>
> If you want to verify this, check /proc/meminfo - see the lines
> CommitLimit (the current limit) and Committed_AS (committed address
> space). Once Committed_AS reaches the limit, it's game over.
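That checks out against the numbers above. Plugging them into the kernel's documented formula (CommitLimit = MemTotal * overcommit_ratio / 100 + SwapTotal, with hugepages at zero here):

    CommitLimit ≈ 30827220 kB * 50 / 100 + 0 kB = 15413610 kB
                  (kernel reports 15413608 kB; the 2 kB difference is
                  just rounding down to a page boundary)

Committed_AS is 14690544 kB, already ~95% of that limit, which explains why allocations fail even with ~23GB sitting in the page cache.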
>
> There are different ways to fix this, or at least improve the
> situation:
>
> (1) increasing the overcommit_ratio (clearly, 50% is way too low;
>     something like 90% might be more appropriate on 30GB of RAM
>     with no swap)
>
> (2) adding swap (say, a small ephemeral drive, with swappiness=10
>     or something like that)
>
> Tomas
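If the ratio bump doesn't hold up, option (2) would look roughly like this on our instance. The device name is a guess (ec2 ephemeral disks often show up as /dev/xvdb; check with lsblk first):

    # mkswap /dev/xvdb              # format the ephemeral disk as swap
    # swapon /dev/xvdb              # enable it
    # sysctl -w vm.swappiness=10    # swap only under real memory pressure

With swap present, CommitLimit grows by the swap size, so the box gains headroom even at the old overcommit_ratio.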