I am trying to configure Postgres (version 9.5) to take advantage of a very
large-memory environment. The server has 256GB RAM, 12 cores and 2 SSDs in
RAID0, and runs Ubuntu. Swap is set at 4GB.

The machine is used for data warehouse operations. Typically only one
statement runs at a time, but these can be very complex (for example, rolling
window functions partitioned by several fields, computing over 30 metrics
across 40 million rows of data). The number of joins per statement rarely
exceeds 5 (for big tables the number of joins is smaller). Window functions
on large tables are common.
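
To give a concrete picture, a heavily simplified sketch of the kind of
statement involved is below (all table and column names are made up for
illustration; the real queries have 30+ such windowed metrics):

SELECT
    customer_id,
    sale_date,
    avg(metric_1) OVER (
        PARTITION BY customer_id, region, product
        ORDER BY sale_date
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS metric_1_rolling_30
FROM sales_fact;  -- roughly 40 million rows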

No other resource intensive processes are running at the same time.

All tables get recreated from scratch every day. If the DB goes kaput, it's a
minor inconvenience to restore from backups synced to the cloud.

I started off with these settings, which I compiled from a variety of
sources including pgtune (not sure how good they are; I'm not a system
administrator):

shared_buffers = 65024MB
work_mem = 1680MB
maintenance_work_mem = 10GB
fsync = off
wal_buffers = 16MB
max_wal_size = 8GB
min_wal_size = 4GB
checkpoint_completion_target = 0.9
random_page_cost = 2.0
effective_cache_size = 192GB
default_statistics_target = 500
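
(As a side note, the values actually in effect, as opposed to whatever is in
postgresql.conf, can be confirmed from psql with something like:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem',
               'effective_cache_size', 'max_wal_size', 'random_page_cost');
)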

The first time I spotted something wrong was with the 40-million-row table
mentioned above. Watching the resources on Ubuntu, memory usage went up
dramatically as soon as the statement started. Within a minute it hit 100%
(yes, the whole 256GB!) and postgres crashed with the message *FATAL:
the database system is in recovery mode*.

I've tried various other settings, most notably:

   - I've reduced shared_buffers to 10GB but kept work_mem at 1600MB.
   - I've added the following lines to /etc/sysctl.conf (pinched from
   Google searches):

vm.swappiness = 0
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
vm.dirty_ratio = 2
vm.dirty_background_ratio = 1

The query crashed again, this time with the message *"out of memory DETAIL:
Failed on request of size 112"*.

With these settings, this is the screenshot as memory usage approaches 100%:
 https://www.evernote.com/l/AJIE90HcZwVG_o2KqjYIOn72eQHQx2pc0QI

I then tried different settings for work_mem, without changing anything else.

work_mem = 400MB -> the query runs fine, but memory usage in the system
doesn't exceed 1.3%.

work_mem = 500MB -> usage hits 100% and postgres crashes out of memory.

So it looks like work_mem is to blame. *However, can someone explain why, at
400MB, Postgres does not seem to take advantage of the shedload of available
memory in the system?!*
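
In case it helps with the diagnosis, I assume the next step is to look at the
plan with something like the statement below (reusing the made-up table from
the sketch above); as I understand it, EXPLAIN (ANALYZE, BUFFERS) shows
whether the sorts behind the window functions stay in memory or spill to disk:

EXPLAIN (ANALYZE, BUFFERS)
SELECT
    customer_id,
    avg(metric_1) OVER (
        PARTITION BY customer_id, region, product
        ORDER BY sale_date
        ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
    ) AS metric_1_rolling_30
FROM sales_fact;
-- look for "Sort Method: quicksort  Memory: ..." vs
-- "Sort Method: external merge  Disk: ..." in the output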

Looking for suggestions here. I'm not a DB system administrator; I'm just an
analyst who wants to get their analysis done fast and efficiently, hence the
hardware spec! What combination of settings can I try to make sure postgres
makes full use of the available memory (without blindly trying various
combinations)? How can I investigate what's limiting postgres from doing so?

I've done some reading, but it's hard to tell which advice applies to 2016
hardware.

Is there something else I need to configure on the Ubuntu side?

Getting really desperate here so any help is greatly appreciated!

Thanks

Carmen
