Tuning for bulk loading:

Make sure the Linux kernel parameters in /proc/sys/vm related to the page cache
are set sensibly.
Set swappiness to 0 or 1.
Make sure you understand and configure /proc/sys/vm/dirty_background_ratio
and /proc/sys/vm/dirty_ratio well.
With a lot of RAM, the default on some kernel versions is way, way off (40% of
RAM allowed to sit as dirty pages!  yuck).
http://www.westnet.com/~gsmith/content/linux-pdflush.htm
If postgres is doing a lot of the caching for you, you probably want
dirty_ratio at 10% or less, and you'll want the OS to start flushing to disk
sooner rather than later.  A dirty_background_ratio of 3% with 24GB of RAM is
720MB -- a pretty big buffer.  I would not personally want this buffer to be
larger than about 5 seconds of the max write speed of the disk I/O; at, say,
150MB/s of sustained sequential write, 5 seconds is 750MB, so 3% of 24GB is
right at that limit.
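
A minimal sketch of what that looks like in /etc/sysctl.conf, sized for the
24GB box above -- the exact percentages are assumptions to test against your
own disks, not settled numbers:

    # /etc/sysctl.conf -- illustrative values for a 24GB machine
    vm.swappiness = 1                # keep the kernel from swapping postgres out
    vm.dirty_background_ratio = 3    # ~720MB: start background writeback early
    vm.dirty_ratio = 10              # hard ceiling before writers block

    # apply without a reboot:
    #   sysctl -p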

You'll need to tune your background writer to be aggressive enough to actually
write data fast enough so that checkpoints don't suck, and tune your checkpoint
size and settings as well.  Turn on checkpoint logging in the database and run
tests while watching that output.  Ideally, most of your batch writes have made
it to the OS before the checkpoint, and the OS has actually started moving most
of it to disk.  If your settings are wrong, you'll have the data buffered
twice: most or nearly all of it will still be in memory when the checkpoint
happens, and the checkpoint will take a LONG time.  The default Linux settings
+ default postgres settings + large shared_buffers will almost guarantee this
situation for bulk loads.  The two caches have to be configured with
complementary settings.  If you have a large postgres buffer, the OS buffer
should be small and write more aggressively.  If you have a small postgres
buffer, the OS can be lazier and cache much more.
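
Something along these lines in postgresql.conf is the kind of starting point I
mean.  These are 8.3-era parameter names, and every value here is an assumption
to benchmark and adjust, not a recommendation:

    # postgresql.conf -- checkpoint / bgwriter knobs for bulk loading
    log_checkpoints = on                 # log every checkpoint; watch during tests
    checkpoint_segments = 64             # more WAL between checkpoints
    checkpoint_timeout = 15min
    checkpoint_completion_target = 0.9   # spread checkpoint I/O over the interval
    bgwriter_delay = 100ms               # wake the background writer more often
    bgwriter_lru_maxpages = 500          # let it write more dirty buffers per round
    bgwriter_lru_multiplier = 4.0

With log_checkpoints on, a checkpoint whose logged write/sync times are huge is
exactly the double-buffering symptom described above.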


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ryan Hansen
Sent: Wednesday, November 26, 2008 2:10 PM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Memory Allocation

Hey all,

This may be more of a Linux question than a PG question, but I'm wondering if 
any of you have successfully allocated more than 8 GB of memory to PG before.

I have a fairly robust server running Ubuntu Hardy Heron with 24 GB of memory,
and I've tried to commit half the memory to PG's shared buffer, but it seems to
fail.  I'm setting the kernel shared memory accordingly using sysctl, which
seems to work fine, but when I set the shared buffer in PG and restart the
service, it fails if it's above about 8 GB.  I actually have it currently set
at 6 GB.
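
For reference, the settings in question are the usual kernel.shmmax and
kernel.shmall pair, roughly along these lines for a 12 GB allocation (values
approximate; shmall being counted in pages rather than bytes is an easy detail
to get wrong):

    # /etc/sysctl.conf -- shared memory sized for ~12GB of shared_buffers
    # kernel.shmmax is in bytes; kernel.shmall is in pages (usually 4096
    # bytes, per getconf PAGE_SIZE)
    kernel.shmmax = 13958643712    # ~13GB, a bit above the target shared_buffers
    kernel.shmall = 3407872        # 13958643712 / 4096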

I don't have the exact failure message handy, but I can certainly get it if 
that helps.  Mostly I'm just looking to know if there's any general reason why 
it would fail, some inherent kernel or db limitation that I'm unaware of.

If it matters, this DB is going to be hosting and processing hundreds of GB
and eventually TB of data.  It's a heavy read-write system, not transactional
processing, just a lot of data file parsing (python/bash) and bulk loading.
Obviously the disks get hit pretty hard already, so I want to make the most of
the large amount of available memory wherever possible, and I'm trying to tune
in that direction.

Any info is appreciated.

Thanks!
