On Mar 11, 2010, at 12:39 AM, Greg Smith wrote:

> 
> Giving all the buffers to the database doesn't work for many reasons:
> -Need a bunch leftover for clients to use (i.e. work_mem)
> -Won't be enough OS cache for non-buffer data the database expects 
> cached reads and writes will perform well onto (some of the non-database 
> files it uses)
> -Database checkpoints will turn into a nightmare, because there will be 
> so much more dirty data that could have been spooled regularly out to 
> the OS and then to disk by backends that doesn't ever happen.
> -Not having enough writes for buffering backend writes means less chanes 
> to do write combining and elevator seek sorting, which means average I/O 
> will drop.
> 
> The alternate idea is to make shared_buffers small.  I see people 
> happilly running away in the 128MB - 256MB range sometimes.  The benefit 
> over just using the default of <32MB is obvious, but you're already past 
> a good bit of the diminishing marginal returns just by the 8X increase. 
> 

The DB usage pattern influences this sort of decision too.  One that does large 
bulk inserts can prefer larger shared buffers, provided its bg_writer is tuned 
well (10GB - 16GB for a 64GB server).  
Temp table usage benefits from it as well -- I believe that one created as "ON 
COMMIT DROP" has a better chance of not being written to the data disk before 
being dropped with more work_mem.
If you have a mixed read workload that has occasional very large sequential 
scans, you will want to make sure shared_buffers is large enough to hold the 
most important index and randomly accessed data.

Linux is more sensitive to letting sequential scans kick out data from page 
cache than Postgres.


----------
Lastly, a word of caution on Linux.  Before the recent changes to memory 
accounting and paging (~ kernel 2.28 ish?).  Shared_buffers are only accounted 
for in part of the equations for paging.  On one hand, the system sees shared 
memory as available to be swapped out (even though it won't) and on the other 
hand it senses memory pressure from it.  So if you for example, set shared_mem 
to 75% of your RAM the system will completely freak out and kswapd and other 
processes will go through long periods of 100% CPU utilization.  
An example:
32GB RAM, 16GB shared_buffers, CentOS 5.4:
With the default os 'swappiness' of '60' the system will note that less than 
60% is used by pagecache and favor swapping out postgres backends aggressively. 
 If either by turning down the swappiness or opening enough processes to 
consume more RAM on the system (to ~ 80% or so) the kernel will start spending 
a LOT of CPU, often minutes at a time, trying to free up memory.   From my 
understanding, it will keep searching the postgres shared_buffers space for 
pages to swap out even though it can't do so.  So for example, there might be 
16GB shared mem (which it won't page out), 10GB other process memory, and 6GB 
actual cahced files in page cache.  It sees the ratio of 6GB files to 26GB 
processes and heavily favors attacking the 26GB -- but scans the whole set of 
process memory and finds all pages are recently used or can't be paged out.

Anyhow, the latest linux kernels claim to fix this, and Solaris/OpenSolaris or 
BSD's don't have this problem.   On OpenSolaris there are some benchmarks out 
there that showing that 90% of memory allocated to shared_buffers can work 
well.  On Linux, that is dangerous.  Combine the poor memory management when 
there is a lot of shared memory with the fact that 50% is bad for 
double-buffering, and the Linux suggestion becomes the typical 'at least 128MB, 
but never more than 25% of RAM'.


> Improves keep coming as shared_buffers cache size increases for many 
> workloads, but eventually you can expect to go to far if you try to push 
> everything in there.  Only question is whether that happens at 40%, 60%, 
> or something higher.
> 
> -- 
> Greg Smith  2ndQuadrant US  Baltimore, MD
> PostgreSQL Training, Services and Support
> g...@2ndquadrant.com   www.2ndQuadrant.us
> 
> 
> -- 
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to