I certainly wouldn't recommend using 1/2 of RAM right away. There's a
good chance it would be a waste of memory - for example due to double
buffering, which effectively reduces the "total" cache hit ratio.
Double buffering is often mentioned in the context of tuning shared_buffers.
Is there a tool to actually measure the amount of double buffering
happening in the system?
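As an aside: one way to see the shared-buffers half of that picture is the
pg_buffercache extension; comparing it against the OS page cache would need
an OS-side tool such as pgfincore or fincore(1), which isn't shown here. A
rough sketch, assuming pg_buffercache is installed and the default 8 kB
block size:

  -- Top relations by space occupied in shared buffers, current database only
  SELECT c.relname,
         count(*)                        AS buffers,
         pg_size_pretty(count(*) * 8192) AS buffered
  FROM pg_buffercache b
  JOIN pg_class c
    ON b.relfilenode = pg_relation_filenode(c.oid)
   AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                             WHERE datname = current_database()))
  GROUP BY c.relname
  ORDER BY count(*) DESC
  LIMIT 10;

Relations that show up near the top here and are also largely resident in
the OS page cache are the ones being double-buffered.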
Those evictions are performed either by backends or by the bgwriter, both
of which are less efficient than the checkpointer. Not only can the
checkpointer perform various optimizations (e.g. sorting buffers to make
the writes more sequential), but it also writes each dirty buffer just
once. With smaller shared_buffers the same page may have to be written
multiple times.
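As a rough illustration, pg_buffercache can also show how much dirty data
is currently sitting in shared buffers waiting to be written out (again
assuming the extension is installed):

  -- Dirty vs. total buffers currently held in shared buffers
  SELECT count(*) FILTER (WHERE isdirty) AS dirty_buffers,
         count(*)                        AS total_buffers
  FROM pg_buffercache;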
When shared_buffers covers most of RAM, most writes should be performed by
the checkpointer and the cache hit ratio should be high. So, a hypothetical
question: could shared_buffers=200GB on a server with 250 GB of RAM ever be
a reasonable setting? (Assuming no applications other than postgres are
running, and 50 GB is enough for work_mem/maintenance_work_mem allocations
and for serving queries.)
The best thing you can do is set shared buffers to some conservative
value (say, 4-8GB), let the system run for a day or two, compute the
cache hit ratio using metrics in pg_stat_database, and then decide if
you need to resize shared buffers.
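A minimal sketch of that calculation from pg_stat_database, restricted here
to the current database:

  -- Cache hit ratio = blocks served from shared buffers / total blocks requested
  SELECT datname,
         blks_hit,
         blks_read,
         round(blks_hit::numeric / nullif(blks_hit + blks_read, 0), 4)
             AS cache_hit_ratio
  FROM pg_stat_database
  WHERE datname = current_database();

The counters accumulate since the last stats reset, so either call
pg_stat_reset() at the start of the measurement window or diff two
snapshots taken a day or two apart.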
Gradual increases are a good approach in general. And yes, having
buffers_checkpoint > buffers_clean > buffers_backend
is a good idea too. Together with the cache hit ratio it's probably a
more sensible metric than looking at usagecount directly.
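A minimal query for checking that ordering, using the cumulative counters
in pg_stat_bgwriter (column names as of the releases current at the time;
newer versions move some of these counters into pg_stat_io and
pg_stat_checkpointer):

  -- Buffer writes by source; ideally buffers_checkpoint dominates
  SELECT buffers_checkpoint,  -- written by the checkpointer
         buffers_clean,       -- written by the background writer
         buffers_backend      -- written by backends themselves
  FROM pg_stat_bgwriter;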
Thanks! While increasing shared_buffers we'll also keep an eye on how the
cache hit ratio changes.
Regards,
Vitaliy