Robert Haas wrote:
Well, why can't they just hang out as dirty buffers in the OS cache,
which is also designed to solve this problem?

If the OS were guaranteed to be as suitable for this purpose as the approach taken in the database, that might work. But much as the clock-sweep approach should outperform a simpler OS caching implementation on many common workloads, there are a few spots where making dirty writes the OS's problem can fall down:

1) It presumes that OS write coalescing will solve the problem for you by merging repeated writes to the same block, which, depending on the implementation, it may not.

2) On some filesystems, such as ext3, any write with an fsync behind it will flush the whole write cache out and defeat this optimization. Since the spread checkpoint design has some such writes going to the data disk in the middle of the currently processing checkpoint, in those situations that's likely to push the first write of a block to disk before it can be combined with a second. Had the block stayed in the database's buffer cache instead, it might survive as much as a full checkpoint cycle longer.

3) The "timeout" as it were for shared buffers is driven by the distance between checkpoints, typically as long as 5 minutes. The longest a filesystem will hold onto a write is probably less. On Linux it's typically 30 seconds before the OS considers a write important to get out to disk, longest case; if you've already filled a lot of RAM with writes it can be substantially less.

I guess the obvious question is whether Windows "doesn't need" more
shared memory than that, or whether it "can't effectively use" more
memory than that.

It's probably "can't effectively use". We know for a fact that applications where blocks regularly accumulate high usage counts and see repeated reads/writes to them, which includes pgbench, benefit in several easy-to-measure ways from a larger database buffer cache; there's just plain less churn of buffers going in and out of there. The alternate explanation, "Windows is just so much better at read/write caching that you should give it most of the RAM anyway", doesn't sound nearly as probable as the more commonly proposed theory, "Windows doesn't handle large blocks of shared memory well".
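To make the usage-count point concrete, here's a toy sketch of the clock-sweep idea. It mirrors what the buffer manager does (usage count capped at 5, decremented on each sweep pass, buffer reused once it hits zero), but the names and structure are my simplified inventions for illustration, not the server's actual C code:

    # Toy clock-sweep buffer pool: hot blocks survive eviction pressure.
    MAX_USAGE = 5  # PostgreSQL caps usage_count at 5 (BM_MAX_USAGE_COUNT)

    class BufferPool:
        def __init__(self, nbuffers):
            self.pages = [None] * nbuffers   # which disk block each slot holds
            self.usage = [0] * nbuffers      # per-slot usage count
            self.hand = 0                    # clock sweep position
            self.lookup = {}                 # block -> slot

        def access(self, block):
            """Touch a block, loading it via the clock sweep on a miss."""
            slot = self.lookup.get(block)
            if slot is not None:
                # Hit: bump the usage count so the sweep spares this buffer.
                self.usage[slot] = min(self.usage[slot] + 1, MAX_USAGE)
                return slot
            # Miss: sweep until a slot reaches usage 0, decrementing as we go.
            while True:
                if self.usage[self.hand] == 0:
                    victim, self.hand = self.hand, (self.hand + 1) % len(self.pages)
                    if self.pages[victim] is not None:
                        del self.lookup[self.pages[victim]]
                    self.pages[victim] = block
                    self.usage[victim] = 1
                    self.lookup[block] = victim
                    return victim
                self.usage[self.hand] -= 1
                self.hand = (self.hand + 1) % len(self.pages)

    pool = BufferPool(4)
    for b in [1, 1, 1, 2, 3, 4, 5, 1]:   # block 1 is "hot"
        pool.access(b)
    # Block 1 is still resident despite the pool churning through 2..5,
    # because its high usage count absorbed the sweep's decrements.
    assert 1 in pool.lookup

A plain recency-based OS cache gives no comparable protection to a block that's hit three times in a burst, which is one reason giving the database the memory pays off on these workloads.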

Note that there's no discussion of the why behind this in the commit you just did, just a description of what happens. The reasons why are left undefined, which I feel is appropriate given that we really don't know for sure. I'm still waiting for somebody to let loose the Visual Studio profiler and measure what's causing the degradation at larger sizes.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

