On 07/16/2012 02:39 PM, Robert Haas wrote:
Unfortunately, there are lots of important operations (like bulk
loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that
inevitably end up writing out their own dirty buffers.  And even when
the background writer does write something, it's not always clear that
this is a positive thing.  Here's Greg Smith commenting on the
more-is-worse phenonmenon:


You can add "crash recovery" to the list of things where the interaction with the OS write cache matters a lot too, something I just took a beating and learned from recently. Since the recovery process is essentially one giant unified backend, how effectively the background writer and/or checkpointer move writes from recovery to themselves is really important. It's a bit easier to characterize than a complicated mixed set of clients, which has given me a couple of ideas to chase down.

What I've been doing for much of the last month (instead of my original plan of reviewing patches) is moving toward the bottom of characterizing that under high pressure. It provides an even easier way to compare multiple write strategies at the OS level than regular pgbench-like benchmarks. Recovery playback with a different tuning becomes as simple as rolling back to a simple base backup and replaying all the WAL, possibly including some number of bulk operations that showed up. You can measure that speed instead of transaction-level throughput. I'm seeing the same ~100% difference in performance between various Linux tunings on recovery as I was getting on VACUUM tests, and it's a whole lot easier to setup and (ahem) replicate the results. I'm putting together a playback time benchmark based on this observation.

The fact that I have servers all over the place now with >64GB worth of RAM has turned the topic of how much dirty memory should be used for write caching into a hot item for me again in general too. If I live through 9.3 development, I expect to have a lot more ideas about how to deal with this whole area play out in the upcoming months. I could really use a cool day to sit outside thinking about it right now.

Jeff Janes and I came up with what I believe to be a plausible
explanation for the problem:


I kinda think we ought to be looking at fixing that for 9.2, and
perhaps even back-patching further, but nobody else seemed terribly
excited about it.

FYI, I never rejected any of that thinking, I just haven't chewed on what you two were proposing. If that's still something you think should be revisited for 9.2, I'll take a longer look at it. My feeling on this so far has really been that the write blocking issues are much larger than the exact logic used by the background writer during the code you were highlighting, which I always saw as more active/important during idle periods. This whole area needs to get a complete overhaul during 9.3 though, especially since there are plenty of people who want to fit checksum writes into that path too.

Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to