On 07/16/2012 02:39 PM, Robert Haas wrote:
Unfortunately, there are lots of important operations (like bulk
loading, SELECT * FROM bigtable, and VACUUM notverybigtable) that
inevitably end up writing out their own dirty buffers. And even when
the background writer does write something, it's not always clear that
this is a positive thing. Here's Greg Smith commenting on the
more-is-worse phenonmenon:
http://archives.postgresql.org/pgsql-hackers/2012-02/msg00564.php
You can add "crash recovery" to the list of things where the interaction
with the OS write cache matters a lot too, something I just took a
beating and learned from recently. Since the recovery process is
essentially one giant unified backend, how effectively the background
writer and/or checkpointer move writes from recovery to themselves is
really important. It's a bit easier to characterize than a complicated
mixed set of clients, which has given me a couple of ideas to chase down.
What I've been doing for much of the last month (instead of my original
plan of reviewing patches) is moving toward the bottom of characterizing
that under high pressure. It provides an even easier way to compare
multiple write strategies at the OS level than regular pgbench-like
benchmarks. Recovery playback with a different tuning becomes as simple
as rolling back to a simple base backup and replaying all the WAL,
possibly including some number of bulk operations that showed up. You
can measure that speed instead of transaction-level throughput. I'm
seeing the same ~100% difference in performance between various Linux
tunings on recovery as I was getting on VACUUM tests, and it's a whole
lot easier to setup and (ahem) replicate the results. I'm putting
together a playback time benchmark based on this observation.
The fact that I have servers all over the place now with >64GB worth of
RAM has turned the topic of how much dirty memory should be used for
write caching into a hot item for me again in general too. If I live
through 9.3 development, I expect to have a lot more ideas about how to
deal with this whole area play out in the upcoming months. I could
really use a cool day to sit outside thinking about it right now.
Jeff Janes and I came up with what I believe to be a plausible
explanation for the problem:
http://archives.postgresql.org/pgsql-hackers/2012-03/msg00356.php
I kinda think we ought to be looking at fixing that for 9.2, and
perhaps even back-patching further, but nobody else seemed terribly
excited about it.
FYI, I never rejected any of that thinking, I just haven't chewed on
what you two were proposing. If that's still something you think should
be revisited for 9.2, I'll take a longer look at it. My feeling on this
so far has really been that the write blocking issues are much larger
than the exact logic used by the background writer during the code you
were highlighting, which I always saw as more active/important during
idle periods. This whole area needs to get a complete overhaul during
9.3 though, especially since there are plenty of people who want to fit
checksum writes into that path too.
--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers