On Fri, 31 Aug 2007, Jan Wieck wrote:
> Again, the original theory for the bgwriter wasn't moving writes out of
> the critical path, but smoothing response times that tended to go
> completely down the toilet during checkpointing, causing all the users to
> wake up and overload the system entirely.
As far as I'm concerned, that function of the background writer has been
replaced by the load distributed checkpoint features now controlled by
checkpoint_completion_target, which is believed to be a better solution in
several respects. I've been trying to motivate people who are happily using
the current background writer to confirm or deny that during beta, while
there's still time to put back the all-scan portion that was removed.
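For reference, the replacement feature is tuned with a handful of postgresql.conf settings; the values in this sketch are illustrative, not recommendations:

```
# postgresql.conf (8.3) -- illustrative values only
checkpoint_timeout = 5min    # maximum time between checkpoints
checkpoint_segments = 16     # WAL volume allowed between checkpoints
# Spread the checkpoint's writes over 90% of the interval until the next
# one, instead of issuing them all in a burst when the checkpoint starts:
checkpoint_completion_target = 0.9
```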
The open issue I'm working on is whether the LRU cleaner running in
advance of the Strategy point is still a worthwhile addition on top of
that.
My own tests with pgbench, which I'm busy wrapping up today, haven't
provided many strong conclusions here; the raw data is now on-line at
http://www.westnet.com/~gsmith/content/bgwriter/ , and I'm working on
summarizing it usefully and bundling the toolchain I used to run all of
those tests. I'll take a look at whether TPC-W provides a helpfully
different view here, because as far as I'm aware that's a test neither
Heikki nor I have tried yet to investigate this area.
> It is well known that any kind of bgwriter configuration other than OFF
> does increase the total IO cost. But you will find that everyone who has
> SLAs that define maximum response times will happily increase the IO
> bandwidth to give an aggressively configured bgwriter room to work.
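For context, the knobs in question here are the 8.2-era background writer settings; an "aggressive" configuration looked something like this sketch (the values are illustrative, with the 8.2 defaults noted in comments):

```
# postgresql.conf (8.2) -- illustrative aggressive bgwriter settings
bgwriter_delay = 50ms        # wake up more often (default 200ms)
bgwriter_lru_percent = 5.0   # scan further ahead of the clock sweep (default 1.0)
bgwriter_lru_maxpages = 200  # write more pages per round (default 5)
bgwriter_all_percent = 5.0   # the all-scan over the whole pool (default 0.333)
bgwriter_all_maxpages = 600  # pages the all-scan may write (default 5)
```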
The old background writer couldn't be configured to be aggressive enough
to satisfy some SLAs because of interactions with the underlying operating
system write caches. It actually made things worse in some situations
because at the point when you hit a checkpoint, the OS/disk controller
caches were already filled to capacity with writes of active pages, many
of which were now being written again. Had you just left the background
writer off, those caches would have had less data in them and been better
able to absorb the storm of writes that comes with the checkpoint. This
is particularly true in the situation where you have a large caching disk
controller that could absorb a GB worth of shared_buffers almost
instantly were it mostly clean when the checkpoint storm begins; but if
the background writer has been busy pounding at it, then it's already
full of data at checkpoint time.
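A back-of-the-envelope sketch of that effect; every number here is an assumption for illustration, not a measurement:

```python
# All figures are assumed for illustration: a 1 GB battery-backed
# controller cache in front of disks sustaining ~50 MB/s of random writes.
cache_mb = 1024        # controller cache size (assumption)
drain_mb_s = 50        # sustained random-write rate of the disks (assumption)
checkpoint_mb = 512    # dirty data the checkpoint needs to push (assumption)

# If the background writer has already filled the cache with dirty pages,
# the checkpoint's writes queue behind roughly this much drain time:
stall_s = cache_mb / drain_mb_s

# If the cache were mostly clean instead, the checkpoint's writes would fit
# in the cache and be acknowledged almost instantly:
fits_in_cache = checkpoint_mb <= cache_mb

print(stall_s)        # 20.48 -- seconds of degraded response times
print(fits_in_cache)  # True
```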
We just talked about this for a bit at Bruce's back in July; the hardware
you did your development against and what people are deploying nowadays
are so different that the entire character of the problem has changed.
The ability of the processors and memory to create dirty pages has gone up
by at least one order of magnitude, and the sophistication of the disk
controller on a high-end PostgreSQL server is pretty high now; the speed
of the underlying disks hasn't kept pace, and that gap has been making
this particular problem worse every year.
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend