On Fri, 31 Aug 2007, Jan Wieck wrote:
> Again, the original theory for the bgwriter wasn't moving writes out of the critical path, but smoothing response times that tended to go completely down the toilet during checkpointing, causing all the users to wake up and overload the system entirely.
As far as I'm concerned, that function of the background writer has been replaced by the load distributed checkpoint features now controlled by checkpoint_completion_target, which is believed to be a better solution in several respects. I've been trying to motivate people happily using the current background writer to confirm or deny that during beta, while there's still time to put the all-scan portion that was removed back again.
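For readers unfamiliar with the idea behind checkpoint_completion_target, here is a rough Python sketch of the pacing arithmetic. It assumes the checkpoint simply spreads its writes evenly over that fraction of checkpoint_timeout; the function name and all the numbers are made up for illustration, and the real server also paces against WAL segment consumption, which this ignores.

```python
def paced_write_rate(dirty_pages, checkpoint_timeout_s, completion_target):
    """Pages per second needed to finish the checkpoint writes within
    completion_target * checkpoint_timeout_s seconds (illustrative only)."""
    if not 0.0 < completion_target <= 1.0:
        raise ValueError("completion_target must be in (0, 1]")
    window_s = checkpoint_timeout_s * completion_target
    return dirty_pages / window_s


# Hypothetical workload: 100,000 dirty pages, 5 minute checkpoint_timeout.
spread_half = paced_write_rate(100_000, 300, 0.5)  # writes spread over 150 s
spread_most = paced_write_rate(100_000, 300, 0.9)  # writes spread over 270 s
print(f"target 0.5: {spread_half:.0f} pages/s")
print(f"target 0.9: {spread_most:.0f} pages/s")
```

The larger the target, the lower the sustained write rate the checkpoint demands, which is the smoothing effect being discussed.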
The open issue I'm working on is whether the LRU cleaner running in advance of the Strategy point is still a worthwhile addition on top of that.
My own pgbench tests, which I'm busy wrapping up today, haven't provided many strong conclusions here. The raw data is now online at http://www.westnet.com/~gsmith/content/bgwriter/ and I'm working on summarizing it usefully and bundling the toolchain I used to run all those tests. I'll take a look at whether TPC-W provides a helpfully different view here because, as far as I'm aware, that's a test neither Heikki nor I have tried yet for investigating this area.
> It is well known that any kind of bgwriter configuration other than OFF does increase the total IO cost. But you will find that everyone who has SLAs that define maximum response times will happily increase the IO bandwidth to give an aggressively configured bgwriter room to work.
The old background writer couldn't be configured to be aggressive enough to satisfy some SLAs because of interactions with the underlying operating system write caches. It actually made things worse in some situations because at the point when you hit a checkpoint, the OS/disk controller caches were already filled to capacity with writes of active pages, many of which were now being written again. Had you just left the background writer off, those caches would have had less data in them and been better able to absorb the storm of writes that come with the checkpoint. This is particularly true in the situation where you have a large caching disk controller that might chew GB worth of shared_buffers almost instantly were it mostly clean when the checkpoint storm begins, but if the background writer has been busy pounding at it then it's already full of data at checkpoint time.
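A toy Python model of that cache argument, purely as a sketch: the function name, the cache size, and the write volumes are all hypothetical, and it treats the OS/controller cache as a single bucket that does not drain during the checkpoint.

```python
def checkpoint_overflow_mb(cache_capacity_mb, cache_used_mb, checkpoint_write_mb):
    """MB of checkpoint writes that do not fit in the free part of the
    write cache and therefore have to wait on the physical disks."""
    free_mb = max(cache_capacity_mb - cache_used_mb, 0)
    return max(checkpoint_write_mb - free_mb, 0)


# Hypothetical 2 GB caching controller.
CACHE_MB = 2048

# Background writer off: cache is mostly clean, checkpoint dumps 1500 MB.
off = checkpoint_overflow_mb(CACHE_MB, cache_used_mb=0, checkpoint_write_mb=1500)

# Aggressive background writer: cache already full of earlier writes, and
# because hot pages got dirtied again the checkpoint still has 1200 MB to write.
busy = checkpoint_overflow_mb(CACHE_MB, cache_used_mb=2048, checkpoint_write_mb=1200)

print(f"bgwriter off:  {off} MB must wait on disk")
print(f"bgwriter busy: {busy} MB must wait on disk")
```

Even though the checkpoint has less to write in the second case, none of it can be absorbed, so all of it runs at raw disk speed.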
We just talked about this for a bit at Bruce's back in July; the hardware you did your development against and what people are deploying nowadays are so different that the entire character of the problem has changed. The ability of the processors and memory to create dirty pages has gone up by at least one order of magnitude, and the sophistication of the disk controller on a high-end PostgreSQL server is pretty high now; the speed of the underlying disks hasn't kept pace, and that gap has been making this particular problem worse every year.
-- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD