Re: [HACKERS] Final background writer cleanup for 8.3

Greg Smith Fri, 31 Aug 2007 09:21:04 -0700

On Fri, 31 Aug 2007, Jan Wieck wrote:

> Again, the original theory for the bgwriter wasn't moving writes out of the critical path, but smoothing response times that tended to go completely down the toilet during checkpointing, causing all the users to wake up and overload the system entirely.

As far as I'm concerned, that function of the background writer has been replaced by the load distributed checkpoint features now controlled by checkpoint_completion_target, which is believed to be a better solution in several respects. I've been trying to motivate people happily using the current background writer to confirm or deny that during beta, while there's still time to put the all-scan portion that was removed back again.

The open issue I'm working on is whether the LRU cleaner running in advance of the Strategy point is still a worthwhile addition on top of that.

My own tests with pgbench that I'm busy wrapping up today haven't provided many strong conclusions here; the raw data is now on-line at http://www.westnet.com/~gsmith/content/bgwriter/ , and I am working on summarizing it usefully and bundling the toolchain I used to run all those. I'll take a look at whether TPC-W provides a helpfully different view here because as far as I'm aware that's a test neither Heikki nor I have tried yet to investigate this area.

> It is well known that any kind of bgwriter configuration other than OFF does increase the total IO cost. But you will find that everyone who has SLAs that define maximum response times will happily increase the IO bandwidth to give an aggressively configured bgwriter room to work.

The old background writer couldn't be configured to be aggressive enough to satisfy some SLAs because of interactions with the underlying operating system write caches. It actually made things worse in some situations because at the point when you hit a checkpoint, the OS/disk controller caches were already filled to capacity with writes of active pages, many of which were now being written again. Had you just left the background writer off, those caches would have had less data in them and been better able to absorb the storm of writes that comes with the checkpoint. This is particularly true in the situation where you have a large caching disk controller that might chew GB worth of shared_buffers almost instantly were it mostly clean when the checkpoint storm begins, but if the background writer has been busy pounding at it then it's already full of data at checkpoint time.
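
To put rough numbers on that argument, here is a toy model in Python. It is only a sketch and is not derived from any measurement: the function name, the cache size, the checkpoint volume, and the disk rate are all made-up illustrative values, and it assumes the cache absorbs writes instantly until it is full, after which everything proceeds at disk speed.

```python
def checkpoint_stall_seconds(cache_mb, prefill_mb, checkpoint_mb, disk_mb_per_s):
    """Time the checkpoint writes spend waiting on the disks.

    Hypothetical model: whatever fits in the free part of the write cache
    is absorbed immediately; the overflow drains at raw disk speed.
    """
    free_mb = max(cache_mb - prefill_mb, 0)
    overflow_mb = max(checkpoint_mb - free_mb, 0)
    return overflow_mb / disk_mb_per_s


if __name__ == "__main__":
    # Assumed numbers: 1 GB controller cache, 1 GB of dirty buffers at
    # checkpoint time, disks that sustain 50 MB/s of random-ish writes.
    for prefill_mb in (0, 512, 1024):
        stall = checkpoint_stall_seconds(1024, prefill_mb, 1024, 50)
        print(f"cache pre-filled with {prefill_mb} MB -> stall {stall:.2f} s")
```

In this model the checkpoint volume is held constant regardless of the pre-fill, which matches the case described above where the pages written early are active and get dirtied again before the checkpoint arrives. The only thing the earlier writes change is how much cache is left to absorb the burst.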

We just talked about this for a bit at Bruce's back in July; the hardware you did your development against and what people are deploying nowadays are so different that the entire character of the problem has changed. The ability of the processors and memory to create dirty pages has gone up by at least one order of magnitude, and the sophistication of the disk controller on a high-end PostgreSQL server is pretty high now; the speed of the underlying disks hasn't kept pace, and that gap has been making this particular problem worse every year.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
