Re: [HACKERS] Just-in-time Background Writer Patch+Test Results

Greg Smith Thu, 06 Sep 2007 09:40:26 -0700

On Thu, 6 Sep 2007, Kevin Grittner wrote:

> If you exposed the scan_whole_pool_seconds as a tunable GUC, that would
> allay all of my concerns about this patch.  Basically, our problems were
> resolved by getting all dirty buffers out to the OS cache within two
> seconds

Unfortunately it wouldn't make my concerns about your system go away or
I'd have recommended exposing it specifically to address your situation.
I have been staring carefully at your configuration recently, and I would
wager that you could turn off the LRU writer altogether and still meet
your requirements in 8.2. Here's what you've got right now:

shared_buffers = 160MB (=20000 buffers)
bgwriter_lru_percent = 20.0
bgwriter_lru_maxpages = 200
bgwriter_all_percent = 10.0
bgwriter_all_maxpages = 600

With the default delay of 200ms, this has the LRU-writer scanning the
whole pool every 1 second, while the all-writer scans every two
seconds--assuming they don't hit the write limits. If some event were to
dirty the whole pool in 200ms, it might take as much as 6.7 seconds to
write everything out (20000 / 600 * 200 ms) via the all-scan. The
all-scan is already gone in 8.3. Your LRU scan will take much longer than
that to clear everything out: at least 20 seconds (20000 / 200 * 200 ms)
to clear a fully dirty cache.
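The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the helper names are made up for illustration, and it assumes the 200 ms default delay and the rounded figure of 20000 buffers quoted above.

```python
# Back-of-the-envelope timings for the 8.2 settings quoted above.
# Helper names are hypothetical; they are not PostgreSQL functions.

def full_scan_seconds(percent, delay_ms=200):
    """Seconds for a writer that scans `percent` of the pool per round
    to cover the whole pool, assuming it never hits its write limit."""
    rounds = 100.0 / percent
    return rounds * delay_ms / 1000.0

def full_write_seconds(buffers, maxpages, delay_ms=200):
    """Lower bound on seconds to write `buffers` dirty pages when at
    most `maxpages` pages can be written per round."""
    rounds = buffers / maxpages
    return rounds * delay_ms / 1000.0

BUFFERS = 20000  # shared_buffers = 160MB, rounded as in the text

lru_scan = full_scan_seconds(20.0)             # bgwriter_lru_percent
all_scan = full_scan_seconds(10.0)             # bgwriter_all_percent
all_write = full_write_seconds(BUFFERS, 600)   # bgwriter_all_maxpages
lru_write = full_write_seconds(BUFFERS, 200)   # bgwriter_lru_maxpages

print(f"LRU scan of whole pool: {lru_scan:.1f} s")
print(f"all-scan of whole pool: {all_scan:.1f} s")
print(f"all-scan write-out of a fully dirty pool: {all_write:.1f} s")
print(f"LRU write-out of a fully dirty pool: at least {lru_write:.1f} s")
```

The write-out figures are lower bounds set purely by the maxpages limits; they ignore the usage-count effect, which only makes the LRU case slower.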

But in fact, it's impossible to even bound how long it will take before
the LRU writer (which is the only part this new patch tries to improve)
gets around to writing even a single dirty buffer no matter what
bgwriter_lru_percent (8.2) or scan_whole_pool_seconds (JIT patch) is set
to.

There's a second low-level issue involved here. When a page becomes
dirty, that implies it was also recently used, which means the LRU writer
won't touch it. That page can't be written out by the LRU writer until an
entire pass has been made over the shared_buffer pool while looking for
buffers to allocate for new activity. When the allocation clock-sweep
passes over the newly dirtied buffer again, its usage count will drop by
one and it will no longer be considered recently used. At that point the
LRU writer can write it out. So unless there is other allocation activity
going on, the scan_whole_pool_seconds mechanism will never provide the
bound on time to scan and write everything you hope it will.
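A toy model may make the usage-count interaction easier to see. The sketch below is not PostgreSQL code: the Buffer class and both functions are hypothetical. It simplifies the allocation clock-sweep to one full pass that ages every buffer by one, and it assumes a freshly dirtied page has a usage count of 1.

```python
from dataclasses import dataclass

@dataclass
class Buffer:
    usage_count: int = 0
    dirty: bool = False

def lru_writer_pass(pool):
    """Write every dirty buffer that is not recently used.
    Returns the number of buffers written."""
    written = 0
    for buf in pool:
        if buf.dirty and buf.usage_count == 0:
            buf.dirty = False
            written += 1
    return written

def allocation_sweep_pass(pool):
    """Simplified full pass of the allocation clock-sweep: every
    recently used buffer has its usage count lowered by one."""
    for buf in pool:
        if buf.usage_count > 0:
            buf.usage_count -= 1

pool = [Buffer() for _ in range(8)]
pool[3].dirty = True
pool[3].usage_count = 1  # dirtying a page also marks it recently used

# With no allocation activity, repeated LRU-writer passes write nothing.
idle_writes = sum(lru_writer_pass(pool) for _ in range(100))

# Once the allocation sweep has passed over the buffer, it can be written.
allocation_sweep_pass(pool)
writes_after_sweep = lru_writer_pass(pool)

print(idle_writes, writes_after_sweep)
```

In this model, how often the LRU writer runs makes no difference to the dirty buffer until allocation activity moves the sweep past it.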

And if there are other allocations going on, the much more powerful JIT
mechanism will scan the whole pool plenty fast if you bump the already
exposed multiplier tunable up. In my tests where the buffer cache was
filled with mostly dirty buffers that couldn't be re-used (something
relatively easy to trigger with pgbench tests), I've actually watched the
new code scan >90% of the buffer cache looking for those few reusable
buffers in the pool in a single invocation. This would be like setting
bgwriter_lru_percent=90.0 in the old configuration, but it only gets that
aggressive when the distribution of pages in the buffer cache demands it,
and when it has reason to believe going that fast will be helpful.
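The demand-driven behaviour described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's code: jit_scan and its parameters are hypothetical, and the target is assumed to be a recent allocation count times the multiplier, with the scan stopping once that many reusable buffers are found, the whole pool has been scanned, or a maxpages-style write cap is reached.

```python
def jit_scan(pool, start, recent_allocs, multiplier, max_pages):
    """Scan forward from `start` until enough reusable buffers are found.

    pool is a list of (usage_count, dirty) tuples.
    Returns (scanned, reusable, written).
    """
    target = int(recent_allocs * multiplier)
    scanned = reusable = written = 0
    n = len(pool)
    while scanned < n and reusable < target and written < max_pages:
        usage_count, dirty = pool[(start + scanned) % n]
        scanned += 1
        if usage_count > 0:
            continue      # recently used: neither reusable nor written
        if dirty:
            written += 1  # counted only; a real writer would also mark it clean
        reusable += 1
    return scanned, reusable, written

# A pool of mostly dirty, recently used buffers, with the few reusable
# ones at the far end: the scan has to cover most of the pool.
busy_pool = [(1, True)] * 950 + [(0, False)] * 50
busy_result = jit_scan(busy_pool, 0, recent_allocs=10, multiplier=2.0, max_pages=200)

# A pool full of clean, unused buffers: the scan stops almost at once.
idle_pool = [(0, False)] * 1000
idle_result = jit_scan(idle_pool, 0, recent_allocs=10, multiplier=2.0, max_pages=200)

print(busy_result, idle_result)
```

In the first case the toy scan covers 970 of 1000 buffers in one invocation, and in the second only 20. The fraction scanned follows from the state of the pool rather than from a fixed percentage setting.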

The completely understandable line of thinking that led to your request
here is one of my concerns with exposing scan_whole_pool_seconds as a
tunable. It may suggest to people that if they set the number very low,
it will assure all dirty buffers will be scanned and written within that
time bound. That's certainly not the case; both the maxpages and the
usage count information will actually drive the speed that mechanism plods
through the buffer cache. It really isn't useful for scanning fast.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
