Greg Smith wrote:
The original code came from before there was a pg_stat_bgwriter. The
first patch (buf-alloc-stats) takes the two most interesting pieces of
data the original patch collected, the number of buffers allocated
recently and the number that the clients wrote out, and ties all that
into the new stats structure. With this patch applied, you can get a
feel for things like churn/turnover in the buffer pool that were very
hard to quantify before. Also, it makes it easy to measure how well
your background writer is doing at writing buffers so the clients don't
have to. Applying this would complete one of my personal goals for the
8.3 release, which was having stats to track every type of buffer write.
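To make the churn measurement concrete, here is a minimal standalone sketch (counter names are illustrative, not the patch's actual identifiers) of how the two numbers read together: total allocations show how fast the pool is turning over, and the backend-write count shows how much cleaning work the background writer failed to absorb for the backends.

#include <stdint.h>
#include <stdio.h>

/* Illustrative counters only; the patch's real field names may differ. */
typedef struct BufWriteStats
{
	uint64_t	buffers_alloc;		/* buffers handed out to backends */
	uint64_t	buffers_backend;	/* dirty buffers a backend wrote itself */
} BufWriteStats;

int
main(void)
{
	BufWriteStats stats = {120000, 9000};

	/* churn: allocations per stats interval */
	printf("buffers allocated: %llu\n",
		   (unsigned long long) stats.buffers_alloc);
	/* how often a backend had to do a write the bgwriter should have done */
	printf("written by backends: %.1f%% of allocations\n",
		   100.0 * (double) stats.buffers_backend /
		   (double) stats.buffers_alloc);
	return 0;
}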
I split this out because I think it's very useful to have regardless of
whether the automatic tuning portion is accepted, and I think these
smaller patches make the review easier. The main thing I would
recommend someone check is how am_bg_writer is (mis?)used here. I
spliced some of the debugging-only code from the original patch, and I
can't tell if the result is a robust enough approach to solving the
problem of having every client indirectly report their activity to the
background writer. Other than that, I think this code is ready for
review and potentially committing.
This looks good to me in principle. StrategyReportWrite increments
numClientWrites without holding the BufFreeListLock, which is a race
condition. The terminology needs some adjustment; clients don't write
buffers, backends do.
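For reference, the straightforward fix is to take the lock around the increment. A rough sketch of that pattern follows; only the lock and counter names come from the patch under discussion, the surrounding code and signature are illustrative:

#include "postgres.h"
#include "storage/buf_internals.h"
#include "storage/lwlock.h"

/*
 * Sketch of the fix being asked for, not the patch's actual code:
 * bump the counter only while holding BufFreeListLock, the lock that
 * already protects the rest of the buffer-strategy shared state.
 */
void
StrategyReportWrite(void)
{
	LWLockAcquire(BufFreeListLock, LW_EXCLUSIVE);
	StrategyControl->numClientWrites++;
	LWLockRelease(BufFreeListLock);
}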
Splitting the patch to two is a good idea.
The second patch (limit-lru) adds on top of that a constraint on the LRU
writer so that it doesn't do any more work than it has to. Note that I
left verbose debugging code in here because I'm much less confident this
patch is complete.
It predicts upcoming buffer allocations using a 16-period weighted
moving average of recent activity, which you can think of as the last
3.2 seconds at the default interval. After testing on a few systems, that
seemed a decent compromise, giving some smoothing in both directions. I
found the 2X overallocation fudge factor of the original patch way too
aggressive, so instead I just pick the larger of the most recent allocation
amount and the smoothed value. The main thing that throws off the allocation
estimation is when you hit a checkpoint, which can give a big spike
after the background writer returns to BgBufferSync and notices all the
buffers that were allocated during the checkpoint write; the code then
tries to find more buffers it can recycle than it needs to. Since the
checkpoint itself normally leaves a large wake of reusable buffers
behind it, I didn't find this to be a serious problem.
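As a rough sketch of the estimation being described (names and structure here are illustrative, not the patch's code): each bgwriter cycle folds the latest allocation count into a running average with a weight of 1/16, then scans for the larger of the raw and smoothed values.

/*
 * Illustrative sketch of the allocation estimate: a 16-period weighted
 * moving average, with the most recent raw count used instead whenever
 * it is higher.  Not the patch's actual code.
 */
#define ALLOC_SMOOTHING_SAMPLES 16

static float smoothed_alloc = 0;

static int
EstimateUpcomingAllocs(int recent_alloc)
{
	/* fold the newest sample in with weight 1/16 */
	smoothed_alloc += ((float) recent_alloc - smoothed_alloc) /
		ALLOC_SMOOTHING_SAMPLES;

	/* never scan for less than what was just allocated */
	if ((float) recent_alloc > smoothed_alloc)
		return recent_alloc;

	return (int) smoothed_alloc;
}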
Can you tell us more about the tests you performed? That algorithm seems
decent, but I wonder why the simple fudge factor wasn't good enough? I
would've thought that a 2x or even bigger fudge factor would still be
only a tiny fraction of shared_buffers, and wouldn't really affect
performance.
The load distributed checkpoint patch should mitigate the checkpoint
spike problem by continuing the LRU scan throughout the checkpoint.
There's another communication issue here, which is that SyncOneBuffer
needs to return more information about the buffer than it currently does
once it has the buffer locked. To tune itself, the background writer needs
to know more than just whether the buffer was written. The original patch used a clever
trick for this which worked but I found confusing. I happen to have a
bunch of other background writer tuning code I'm working on, and I had
to come up with a more robust way to communicate buffer internals back
via this channel. I used that code here; it's a bitmask setup similar
to how flags like BM_DIRTY are used. It's overkill for solving this
particular problem, but I think the interface is clean and it helps
support future enhancements in intelligent background writing.
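A sketch of the bitmask idea (flag names invented for illustration; the patch's own names and the exact SyncOneBuffer signature may differ): SyncOneBuffer packs several facts about the buffer it just examined into one int, and the background writer masks out whichever bits it cares about.

/* Illustrative flag values, in the style of the BM_DIRTY family. */
#define BUF_WRITTEN		0x01	/* buffer was dirty; we wrote it out */
#define BUF_REUSABLE	0x02	/* buffer was unpinned and not recently used */

/* Caller side, e.g. in the LRU scan loop of the background writer: */
int		sync_state = SyncOneBuffer(buf_id, true);

if (sync_state & BUF_WRITTEN)
	num_written++;
if (sync_state & BUF_REUSABLE)
	reusable_buffers++;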
Uh, that looks pretty ugly to me. The normal way to return multiple
values is to pass a pointer as an argument, though that can get ugly as
well if there are a lot of return values. What combinations of the flags
are valid? Would an enum be better? Or how about moving the checks for
dirty and pinned buffers from SyncOneBuffer to the callers?
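The out-parameter alternative would look roughly like this (hypothetical struct and field names, just to show the shape of the interface being suggested):

/* Hypothetical out-parameter carrying the extra buffer state. */
typedef struct BufSyncInfo
{
	bool	was_dirty;		/* buffer was dirty when we got to it */
	bool	was_pinned;		/* buffer was pinned, so not reusable */
} BufSyncInfo;

static bool SyncOneBuffer(int buf_id, bool skip_pinned, BufSyncInfo *info);

/* Caller side: */
BufSyncInfo info;

if (SyncOneBuffer(buf_id, false, &info))
	num_written++;
if (!info.was_dirty && !info.was_pinned)
	reusable_buffers++;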
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com