On Tue, Feb 28, 2012 at 9:49 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Feb 28, 2012 at 11:46 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>
>> This is an interesting hypothesis which I think we can test.  I'm
>> thinking of writing a quick patch (just for testing, not for commit)
>> to set a new buffer flag BM_BGWRITER_CLEANED to every buffer the
>> background writer cleans.   Then we can keep a count of how often such
>> buffers are dirtied before they're evicted, vs. how often they're
>> evicted before they're dirtied.  If any significant percentage of them
>> are redirtied before they're evicted, that would confirm this
>> hypothesis.  At any rate I think the numbers would be interesting to
>> see.
>
> Patch attached.
>
...
> That doesn't look bad at all.  Then I reset the stats, tried it again,
> and got this:
>
> LOG:  bgwriter_clean: 3863 evict-before-dirty, 198 dirty-before-evict
> LOG:  bgwriter_clean: 3861 evict-before-dirty, 199 dirty-before-evict
> LOG:  bgwriter_clean: 3978 evict-before-dirty, 218 dirty-before-evict
> LOG:  bgwriter_clean: 3928 evict-before-dirty, 204 dirty-before-evict
> LOG:  bgwriter_clean: 3956 evict-before-dirty, 207 dirty-before-evict
> LOG:  bgwriter_clean: 3906 evict-before-dirty, 222 dirty-before-evict
> LOG:  bgwriter_clean: 3912 evict-before-dirty, 197 dirty-before-evict
> LOG:  bgwriter_clean: 3853 evict-before-dirty, 200 dirty-before-evict
>
> OK, that's not so good, but I don't know why it's different.

I don't think resetting the stats has anything to do with it; it is
just that shared_buffers warmed up over time.
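
For scale, the quoted log lines work out to roughly 5% of
bgwriter-cleaned buffers being redirtied before eviction.  A quick
sketch of that arithmetic, assuming the log format shown above and
treating each line as an independent sample (that second part is an
assumption about the test patch, not something stated in the thread):

```python
import re

# Log lines copied from the quoted output of the test patch.
LOG = """\
LOG:  bgwriter_clean: 3863 evict-before-dirty, 198 dirty-before-evict
LOG:  bgwriter_clean: 3861 evict-before-dirty, 199 dirty-before-evict
LOG:  bgwriter_clean: 3978 evict-before-dirty, 218 dirty-before-evict
LOG:  bgwriter_clean: 3928 evict-before-dirty, 204 dirty-before-evict
LOG:  bgwriter_clean: 3956 evict-before-dirty, 207 dirty-before-evict
LOG:  bgwriter_clean: 3906 evict-before-dirty, 222 dirty-before-evict
LOG:  bgwriter_clean: 3912 evict-before-dirty, 197 dirty-before-evict
LOG:  bgwriter_clean: 3853 evict-before-dirty, 200 dirty-before-evict
"""

PATTERN = re.compile(r"(\d+) evict-before-dirty, (\d+) dirty-before-evict")


def redirty_fraction(text):
    """Fraction of bgwriter-cleaned buffers dirtied again before eviction."""
    evicted = dirtied = 0
    for match in PATTERN.finditer(text):
        evicted += int(match.group(1))
        dirtied += int(match.group(2))
    total = evicted + dirtied
    return dirtied / total if total else 0.0


fraction = redirty_fraction(LOG)
print(f"{fraction:.1%} redirtied before eviction")
```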

In my testing, this dirty-before-evict happens because the bgwriter is
riding too far ahead of the clock sweep, due to
scan_whole_pool_milliseconds.  Because it is so far ahead, there is a
lot of room between the two pointers for re-dirtying cache hits to
land.
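
A toy model of that effect, as a sketch only.  It assumes a ring of
buffers, a sweep that evicts one buffer per step, a bgwriter that
cleans the buffer exactly `lead` positions ahead of the sweep, and
uniformly random dirtying hits.  It ignores usagecount and pins
entirely, and every name in it is made up for illustration:

```python
import random


def simulate(n_buffers, lead, steps, hits_per_step=1, seed=0):
    """Return (evict_before_dirty, dirty_before_evict) for a toy ring.

    `lead` is how many buffers the (hypothetical) bgwriter pointer sits
    ahead of the clock-sweep pointer; it must be smaller than n_buffers.
    """
    rng = random.Random(seed)
    # True while a buffer has been cleaned and not yet evicted/redirtied.
    cleaned = [False] * n_buffers
    sweep = 0
    evict_before_dirty = 0
    dirty_before_evict = 0
    for _ in range(steps):
        # The bgwriter cleans the buffer `lead` positions ahead.
        cleaned[(sweep + lead) % n_buffers] = True
        # Cache hits dirty random buffers, possibly freshly cleaned ones.
        for _ in range(hits_per_step):
            victim = rng.randrange(n_buffers)
            if cleaned[victim]:
                cleaned[victim] = False
                dirty_before_evict += 1
        # The sweep evicts the buffer under its pointer.
        if cleaned[sweep]:
            cleaned[sweep] = False
            evict_before_dirty += 1
        sweep = (sweep + 1) % n_buffers
    return evict_before_dirty, dirty_before_evict


near = simulate(n_buffers=1000, lead=10, steps=20000)
far = simulate(n_buffers=1000, lead=900, steps=20000)
print("lead=10: ", near)
print("lead=900:", far)
```

In this model the only thing that changes between the two runs is the
distance between the pointers, and the longer a cleaned buffer waits
for the sweep, the more chances a hit has to dirty it again.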

Not only is 2 minutes likely to be too small a value for large
shared_buffers, but min_scan_buffers doesn't live up to its name.  It
is not the minimum number of buffers to scan, it is the minimum number
to find/make reusable.  If lots of buffers have a nonzero usagecount
(and if your data doesn't fit in shared_buffers, it is hard to see how
more than half of the buffers can have zero usagecount) or are pinned,
you are scanning a lot more than min_scan_buffers.
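
To illustrate the difference between "buffers scanned" and "buffers
found reusable", here is a minimal sketch.  It is not the bufmgr.c
code; the function name and the list-based buffer pool are
hypothetical, and "reusable" is simplified to "usagecount is zero and
the buffer is not pinned":

```python
def buffers_scanned(usage_counts, pinned, target_reusable, start=0):
    """Count buffers visited until `target_reusable` reusable ones are seen.

    Stops after one full pass over the pool even if the target is not met.
    """
    n = len(usage_counts)
    found = 0
    scanned = 0
    pos = start
    while found < target_reusable and scanned < n:
        if usage_counts[pos] == 0 and not pinned[pos]:
            found += 1
        scanned += 1
        pos = (pos + 1) % n
    return scanned


# Hypothetical pool: every other buffer has a nonzero usagecount.
pool_size = 1000
usage = [i % 2 for i in range(pool_size)]
pins = [False] * pool_size

scanned = buffers_scanned(usage, pins, target_reusable=100)
print(scanned)  # 199: nearly twice the 100-buffer target
```

With only one buffer in four reusable, the same target costs nearly
four times as many buffers scanned.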

If I disable that, then the bgwriter remains "just in time", just
slightly ahead of the clock-sweep, and the dirty-before-evict drops a
lot.

If scan_whole_pool_milliseconds is to be used at all, it seems like it
should not be less than checkpoint_timeout.  If I don't want
checkpoints trashing my IO, why would I want someone else to do it
instead?
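
Some rough arithmetic on what the two time scales imply per bgwriter
round.  This sketch assumes the per-round minimum is computed as
NBuffers / (scan_whole_pool_milliseconds / bgwriter_delay), which is
an assumption about BgBufferSync to be checked against the source.
The 8GB pool is a made-up example; 200ms and 5 minutes are the default
bgwriter_delay and checkpoint_timeout:

```python
def min_buffers_per_round(nbuffers, bgwriter_delay_ms, whole_pool_ms):
    """Per-round minimum implied by covering the pool in `whole_pool_ms`."""
    rounds_per_pass = whole_pool_ms / bgwriter_delay_ms
    return int(nbuffers / rounds_per_pass)


NBUFFERS = 8 * 1024 * 1024 * 1024 // 8192   # 8GB of 8kB buffers (example)
BGWRITER_DELAY_MS = 200                     # default bgwriter_delay
TWO_MINUTES_MS = 120000.0                   # current hard-coded value
CHECKPOINT_TIMEOUT_MS = 300000.0            # default checkpoint_timeout

current = min_buffers_per_round(NBUFFERS, BGWRITER_DELAY_MS, TWO_MINUTES_MS)
relaxed = min_buffers_per_round(NBUFFERS, BGWRITER_DELAY_MS,
                                CHECKPOINT_TIMEOUT_MS)
print(current, relaxed)  # 1747 699
```

Stretching the pass to checkpoint_timeout cuts the per-round minimum
by the ratio of the two intervals, 2.5x in this example.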

Cheers,

Jeff
