On Thu, 6 Sep 2007, Decibel! wrote:
> I don't know that there should be a direct correlation, but ISTM that scan_whole_pool_seconds should take checkpoint intervals into account somehow.
Any direct correlation is weak at this point. The LRU cleaner has a small impact on checkpoints, in that it's writing out buffers that may make the checkpoint quicker. But this particular write trickling mechanism is not aimed directly at flushing the whole pool; it's more about smoothing out idle periods a bit.
Also, computing the checkpoint interval is itself tricky. Heikki had to put some work into getting something that took into account both the timeout and segments mechanisms to gauge progress, and I'm not sure I can directly re-use that because it's really only doing that while the checkpoint is active. I'm not saying it's a bad idea to have the expected interval as an input to the model, just that it's not obvious to me how to do it and whether it would really help.
> I like the idea of not having that as a GUC, but I'm doubtful that it can be hard-coded like that. What if checkpoint_timeout is set to 120? Or 60? Or 2000?
Someone using 60 or 120 has checkpoint problems way bigger than the LRU cleaner can be expected to help with. How quickly it pushes out the reusable buffers it is able to write is the least of their problems. Also, I'd expect that the only cases using such a low value for a good reason are doing so because they have enormous amounts of activity on their system, and in that case the primary JIT mechanism should dominate how the LRU cleaner treats them. scan_whole_pool_seconds doesn't do anything if the primary mechanism was already planning to scan more buffers than it aims for.
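To illustrate that last point, here is a minimal Python sketch (not the actual bgwriter code) of a minimum scan rate acting as a floor under the JIT estimate. The function name, its parameters, and the pool size in the test values are hypothetical; the 200 ms delay is assumed to be the bgwriter_delay in effect, and only the 2 minute target comes from this message.

```python
def buffers_to_scan(jit_estimate, pool_size,
                    scan_whole_pool_seconds=120.0,
                    bgwriter_delay_ms=200.0):
    """Return how many buffers one cleaner round would scan.

    jit_estimate: buffers the primary (JIT) mechanism already plans to scan.
    pool_size:    total number of shared buffers (hypothetical input).
    """
    # Number of cleaner rounds available to cover the whole pool once.
    rounds_per_lap = scan_whole_pool_seconds * 1000.0 / bgwriter_delay_ms
    # Floor: scan at least this many buffers per round to finish a lap in time.
    min_scan = int(pool_size / rounds_per_lap)
    # The floor only matters when the JIT estimate is below it.
    return max(jit_estimate, min_scan)
```

On a busy system the JIT estimate is well above the floor, so the floor has no effect; on an idle one the floor is what keeps the scan moving.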
Someone who has very infrequent checkpoints and therefore low activity, like your 2000 case, can expect that, with the way I've set this up, the LRU cleaner will lap and catch up to the strategy point about 2 minutes after any activity and then follow directly behind it. If that's cleaning the buffer cache too aggressively, I think those in that situation would be better served by constraining the maxpages parameter; that directly adjusts what I'd expect their real issue to be, how fast pages can flush to disk, rather than the secondary one of how fast the pool is being scanned.
I picked 2 minutes for that value because it's as slow as I can make it and still serve its purpose, while not feeling to me like it's too fast for a relatively idle system even if someone set maxpages=1000.
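As a rough back-of-the-envelope check on that choice, the following Python sketch estimates the most an otherwise idle system could be asked to write while the cleaner laps the pool. It assumes the worst case where every scanned buffer is dirty and reusable, a 200 ms bgwriter_delay, and 8 kB pages; the function name and the pool sizes used with it are hypothetical.

```python
def idle_write_ceiling(pool_size, maxpages,
                       scan_whole_pool_seconds=120.0,
                       bgwriter_delay_ms=200.0,
                       page_bytes=8192):
    """Worst-case write rate on an idle system, as (pages/s, MiB/s)."""
    rounds_per_second = 1000.0 / bgwriter_delay_ms
    rounds_per_lap = scan_whole_pool_seconds * rounds_per_second
    # Buffers examined per round when only the minimum scan rate applies.
    scanned_per_round = pool_size / rounds_per_lap
    # Writes per round can never exceed the maxpages cap.
    written_per_round = min(scanned_per_round, maxpages)
    pages_per_second = written_per_round * rounds_per_second
    mib_per_second = pages_per_second * page_bytes / (1024 * 1024)
    return pages_per_second, mib_per_second
```

Under these assumptions a 1 GB pool (131072 buffers) works out to about 1092 pages/s, roughly 8.5 MiB/s, even with maxpages=1000, because the 2 minute lap rather than maxpages is the limit. The cap would only start to bind on pools larger than 600,000 buffers.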
--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD