On Sun, Jul 14, 2013 at 3:13 PM, Greg Smith <g...@2ndquadrant.com> wrote:
> Accordingly, the current behavior--no delay--is already the best possible
> throughput. If you apply a write timing change and it seems to increase
> TPS, that's almost certainly because it executed less checkpoint writes.
> It's not a fair comparison. You have to adjust any delaying to still hit
> the same end point on the checkpoint schedule. That's what my later
> submissions did, and under that sort of controlled condition most of the
> improvements went away.

This is all valid logic, but I don't think it makes the patch a bad idea. What KONDO Mitsumasa is proposing (or proposed at one point, upthread) is that when an fsync takes a long time, we should wait before issuing the next fsync, and the delay should be proportional to how long the previous fsync took.

On a system that's behaving well, where fsyncs are always fast, that's going to make very little difference. On a system where fsync is sometimes very, very slow, that might result in the checkpoint overrunning its time budget - but SO WHAT? I mean, yes, we want checkpoints to complete in the time specified, but if the I/O system is completely flogged, I suspect most people would prefer to overrun the checkpoint's time budget rather than have all foreground activity grind to a halt until the checkpoint finishes. As I'm pretty sure you've pointed out in the past, when this situation develops, the checkpoint may be doomed to overrun whether we like it or not.

We should view this as an emergency pressure release valve; if we think not everyone will want it, then make it a GUC.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
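
For readers following along, here is a rough Python sketch of the delay rule described above: time each fsync, and if it was slow, sleep for a time proportional to its duration before issuing the next one. This is not the patch and not PostgreSQL code. The function name, the `delay_factor`, `slow_threshold_s` and `max_delay_s` parameters, and their default values are all hypothetical, chosen only to illustrate the idea; the cap on the delay in particular is an assumption added here, not something stated in the thread.

```python
import os
import time


def fsync_with_proportional_delay(
    fds,
    delay_factor=1.0,        # hypothetical knob: delay = factor * last fsync time
    slow_threshold_s=0.1,    # hypothetical: fsyncs faster than this add no delay
    max_delay_s=5.0,         # hypothetical cap so one stall cannot sleep forever
    fsync=os.fsync,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """fsync each descriptor in turn, pausing after slow ones.

    Returns the list of delays (in seconds) applied after each fsync.
    The fsync, sleep and clock callables are injectable for testing.
    """
    fds = list(fds)
    delays = []
    for index, fd in enumerate(fds):
        start = clock()
        fsync(fd)
        elapsed = clock() - start

        delay = 0.0
        is_last = index == len(fds) - 1
        # Only back off when the previous fsync was slow, and never
        # after the final one, since nothing is left to protect.
        if elapsed > slow_threshold_s and not is_last:
            delay = min(elapsed * delay_factor, max_delay_s)
            sleep(delay)
        delays.append(delay)
    return delays
```

On a healthy system every fsync stays under the threshold and the loop behaves like the current no-delay code. Under heavy I/O load the sleeps lengthen the sync phase, which is exactly the checkpoint overrun being argued for here; setting `delay_factor` to 0 stands in for switching the behavior off, as a GUC would.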