On Wed, Sep 9, 2015 at 10:35 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> ... How often such a workload actually has to replace a *dirty* clog
>> buffer obviously depends on how often you checkpoint, but if you're
>> getting ~28k TPS you can completely fill 32 clog buffers (1 million
>> transactions) in less than 40 seconds, and you're probably not
>> checkpointing nearly that often.
>
> But by the same token, at that kind of transaction rate, no clog page is
> actively getting dirtied for more than a couple of seconds.  So while it
> might get swapped in and out of the SLRU arena pretty often after that,
> this scenario seems unconvincing as a source of repeated fsyncs.
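
To spell out the arithmetic behind those numbers (assuming clog's usual
2 status bits per transaction, i.e. 32768 transactions per 8 kB page):

    32 pages * 32768 xacts/page  =  1,048,576 xacts   (~1 million)
    1,048,576 xacts / 28,000 TPS ~= 37 seconds        (to cycle all 32 buffers)
    32,768 xacts / 28,000 TPS    ~= 1.2 seconds       (to fill one page)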
Well, if you're filling ~1 clog page per second, you're doing ~1 fsync
per second too.  Or if you are not, then you are thrashing a
progressively smaller number of clean slots ever harder, until no clean
pages remain and you're forced to fsync something - probably a bunch of
things all at once.

> Like Andres, I'd want to see a more realistic problem case before
> expending a lot of work here.

I think the question here isn't whether the problem case is realistic -
I am quite sure that a pgbench workload is - but rather how much of a
problem it actually causes.  I'm very sure that individual pgbench
backends experience multi-second stalls as a result of this.  What I'm
not sure about is how frequently it happens, and how much of an effect
it has on overall latency.  I think it would be worth someone's time to
write some good instrumentation code here and figure that out.
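
As a strawman, the kind of thing I mean is sketched below - untested,
and the names are only assumptions: it supposes the pg_fsync() call in
slru.c's SlruPhysicalWritePage() is wrapped with instr_time timing, with
"fd" and "path" being the file descriptor and file name already in scope
there, and with portability/instr_time.h available:

    /*
     * Hypothetical sketch only: time the SLRU fsync and log slow ones.
     */
    instr_time  start,
                elapsed;
    int         rc;

    INSTR_TIME_SET_CURRENT(start);
    rc = pg_fsync(fd);
    INSTR_TIME_SET_CURRENT(elapsed);
    INSTR_TIME_SUBTRACT(elapsed, start);

    /* Log any SLRU fsync that stalls for 10 ms or more. */
    if (INSTR_TIME_GET_MILLISEC(elapsed) >= 10.0)
        elog(LOG, "fsync of clog file \"%s\" took %.3f ms",
             path, INSTR_TIME_GET_MILLISEC(elapsed));

    /* rc then feeds into the existing error handling */

Aggregating those log lines over a pgbench run would give both the
frequency of the stalls and their tail latency.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company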