On 2016-03-30 15:50:21 -0400, Robert Haas wrote:
> On Thu, Mar 10, 2016 at 8:29 PM, Andres Freund <and...@anarazel.de> wrote:
> > Allow to trigger kernel writeback after a configurable number of writes.
>
> While testing out Dilip Kumar's relation extension patch today, I
> discovered (with some help from Andres) that this causes nasty
> regressions when doing parallel COPY on hydra (3.2.6-3.fc16.ppc64,
> lousy disk subsystem). What I did was (1) run pgbench -i -s 100, (2)
> copy the results to a file, (3) truncate the original table and drop
> its indexes, and (4) try copying in one or more copies of the data
> from the file. Typical command line:
>
> time pgbench -n -f f -t 1 -c 4 -j 4 && psql -c "select
> pg_size_pretty(pg_relation_size('pgbench_accounts'));" && time psql -c
> checkpoint && psql -c "truncate pgbench_accounts; checkpoint;"
>
> With default settings against 96f8373cad5d6066baeb7a1c5a88f6f5c9661974,
> pgbench takes 9 to 9.5 minutes and the subsequent checkpoint takes 9
> seconds. After setting backend_flush_after=0, bgwriter_flush_after=0,
> it takes 1 minute and 11 seconds and the subsequent checkpoint takes
> 11 seconds. With a single copy of the data (that is, -c 1 -j 1 but
> otherwise as above), it takes 28-29 seconds with default settings and
> 26-27 seconds with backend_flush_after=0, bgwriter_flush_after=0. So
> the difference is rather small with a straight-up COPY, but with 4
> copies running at the same time, it's near enough to an order of
> magnitude.
>
> Andres reports that on his machine, non-zero *_flush_after settings
> make things faster, not slower, so apparently this is hardware- or
> kernel-dependent. Nevertheless, it seems to me that we should try to
> get some broader testing here to see which experience is typical.
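For readers reproducing the comparison above: the second configuration simply disables the forced-writeback GUCs that the commit introduced. A minimal postgresql.conf sketch of that configuration (the setting names and zero values are taken from the test description; the comments are my gloss):

```ini
# "Flushing off" configuration from the test above.  A value of 0
# disables the feature; non-zero values are the amount of dirty data
# written before the kernel is asked to start writeback.
backend_flush_after = 0
bgwriter_flush_after = 0
```

These can also be changed without a restart via ALTER SYSTEM plus a configuration reload, which makes A/B testing like the above convenient.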
Indeed. On SSDs I see about a 25-35% gain, on HDDs about 5%. If I
increase backend_flush_after to 64 (as it is for bgwriter), however, I
do get about 15% on HDDs as well. I wonder if the default value of
backend_flush_after is too small for some scenarios.

I'd reasoned that backend_flush_after should have a *lower* default
value than e.g. checkpointer or bgwriter, because there are many
concurrent writers increasing the total amount of unflushed dirty
writes. That is true for OLTP write workloads, but less so for bulk
loads.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
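A sketch of the HDD experiment described above, in postgresql.conf syntax. These GUCs are measured in database pages of BLCKSZ bytes (8kB in a default build), so 64 corresponds to 512kB of dirty data per backend before writeback is requested; the exact 512kB equivalence assumes the default block size:

```ini
# Raise backend_flush_after from its small default to 64 pages
# (64 * 8kB = 512kB with the default BLCKSZ), matching the bgwriter
# value mentioned above; this recovered ~15% on HDDs in my tests.
backend_flush_after = 64    # pages; equivalently 512kB
```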