Should io_method=worker remain the default?

Jeff Davis Tue, 02 Sep 2025 23:54:06 -0700


Has there already been a discussion about leaving the default as
io_method=worker? There was an Open Item for this, which was closed as
"Won't Fix", but the links don't explain why as far as I can see.


I tested a concurrent scan-heavy workload (see below) where the data
fits in memory, and "worker" seems to be about 30% slower than "sync"
with default settings (23.9 vs. 34.0 tps).

I'm not suggesting that AIO overall is slow -- on the contrary, I'm
excited about AIO. But if it regresses in some cases, we should make a
conscious choice about the default and what kind of tuning advice needs
to be offered.

I briefly tried tuning io_workers to see if a different value would
solve the problem, but no luck: raising it from the default of 3 to 16
made throughput worse.

The good news is that io_uring seemed to solve the problem.
Unfortunately, that's platform-specific, so it can't be the default. I
didn't dig in very much, but it seemed to be at least as good as "sync"
mode for this workload.

Regards,
        Jeff Davis



Test summary: 32 connections each perform repeated sequential scans.
Each connection scans a different 1GB partition of the same table. I
used partitioning and a predicate to make it easier to script in
pgbench.


Test details:

Machine:
   AMD Ryzen 9 9950X 16-Core Processor
   64GB RAM
   Local storage, NVMe SSD
   Ubuntu 24.04 (Linux 6.11, liburing 2.5)

Note: the storage didn't matter much, because the data fits in memory.
To get consistent results when switching between the data directories
for the 17 and 18 tests, I first had to drop the filesystem cache to
make room, then run a few scans to warm it with the data from the right
data directory.

For simplicity I disabled parallel query, but that didn't seem to have
a big effect. Everything else was set to the default.
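
A minimal sketch of how the io_method / io_workers combinations compared
below could be switched between runs. This is an illustration, not
necessarily how the test was driven: io_settings_sql is a hypothetical
helper, and $PGDATA is assumed to point at the data directory under test.

```shell
# Sketch: emit the ALTER SYSTEM statements for one test configuration.
# io_settings_sql is a hypothetical helper, not part of PostgreSQL.
io_settings_sql() {
    method="$1"          # sync, worker, or io_uring
    workers="${2:-3}"    # io_workers; 3 is the default
    case "$method" in
        sync|worker|io_uring) ;;
        *) echo "unknown io_method: $method" >&2; return 1 ;;
    esac
    printf "ALTER SYSTEM SET io_method = '%s';\n" "$method"
    # io_workers only matters when io_method=worker.
    if [ "$method" = worker ]; then
        printf "ALTER SYSTEM SET io_workers = %s;\n" "$workers"
    fi
}

# Preview the statements for the io_workers=16 run.
io_settings_sql worker 16

# To apply them (assumes $PGDATA is set; a changed io_method only
# takes effect after a server restart):
#   io_settings_sql worker 16 | ./bin/psql postgres
#   ./bin/pg_ctl -D "$PGDATA" restart
```

Printing the statements first makes it easy to check what will be
applied before piping them into psql.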

Setup (checksums enabled):

   => create table t(sid int8, c0 int8, c1 int8, c2 int8, c3 int8,
                     c4 int8, c5 int8, c6 int8, c7 int8)
        partition by range (sid);

   $ (for i in `seq 0 31`; do
         echo "create table t$(printf "%02d" $i) partition of t" \
              "for values from ($i) to ($((i+1)));";
      done) | ./bin/psql postgres
   $ (for i in `seq 0 31`; do
         echo "insert into t$(printf "%02d" $i)" \
              "select $i, 0, 1, 2, 3, 4, 5, 6, 7" \
              "from generate_series(0, 10000000);";
      done) | ./bin/psql postgres

   => vacuum analyze; checkpoint;
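
As a rough sanity check on the "1GB partition" figure, the heap size of
one partition can be estimated from the setup above. This is only a
back-of-the-envelope sketch. It assumes the default 8kB block size, a
24-byte page header, a 24-byte tuple header after alignment, a 4-byte
line pointer per row, and no NULLs; the real size can be checked with
pg_relation_size().

```shell
# Estimated heap size of one partition (all constants are assumptions
# about the default on-disk layout, not measured values).
rows=10000001                      # generate_series(0, 10000000) is inclusive
row_bytes=$((24 + 9 * 8 + 4))      # tuple header + nine int8 columns + line pointer
rows_per_page=$(( (8192 - 24) / row_bytes ))
pages=$(( (rows + rows_per_page - 1) / rows_per_page ))   # round up
bytes=$(( pages * 8192 ))

echo "rows per page: $rows_per_page"
echo "pages:         $pages"
echo "size:          $((bytes / 1048576)) MiB"
```

Under these assumptions each partition comes to a little under 1GiB,
so the 32 partitions together fit comfortably in 64GB of RAM.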

Script count.sql:

  SELECT COUNT(*) FROM t WHERE sid=:client_id;

pgbench:

  ./bin/pgbench --dbname=postgres -M prepared -n -c 32 -T 60 \
     -f count.sql

Results:

PG17:
  tps = 36.209048

PG18 (io_method=sync)
  tps = 34.014890

PG18 (io_method=worker io_workers=3)
  tps = 23.938509

PG18 (io_method=worker io_workers=16)
  tps = 16.734360

PG18 (io_method=io_uring)
  tps = 35.546825
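
The relative differences can be recomputed from the tps figures above.
A small sketch, taking the PG18 io_method=sync run as the baseline (the
choice of baseline and the pct_vs_sync helper are illustrative, not
part of the original test):

```shell
# Baseline: PG18 with io_method=sync, from the results above.
SYNC_TPS=34.014890

# pct_vs_sync TPS: print the percentage change of TPS relative to SYNC_TPS.
pct_vs_sync() {
    awk -v tps="$1" -v base="$SYNC_TPS" \
        'BEGIN { printf "%+.1f\n", (tps - base) / base * 100 }'
}

# Label and tps for each of the other runs, copied from the results.
while read -r label tps; do
    printf '%-22s %s%%\n' "$label" "$(pct_vs_sync "$tps")"
done <<EOF
PG17 36.209048
worker,io_workers=3 23.938509
worker,io_workers=16 16.734360
io_uring 35.546825
EOF
```

With these numbers, worker comes out roughly 30% below sync at
io_workers=3 and roughly 51% below at io_workers=16, while io_uring
and PG17 are both a few percent above it.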
