Hi,

On Wed, Mar 11, 2026 at 7:28 AM Andres Freund <[email protected]> wrote:
>
> Hi,
>
> On 2026-03-10 21:23:26 +0800, Xuneng Zhou wrote:
> > On Tue, Mar 10, 2026 at 6:28 PM Michael Paquier <[email protected]> wrote:
> > Thanks for running the benchmarks! The performance gains for hash,
> > gin, bloom_vacuum, and wal_logging are insignificant, likely because
> > these workloads are not I/O-bound. The default number of I/O workers
> > is three, which is fairly conservative. When I ran the benchmark
> > script with a higher number of I/O workers, some runs showed improved
> > performance.
>
> FWIW, another thing that may be an issue is that you're restarting postgres
> all the time, as part of drop_caches(). That means we'll spend time reloading
> catalog metadata and initializing shared buffers (the first write to a shared
> buffers page is considerably more expensive than later ones, as the backing
> memory needs to be initialized first).
>
> I found it useful to use the pg_buffercache extension (specifically
> pg_buffercache_evict_relation()) to just drop the relation that is going to be
> tested from shared_buffers.
Good point. I'll switch to using pg_buffercache_evict_relation() to
evict only the target relation, keeping the cluster running. That
should reduce measurement noise to some extent.

> > > pgstattuple_large base= 12429.3ms patch= 11916.8ms 1.04x ( 4.1%) (reads=206945->12983, io_time=6501.91->32.24ms)
> >
> > > pgstattuple_large base= 12642.9ms patch= 11873.5ms 1.06x ( 6.1%) (reads=206945->12983, io_time=6516.70->143.46ms)
> >
> > Yeah, this looks somewhat strange. The io_time has been reduced
> > significantly, which should also lead to a substantial reduction in
> > runtime.
>
> It's possible that the bottleneck just moved, e.g. to the checksum computation,
> if you have data checksums enabled.
>
> It's also worth noting that likely each of the test reps measures
> something different, as likely
>   psql_run "$ROOT" "$PORT" -c "UPDATE heap_test SET data = data || '!' WHERE id % 5 = 0;"
> leads to some out-of-page updates.
>
> You're probably better off deleting some of the data in a transaction that is
> then rolled back. That will also unset all-visible, but won't otherwise change
> the layout, no matter how many test iterations you run.
>
> I'd also guess that you're seeing a relatively small win because you're
> updating every page. When reading every page from disk, the OS can do
> efficient readahead. If there are only occasional misses, that does not work.

Yeah, the repeated UPDATE changes the table layout across reps. I'll
switch to:

BEGIN;
DELETE FROM heap_test WHERE id % N = 0;
ROLLBACK;

This clears the visibility-map bits without altering the physical
layout, so every rep measures the same table state.

> > method=io_uring
> > pgstattuple_large base= 5551.5ms patch= 3498.2ms 1.59x ( 37.0%) (reads=206945→12983, io_time=2323.49→207.14ms)
> >
> > I ran the benchmark for this test again with io_uring, and the result
> > is consistent with previous runs. I’m not sure what might be
> > contributing to this behavior.
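Putting both suggestions together, the per-rep reset would look
roughly like this (a sketch, not the final script: it assumes
pg_buffercache_evict_relation() from the pg_buffercache extension
in PostgreSQL 18, and uses heap_test with N = 5 to mirror the old
UPDATE's predicate):

```sql
-- Evict only the target relation from shared_buffers, so the
-- cluster keeps running and catalog caches stay warm.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
SELECT * FROM pg_buffercache_evict_relation('heap_test'::regclass);

-- Clear all-visible bits without changing the physical layout:
-- the rolled-back DELETE leaves the same table state for every rep.
BEGIN;
DELETE FROM heap_test WHERE id % 5 = 0;
ROLLBACK;
```

Since the DELETE is rolled back, the on-disk layout is identical
across reps, unlike the earlier out-of-page UPDATEs.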
> What does a perf profile show? Is the query CPU bound?

The runtime in my run of pgstattuple was reduced significantly due to
the reduction in I/O time. I don’t think running perf on my setup
would reveal anything particularly meaningful. The script has an
option to run with perf, so perhaps Michael could try it to see
whether the query becomes CPU-bound, if he’s interested and has time.

> > Another code path that showed significant performance improvement is
> > pgstatindex [1]. I've incorporated the test into the script too. Here
> > are the results from my testing:
> >
> > method=worker io-workers=12
> > pgstatindex_large base= 233.8ms patch= 54.1ms 4.32x ( 76.8%) (reads=27460→1757, io_time=213.94→6.31ms)
> >
> > method=io_uring
> > pgstatindex_large base= 224.2ms patch= 56.4ms 3.98x ( 74.9%) (reads=27460→1757, io_time=204.41→4.88ms)
>
> Nice!
>
> Greetings,
>
> Andres Freund

--
Best,
Xuneng
