Hi,

On 2026-03-10 19:28:29 +0900, Michael Paquier wrote:
> On Tue, Mar 10, 2026 at 02:06:12PM +0800, Xuneng Zhou wrote:
> > Here’s v5 of the patchset. The wal_logging_large patch has been
> > removed, as no performance gains were observed in the benchmark runs.
>
> Looking at the numbers you are posting, it is harder to get excited
> about the hash, gin, bloom_vacuum and wal_logging.
It's perhaps worth emphasizing that, to allow real-world usage of direct
IO, we'll need streaming implementations for most of these. Also, on
Windows the OS-provided readahead is ... not aggressive, so you'll hit
IO stalls much more frequently than you would on Linux (and some of the
BSDs).

It might be a good idea to run the benchmarks with
debug_io_direct=data. That'll make them very slow, since the write side
doesn't yet use AIO and thus will do a lot of synchronous writes, but it
should still allow evaluating the gains from using a read stream.

The other thing that's kinda important when evaluating read streams is
to test on higher-latency storage, even without direct IO. Many
workloads don't benefit from AIO at all when run on a local NVMe SSD
with < 10us latency, but are severely IO bound when run on a cloud
storage disk with 0.5ms - 4ms latency.

To be able to test such higher latencies locally, I've found it quite
useful to use dm-delay on top of a fast disk. See [1].

> The worker method seems more efficient, may show that we are out of
> noise level.

I think that's more likely to show that memory bandwidth, probably due
to checksum computations, is a factor. The memory copy (from the kernel
page cache, with buffered IO) and the checksum computations (when
checksums are enabled) are parallelized by worker, but not by io_uring.

Greetings,

Andres Freund

[1] https://docs.kernel.org/admin-guide/device-mapper/delay.html

Assuming /dev/md0 is mounted to /srv, and a delay of 1ms should be
introduced for it:

umount /srv && \
  dmsetup create delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 1" && \
  mount /dev/mapper/delayed /srv/

To update the amount of delay to 3ms, the following can be used:

dmsetup suspend delayed && \
  dmsetup reload delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 3" && \
  dmsetup resume delayed

(I will often just update the delay to 0 for comparison runs, as that
doesn't require remounting.)
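When sweeping several latencies, the table line passed via --table above
can be generated by a small helper. A sketch, where make_delay_table is
a hypothetical function and the sector count is stubbed for illustration
(in real use it would come from "blockdev --getsz <device>"):

```shell
#!/bin/sh
# Sketch: build the dm-delay table line used by the dmsetup commands
# above. dm-delay's read-path format is:
#   "<start_sector> <num_sectors> delay <device> <offset> <delay_ms>"
# make_delay_table is a hypothetical helper, not part of dmsetup.
make_delay_table() {
    dev=$1; sectors=$2; delay_ms=$3
    printf '0 %s delay %s 0 %s\n' "$sectors" "$dev" "$delay_ms"
}

# Example: a 1ms delay over a device with a stubbed sector count.
make_delay_table /dev/md0 1000000 1
```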
