On Fri, Jan 23, 2026 at 01:15:04PM -0600, JAEHOON KIM wrote:
> On 1/19/2026 12:16 PM, Stefan Hajnoczi wrote:
> > On Tue, Jan 13, 2026 at 11:48:21AM -0600, Jaehoon Kim wrote:
> > > We evaluated the patches on an s390x host with a single guest using 16
> > > virtio block devices backed by FCP multipath devices in a separate-disk
> > > setup, with the I/O scheduler set to 'none' in both host and guest.
> > >
> > > The fio workload included sequential and random read/write with varying
> > > numbers of jobs (1, 4, 8, 16) and an io_depth of 8. The tests were
> > > conducted with single and dual iothreads, using the newly introduced
> > > poll-weight parameter to measure their impact on CPU cost and throughput.
> > >
> > > Compared to the baseline, across four FIO workload patterns (sequential
> > > R/W, random R/W), and averaged over FIO job counts of 1, 4, 8, and 16,
> > > throughput decreased slightly (-3% to -8% for one iothread, -2% to -5%
> > > for two iothreads), while CPU usage on the s390x host dropped
> > > significantly (-10% to -25% and -7% to -12%, respectively).
> >
> > Hi Jaehoon,
> > I would like to run the same fio benchmarks on a local NVMe drive (<10us
> > request latency) to see how that type of hardware configuration is
> > affected. Are the scripts and fio job files available somewhere?
> >
> > Thanks,
> > Stefan
>
> Thank you for your reply.
> The fio scripts are not available in a location you can access, but there
> is nothing particularly special in the settings. I'm sharing below the
> methodology and test setup used by our performance team.
>
> Guest setup
> -----------
> - 12 vCPUs, 4 GiB memory
> - 16 virtio disks backed by the FCP multipath devices in the host
>
> FIO test parameters
> -------------------
> - FIO version: fio-3.33
> - File size: 2G
> - Block size: 8K / 128K
> - Direct I/O: 1
> - FIO I/O engine: libaio
> - NUMJOBS list: 1, 4, 8, 16
> - IODEPTH: 8
> - Runtime (s): 150
>
> Two FIO samples for random read
> -------------------------------
> fio --direct=1 --name=test --numjobs=16 \
>     --filename=base.0.0:base.1.0:base.2.0:base.3.0:base.4.0:base.5.0:base.6.0:base.7.0:base.8.0:base.9.0:base.10.0:base.11.0:base.12.0:base.13.0:base.14.0:base.15.0 \
>     --size=32G --time_based --runtime=4m --readwrite=randread \
>     --ioengine=libaio --iodepth=8 --bs=8k
>
> fio --direct=1 --name=test --numjobs=4 \
>     --filename=subw1/base.0.0:subw4/base.3.0:subw8/base.7.0:subw12/base.11.0:subw16/base.15.0 \
>     --size=8G --time_based --runtime=4m --readwrite=randread \
>     --ioengine=libaio --iodepth=8 --bs=8k
>
> Additional notes
> ----------------
> - Each file is placed on a separate disk device mounted under subw<n>, as
>   specified in --filename=....
> - We execute one warmup run, then two measurement runs, and calculate the
>   average.
Hi Jaehoon,
I ran fio benchmarks on an Intel Optane SSD DC P4800X Series drive (<10
microsecond latency). This is with just 1 drive.

The 8 KiB block size results show something similar to what you reported:
there are IOPS (or throughput) regressions and CPU utilization improvements.
Although the CPU improvements are welcome, I think the default behavior
should only be changed if the IOPS regressions can be brought below 5%.

The regressions seem to happen regardless of whether 1 or 2 IOThreads are
configured. CPU utilization is different (98% vs 78%) depending on the
number of IOThreads, so the regressions happen across a range of CPU
utilizations.

The 128 KiB block size results are not interesting because the drive already
saturates at numjobs=1. This is expected since the drive cannot go much
above ~2 GiB/s throughput.

You can find the Ansible playbook, libvirt domain XML, fio command-lines,
and the fio/sar data here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/aio-polling-efficiency

Please let me know if you'd like me to rerun the benchmark with new patches
or a configuration change. Do you want to have a video call to discuss your
work and how to get the patches merged?

Host
----
CPU: Intel Xeon Silver 4214 CPU @ 2.20GHz
RAM: 32 GiB

Guest
-----
vCPUs: 8
RAM: 4 GiB
Disk: 1 virtio-blk aio=native cache=none

IOPS
----
rw         bs    numjobs  iothreads  iops    diff
randread   8k    1        1          163417  -7.8%
randread   8k    1        2          165041  -2.4%
randread   8k    4        1          221508  -0.64%
randread   8k    4        2          251298  +0.008%
randread   8k    8        1          222128  -0.51%
randread   8k    8        2          249489  -2.6%
randread   8k    16       1          230535  -0.18%
randread   8k    16       2          246732  -0.22%
randread   128k  1        1          17616   -0.11%
randread   128k  1        2          17678   +0.027%
randread   128k  4        1          17536   -0.27%
randread   128k  4        2          17610   -0.031%
randread   128k  8        1          17369   -0.42%
randread   128k  8        2          17433   -0.071%
randread   128k  16       1          17215   -0.61%
randread   128k  16       2          17269   -0.22%
randwrite  8k    1        1          156597  -3.1%
randwrite  8k    1        2          157720  -3.8%
randwrite  8k    4        1          218448  -0.5%
randwrite  8k    4        2          247075  -5.1%
randwrite  8k    8        1          220866  -0.75%
randwrite  8k    8        2          260935  -0.011%
randwrite  8k    16       1          230913  +0.23%
randwrite  8k    16       2          261125  -0.01%
randwrite  128k  1        1          16009   +0.094%
randwrite  128k  1        2          16070   +0.035%
randwrite  128k  4        1          16073   -0.62%
randwrite  128k  4        2          16131   +0.059%
randwrite  128k  8        1          16106   +0.092%
randwrite  128k  8        2          16153   +0.048%
randwrite  128k  16       1          16102   -0.0091%
randwrite  128k  16       2          16160   +0.048%

IOThread CPU usage
------------------
iothreads  before (%)  after (%)
1          98.7        95.81
2          78.43       66.13

Stefan
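
The guest disk line above ("1 virtio-blk aio=native cache=none" with one or
two IOThreads) corresponds roughly to a QEMU invocation like the minimal
sketch below. The authoritative setup is the libvirt domain XML in the
repository linked above; here the device path and the drive/iothread IDs are
placeholder assumptions, while -m 4G and -smp 8 follow the Guest section.

    # Sketch only: placeholder device path and IDs, not the command line
    # actually used (that setup is defined by the linked libvirt domain XML).
    qemu-system-x86_64 -m 4G -smp 8 \
        -object iothread,id=iothread0 \
        -drive file=/dev/nvme0n1,if=none,id=drive0,format=raw,cache=none,aio=native \
        -device virtio-blk-pci,drive=drive0,iothread=iothread0

    # For the two-IOThread runs, a second "-object iothread,id=iothread1" can
    # be defined; recent QEMU versions can also spread one virtio-blk device's
    # virtqueues across several IOThreads via the iothread-vq-mapping property.

Note that aio=native requires O_DIRECT on the image, which is why cache=none
accompanies it in this configuration.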
