On Fri, Jan 23, 2026 at 01:15:04PM -0600, JAEHOON KIM wrote:
> On 1/19/2026 12:16 PM, Stefan Hajnoczi wrote:
> > On Tue, Jan 13, 2026 at 11:48:21AM -0600, Jaehoon Kim wrote:
> > > We evaluated the patches on an s390x host with a single guest using 16
> > > virtio block devices backed by FCP multipath devices in a separate-disk
> > > setup, with the I/O scheduler set to 'none' in both host and guest.
> > >
> > > The fio workload included sequential and random read/write with varying
> > > numbers of jobs (1, 4, 8, 16) and an io_depth of 8. The tests were
> > > conducted with single and dual iothreads, using the newly introduced
> > > poll-weight parameter to measure their impact on CPU cost and throughput.
> > >
> > > Compared to the baseline, across four FIO workload patterns (sequential
> > > R/W, random R/W), and averaged over FIO job counts of 1, 4, 8, and 16,
> > > throughput decreased slightly (-3% to -8% for one iothread, -2% to -5%
> > > for two iothreads), while CPU usage on the s390x host dropped
> > > significantly (-10% to -25% and -7% to -12%, respectively).
> >
> > Hi Jaehoon,
> > I would like to run the same fio benchmarks on a local NVMe drive (<10us
> > request latency) to see how that type of hardware configuration is
> > affected. Are the scripts and fio job files available somewhere?
> >
> > Thanks,
> > Stefan
>
> Thank you for your reply.
> The fio scripts are not available in a location you can access, but there
> is nothing particularly special in the settings. I'm sharing below the
> methodology and test setup used by our performance team.
>
> Guest setup
> -----------
> - 12 vCPUs, 4 GiB memory
> - 16 virtio disks backed by the FCP multipath devices in the host
>
> FIO test parameters
> -------------------
> - FIO version: fio-3.33
> - File size: 2G
> - Block size: 8K / 128K
> - Direct I/O: 1
> - FIO I/O engine: libaio
> - NUMJOBS list: 1, 4, 8, 16
> - IODEPTH: 8
> - Runtime (s): 150
>
> Two FIO samples for random read
> -------------------------------
> fio --direct=1 --name=test --numjobs=16 \
>     --filename=base.0.0:base.1.0:base.2.0:base.3.0:base.4.0:base.5.0:base.6.0:base.7.0:base.8.0:base.9.0:base.10.0:base.11.0:base.12.0:base.13.0:base.14.0:base.15.0 \
>     --size=32G --time_based --runtime=4m --readwrite=randread \
>     --ioengine=libaio --iodepth=8 --bs=8k
>
> fio --direct=1 --name=test --numjobs=4 \
>     --filename=subw1/base.0.0:subw4/base.3.0:subw8/base.7.0:subw12/base.11.0:subw16/base.15.0 \
>     --size=8G --time_based --runtime=4m --readwrite=randread \
>     --ioengine=libaio --iodepth=8 --bs=8k
>
> Additional notes
> ----------------
> - Each file is placed on a separate disk device mounted under subw<n>, as
>   specified in --filename=....
> - We execute one warmup run, then two measurement runs, and calculate the
>   average.
Hi Jaehoon,
I ran fio benchmarks on an Intel Optane SSD DC P4800X Series drive (<10
microsecond latency). This is with just 1 drive.

The 8 KiB block size results show something similar to what you reported:
there are IOPS (or throughput) regressions and CPU utilization improvements.
Although the CPU improvements are welcome, I think the default behavior
should only be changed if the IOPS regressions can be brought below 5%.

The regressions seem to happen regardless of whether 1 or 2 IOThreads are
configured. CPU utilization is different (98% vs 78%) depending on the
number of IOThreads, so the regressions happen across a range of CPU
utilizations.

The 128 KiB block size results are not interesting because the drive already
saturates at numjobs=1. This is expected since the drive cannot go much
above ~2 GiB/s throughput.

You can find the Ansible playbook, libvirt domain XML, fio command-lines,
and the fio/sar data here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/aio-polling-efficiency

Please let me know if you'd like me to rerun the benchmark with new patches
or a configuration change. Do you want to have a video call to discuss your
work and how to get the patches merged?

Host
----
CPU: Intel Xeon Silver 4214 CPU @ 2.20GHz
RAM: 32 GiB

Guest
-----
vCPUs: 8
RAM: 4 GiB
Disk: 1 virtio-blk aio=native cache=none

IOPS
----
rw         bs    numjobs  iothreads  iops    diff
randread   8k    1        1          163417  -7.8%
randread   8k    1        2          165041  -2.4%
randread   8k    4        1          221508  -0.64%
randread   8k    4        2          251298  +0.008%
randread   8k    8        1          222128  -0.51%
randread   8k    8        2          249489  -2.6%
randread   8k    16       1          230535  -0.18%
randread   8k    16       2          246732  -0.22%
randread   128k  1        1          17616   -0.11%
randread   128k  1        2          17678   +0.027%
randread   128k  4        1          17536   -0.27%
randread   128k  4        2          17610   -0.031%
randread   128k  8        1          17369   -0.42%
randread   128k  8        2          17433   -0.071%
randread   128k  16       1          17215   -0.61%
randread   128k  16       2          17269   -0.22%
randwrite  8k    1        1          156597  -3.1%
randwrite  8k    1        2          157720  -3.8%
randwrite  8k    4        1          218448  -0.5%
randwrite  8k    4        2          247075  -5.1%
randwrite  8k    8        1          220866  -0.75%
randwrite  8k    8        2          260935  -0.011%
randwrite  8k    16       1          230913  +0.23%
randwrite  8k    16       2          261125  -0.01%
randwrite  128k  1        1          16009   +0.094%
randwrite  128k  1        2          16070   +0.035%
randwrite  128k  4        1          16073   -0.62%
randwrite  128k  4        2          16131   +0.059%
randwrite  128k  8        1          16106   +0.092%
randwrite  128k  8        2          16153   +0.048%
randwrite  128k  16       1          16102   -0.0091%
randwrite  128k  16       2          16160   +0.048%

IOThread CPU usage
------------------
iothreads  before (%)  after (%)
1          98.7        95.81
2          78.43       66.13

Stefan
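
The guest disk line above ("1 virtio-blk aio=native cache=none" with one or
two IOThreads) corresponds roughly to a QEMU invocation like the minimal
sketch below. The authoritative setup is the libvirt domain XML in the
repository linked above; here the device path and the drive/iothread IDs are
placeholder assumptions, while -m 4G and -smp 8 follow the Guest section.

    # Sketch only: placeholder device path and IDs, not the command line
    # actually used (that setup is defined by the linked libvirt domain XML).
    qemu-system-x86_64 -m 4G -smp 8 \
        -object iothread,id=iothread0 \
        -drive file=/dev/nvme0n1,if=none,id=drive0,format=raw,cache=none,aio=native \
        -device virtio-blk-pci,drive=drive0,iothread=iothread0

    # For the two-IOThread runs, a second "-object iothread,id=iothread1" can
    # be defined; recent QEMU versions can also spread one virtio-blk device's
    # virtqueues across several IOThreads via the iothread-vq-mapping property.

Note that aio=native requires O_DIRECT on the image, which is why cache=none
accompanies it in this configuration.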
