On Fri, Aug 08, 2025 at 02:01:27PM +0200, Igor Mammedov wrote:
> v3:
>   * hpet: replace explicit atomics with the seqlock API (PeterX)
>   * introduce cpu_test_interrupt() (Paolo)
>     and use it tree wide for checking interrupts
>   * don't take BQL for setting exit_request, use qatomic_set() instead.
>     (Paolo)
>   * after above change, replace conditional BQL with unconditional
>     to simplify things a bit (Paolo)
>   * drop not needed barriers (Paolo)
>   * minor tcg:cpu_handle_interrupt() cleanup
>
> v2:
>   * Make both read and write paths BQL-less (Gerd)
>   * Refactor HPET to handle lock-less access correctly
>     when stopping/starting counter in parallel. (Peter Maydell)
>   * Publish kvm-unit-tests HPET bench/torture test [1] to verify
>     HPET lock-less handling
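
For readers less familiar with the seqlock pattern named in the v3 changelog, here is a minimal standalone sketch of a lock-less counter read that stays consistent while the counter is stopped or started in parallel. It is not the code from the series: DemoTimer and the demo_* functions are hypothetical, it uses plain C11 atomics with default (seq_cst) ordering for simplicity rather than QEMU's own seqlock helpers, and it assumes writers are already serialized by some device lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timer state; field names are illustrative, not QEMU's. */
typedef struct {
    atomic_uint seq;          /* even: stable, odd: update in progress */
    atomic_bool enabled;      /* is the counter running? */
    _Atomic uint64_t offset;  /* offset while running, frozen value if not */
} DemoTimer;

static void demo_timer_init(DemoTimer *t)
{
    atomic_init(&t->seq, 0u);
    atomic_init(&t->enabled, false);
    atomic_init(&t->offset, (uint64_t)0);
}

/* Writer side: assumed to be called with a device lock held (one writer). */
static void demo_timer_set(DemoTimer *t, bool enabled, uint64_t offset)
{
    unsigned s = atomic_load(&t->seq);

    assert((s & 1u) == 0);           /* no other update may be in flight */
    atomic_store(&t->seq, s + 1);    /* odd: readers will retry */
    atomic_store(&t->enabled, enabled);
    atomic_store(&t->offset, offset);
    atomic_store(&t->seq, s + 2);    /* even again: state is stable */
}

/* Reader side: takes no lock, retries if it raced with an update. */
static uint64_t demo_timer_read(DemoTimer *t, uint64_t now)
{
    unsigned s;
    bool enabled;
    uint64_t offset;

    do {
        s = atomic_load(&t->seq);
        enabled = atomic_load(&t->enabled);
        offset = atomic_load(&t->offset);
    } while ((s & 1u) || s != atomic_load(&t->seq));

    return enabled ? now + offset : offset;
}
```

The reader only accepts a snapshot when the sequence number was even and unchanged across the reads, so it never combines an old enabled flag with a new offset. The cost of an update falls on the writer and on any reader that has to retry, which suits a register that is read far more often than it is reconfigured.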

nice

acpi things:
Acked-by: Michael S. Tsirkin <m...@redhat.com>

> When booting WS2025 with the following CLI
>  1) -M q35,hpet=off -cpu host -enable-kvm -smp 240,sockets=4
> the guest boots very slowly and is sluggish after boot
> or it's stuck on boot at spinning circle (most of the time).
>
> perf shows that VM is experiencing heavy BQL contention on IO path
> which happens to be ACPI PM timer read access. A variation with
> HPET enabled moves contention to HPET timer read access.
> And it only gets worse with increasing number of VCPUs.
>
> Series prevents large VM vCPUs contending on BQL due to PM|HPET timer
> access and lets Windows move on with boot process.
>
> Testing lock-less IO with HPET micro benchmark [2] shows approx 80%
> better performance than the current BQL locked path.
> [chart https://ibb.co/MJY9999 shows much better scaling of lock-less
> IO compared to BQL one.]
>
> In my tests, with the CLI above, the WS2025 guest wasn't able to boot
> within 30min on both hosts
>   * 32 cores, 2 NUMA nodes
>   * 448 cores, 8 NUMA nodes
> With ACPI PM timer in BQL-free read mode, guest boots within approx:
>   * 2min
>   * 1min
> respectively.
>
> With HPET enabled boot time shrinks ~2x
>   * 4m13 -> 2m21
>   * 2m19 -> 1m15
> respectively.
>
> 2) "[kvm-unit-tests PATCH v4 0/5] x86: add HPET counter tests"
>    https://lore.kernel.org/kvm/20250725095429.1691734-1-imamm...@redhat.com/T/#t
> PS:
> Using hv-time=on cpu option helps a lot (when it works) and
> lets [1] guest boot fine in ~1-2min. Series doesn't make
> a significant impact in this case.
>
> PS2:
> Tested series with a bunch of different guests:
>  RHEL-[6..10]x64, WS2012R2, WS2016, WS2022, WS2025
>
> PS3:
>  dropped mention of https://bugzilla.redhat.com/show_bug.cgi?id=1322713
>  as it's not reproducible with current software stack or even with
>  the same qemu/seabios as reported (kernel versions mentioned in
>  the report were interim ones and no longer available,
>  so I've used nearest released at the time for testing)
>
> Igor Mammedov (10):
>   memory: reintroduce BQL-free fine-grained PIO/MMIO
>   acpi: mark PMTIMER as unlocked
>   hpet: switch to fain-grained device locking
>   hpet: move out main counter read into a separate block
>   hpet: make main counter read lock-less
>   introduce cpu_test_interrupt() that will replace open coded checks
>   x86: kvm: use cpu_test_interrupt() instead of oppen coding checks
>   kvm: i386: irqchip: take BQL only if there is an interrupt
>   use cpu_test_interrupt() instead of oppen coding checks tree wide
>   tcg: move interrupt caching and single step masking closer to user
>
>  include/hw/core/cpu.h               | 12 ++++++++
>  include/system/memory.h             | 10 +++++++
>  accel/tcg/cpu-exec.c                | 25 +++++++---------
>  accel/tcg/tcg-accel-ops.c           |  3 +-
>  hw/acpi/core.c                      |  1 +
>  hw/timer/hpet.c                     | 38 +++++++++++++++++++-----
>  system/cpus.c                       |  3 +-
>  system/memory.c                     | 15 ++++++++++
>  system/physmem.c                    |  2 +-
>  target/alpha/cpu.c                  |  8 ++---
>  target/arm/cpu.c                    | 20 ++++++-------
>  target/arm/helper.c                 | 16 +++++-----
>  target/arm/hvf/hvf.c                |  6 ++--
>  target/avr/cpu.c                    |  2 +-
>  target/hppa/cpu.c                   |  2 +-
>  target/i386/hvf/hvf.c               |  4 +--
>  target/i386/hvf/x86hvf.c            | 21 +++++++------
>  target/i386/kvm/kvm.c               | 46 ++++++++++++++---------------
>  target/i386/nvmm/nvmm-all.c         | 24 +++++++--------
>  target/i386/tcg/system/seg_helper.c |  2 +-
>  target/i386/whpx/whpx-all.c         | 34 ++++++++++-----------
>  target/loongarch/cpu.c              |  2 +-
>  target/m68k/cpu.c                   |  2 +-
>  target/microblaze/cpu.c             |  2 +-
>  target/mips/cpu.c                   |  6 ++--
>  target/mips/kvm.c                   |  2 +-
>  target/openrisc/cpu.c               |  3 +-
>  target/ppc/cpu_init.c               |  2 +-
>  target/ppc/kvm.c                    |  2 +-
>  target/rx/cpu.c                     |  3 +-
>  target/rx/helper.c                  |  2 +-
>  target/s390x/cpu-system.c           |  2 +-
>  target/sh4/cpu.c                    |  2 +-
>  target/sh4/helper.c                 |  2 +-
>  target/sparc/cpu.c                  |  2 +-
>  target/sparc/int64_helper.c         |  4 +--
>  36 files changed, 193 insertions(+), 139 deletions(-)
>
> --
> 2.47.1
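
As an illustration of what replacing open-coded interrupt checks with a single helper can look like, and of setting an exit flag atomically instead of under a lock, here is a small standalone sketch. It is written under assumptions and is not the series' implementation: DemoCPU, the DEMO_INTERRUPT_* bits and all demo_* functions are hypothetical, and C11 atomics stand in for QEMU's qatomic_*() macros.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical interrupt bits, not QEMU's CPU_INTERRUPT_* values. */
#define DEMO_INTERRUPT_HARD  0x1u
#define DEMO_INTERRUPT_TIMER 0x2u

/* Hypothetical stand-in for the per-vCPU state touched by other threads. */
typedef struct {
    atomic_uint interrupt_request;  /* bitmask of pending interrupts */
    atomic_bool exit_request;       /* ask the vCPU loop to exit */
} DemoCPU;

static void demo_cpu_init(DemoCPU *cpu)
{
    atomic_init(&cpu->interrupt_request, 0u);
    atomic_init(&cpu->exit_request, false);
}

/* Publish pending interrupt bits; release pairs with the acquire below. */
static void demo_cpu_raise_interrupt(DemoCPU *cpu, unsigned mask)
{
    assert(mask != 0);
    atomic_fetch_or_explicit(&cpu->interrupt_request, mask,
                             memory_order_release);
}

/* One helper instead of reading interrupt_request directly at call sites. */
static bool demo_cpu_test_interrupt(DemoCPU *cpu, unsigned mask)
{
    unsigned pending = atomic_load_explicit(&cpu->interrupt_request,
                                            memory_order_acquire);
    return (pending & mask) != 0;
}

/* Setting a single flag needs an atomic store, not a global lock. */
static void demo_cpu_request_exit(DemoCPU *cpu)
{
    atomic_store_explicit(&cpu->exit_request, true, memory_order_relaxed);
}

static bool demo_cpu_exit_requested(DemoCPU *cpu)
{
    return atomic_load_explicit(&cpu->exit_request, memory_order_relaxed);
}
```

Routing every check through one helper keeps the memory ordering decision in a single place, so a caller can test for a pending interrupt first and take a heavier lock only when there is actually something to handle.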