On Mon, 25 Aug 2025 12:46:07 +0200 Philippe Mathieu-Daudé <phi...@linaro.org> wrote:
> On 14/8/25 18:05, Igor Mammedov wrote: > > when kernel-irqchip=split is used, QEMU still hits BQL > > contention issue when reading ACPI PM/HPET timers > > (despite of timer[s] access being lock-less). > > > > So Windows with more than 255 cpus is still not able to > > boot (since it requires iommu -> split irqchip). > > > > Problematic path is in kvm_arch_pre_run() where BQL is taken > > unconditionally when split irqchip is in use. > > > > There are a few parts that BQL protects there: > > 1. interrupt check and injecting > > > > however we do not take BQL when checking for pending > > interrupt (even within the same function), so the patch > > takes the same approach for cpu->interrupt_request checks > > and takes BQL only if there is a job to do. > > > > 2. request_interrupt_window access > > CPUState::kvm_run::request_interrupt_window doesn't need BQL > > as it's accessed by its own vCPU thread. > > > > 3. cr8/cpu_get_apic_tpr access > > the same (as #2) applies to CPUState::kvm_run::cr8, > > and APIC registers are also cached/synced (get/put) within > > the vCPU thread it belongs to. > > > > Taking BQL only when is necessary, eleminates BQL bottleneck on > > IO/MMIO only exit path, improoving latency by 80% on HPET micro > > benchmark. > > > > This lets Windows to boot succesfully (in case hv-time isn't used) > > when more than 255 vCPUs are in use. > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > > Reviewed-by: Peter Xu <pet...@redhat.com> > > --- > > v3: > > * drop net needed pair of () in cpu->interrupt_request & > > CPU_INTERRUPT_HARD > > check > > * Paolo Bonzini <pbonz...@redhat.com> > > * don't take BQL when setting exit_request, use qatomic_set() instead > > * after above simplification take/release BQL unconditionally > > * drop smp_mb() after run->cr8/run->request_interrupt_window update > > --- > > target/i386/kvm/kvm.c | 12 +++++------- > > 1 file changed, 5 insertions(+), 7 deletions(-) > > > > /* Force the VCPU out of its inner loop to process any INIT requests > > * or (for userspace APIC, but it is cheap to combine the checks here) > > @@ -5489,10 +5486,10 @@ void kvm_arch_pre_run(CPUState *cpu, struct kvm_run > > *run) > > if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT | CPU_INTERRUPT_TPR)) { > > if (cpu_test_interrupt(cpu, CPU_INTERRUPT_INIT) && > > !(env->hflags & HF_SMM_MASK)) { > > - cpu->exit_request = 1; > > + qatomic_set(&cpu->exit_request, 1); > > } > > if (cpu_test_interrupt(cpu, CPU_INTERRUPT_TPR)) { > > - cpu->exit_request = 1; > > + qatomic_set(&cpu->exit_request, 1); > > } > > } > > Interesting. IMHO to avoid future similar problems, we should declare > CPUState::exit_request a "private" field and access it via a helper, > where atomicity is enforced. I did only bare minimum here, while Paolo took over sanitizing/fixing exit_request part see https://patchew.org/QEMU/20250808185905.62776-1-pbonz...@redhat.com/ so this suggestion better fits there. > The only non-accelerator use is in PPC sPAPR: > > hw/ppc/spapr_hcall.c:512: cs->exit_request = 1; > hw/ppc/spapr_hcall.c:534: cs->exit_request = 1; > hw/ppc/spapr_hcall.c:627: cs->exit_request = 1; > > FYI last week we noticed a similar problem with CPUState::thread_kicked > when using HVF and I'll post in a few days a series containing this > change: > > -- >8 -- > diff --git a/system/cpus.c b/system/cpus.c > index 26fb3bd69c3..39daf85bae7 100644 > --- a/system/cpus.c > +++ b/system/cpus.c > @@ -464,10 +464,10 @@ void qemu_wait_io_event(CPUState *cpu) > > void cpus_kick_thread(CPUState *cpu) > { > - if (cpu->thread_kicked) { > + if (qatomic_read(&cpu->thread_kicked)) { > return; > } > - cpu->thread_kicked = true; > + qatomic_set(&cpu->thread_kicked, true); > > --- > > It only affects HVF because this is the single access to thread_kicked > out of accelerator code: > > $ git grep -w thread_kicked > include/hw/core/cpu.h:484: bool thread_kicked; > system/cpus.c:449: qatomic_set_mb(&cpu->thread_kicked, false); > system/cpus.c:476: if (cpu->thread_kicked) { > system/cpus.c:479: cpu->thread_kicked = true; > target/arm/hvf/hvf.c:1825: qatomic_set_mb(&cpu->thread_kicked, false); > > (Call introduced in commit 219c101fa7f "arm/hvf: Add a WFI handler"). >