On Mon, Nov 10, 2025 at 10:14:49PM -0800, Dongli Zhang wrote:
> Date: Mon, 10 Nov 2025 22:14:49 -0800
> From: Dongli Zhang <[email protected]>
> Subject: [PATCH v7 0/9] target/i386/kvm/pmu: PMU Enhancement, Bugfix and
>  Cleanup
> X-Mailer: git-send-email 2.43.5
>
> This patchset addresses four bugs related to AMD PMU virtualization.
>
> 1. The PerfMonV2 is still available if PERCORE if disabled via
> "-cpu host,-perfctr-core".
>
> 2. The VM 'cpuid' command still returns PERFCORE although "-pmu" is
> configured.
>
> 3. The third issue is that using "-cpu host,-pmu" does not disable AMD PMU
> virtualization. When using "-cpu EPYC" or "-cpu host,-pmu", AMD PMU
> virtualization remains enabled. On the VM's Linux side, you might still
> see:
>
> [ 0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
>
> instead of:
>
> [ 0.596381] Performance Events: PMU not available due to virtualization,
> using software events only.
> [ 0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
>
> To address this, KVM_CAP_PMU_CAPABILITY is used to set KVM_PMU_CAP_DISABLE
> when "-pmu" is configured.
>
> 4. The fourth issue is that unreclaimed performance events (after a QEMU
> system_reset) in KVM may cause random, unwanted, or unknown NMIs to be
> injected into the VM.
>
> The AMD PMU registers are not reset during QEMU system_reset.
>
> (1) If the VM is reset (e.g., via QEMU system_reset or VM kdump/kexec) while
> running "perf top", the PMU registers are not disabled properly.
>
> (2) Despite x86_cpu_reset() resetting many registers to zero, kvm_put_msrs()
> does not handle AMD PMU registers, causing some PMU events to remain
> enabled in KVM.
>
> (3) The KVM kvm_pmc_speculative_in_use() function consistently returns true,
> preventing the reclamation of these events. Consequently, the
> kvm_pmc->perf_event remains active.
>
> (4) After a reboot, the VM kernel may report the following error:
>
> [ 0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS
> detected, complain to your hardware vendor.
> [ 0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR
> c0010200 is 530076)
>
> (5) In the worst case, the active kvm_pmc->perf_event may inject unknown
> NMIs randomly into the VM kernel:
>
> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>
> To resolve these issues, we propose resetting AMD PMU registers during the
> VM reset process

Hi Dongli,

Patches 1 and 2 need compatibility options. If you think it's okay, I could
take these two and fix them once v11.0's compat array is ready. The other
patches still LGTM.

Maybe it's better to send a v8 that excludes patches 1 and 2?

Regards,
Zhao
