The commit is pushed to "branch-rh10-6.12.0-55.52.1.5.x.vz10-ovz" and will 
appear at [email protected]:openvz/vzkernel.git
after rh10-6.12.0-55.52.1.5.28.vz10
------>
commit eeaebe10aab0d338fdce3d727b60e81e334d426a
Author: Denis V. Lunev <[email protected]>
Date:   Tue May 26 19:18:23 2026 +0200

    x86/kvm: skip kvm_pmu_trigger_event() when guest PMU is disabled
    
    kvm_pmu_trigger_event() is called from several KVM x86 hot paths to feed
    the emulated INSTRUCTIONS_RETIRED / BRANCH_INSTRUCTIONS_RETIRED PMC events:
    
      * kvm_skip_emulated_instruction() - once per emulated-instruction skip
        (CPUID exits, hypercalls, fastpath MSR writes, ...)
      * x86_emulate_instruction() - on every successfully emulated instruction
        (MMIO, real-mode emulation), with a second call for branch instructions
      * nested_vmx_run() - on guest VMLAUNCH/VMRESUME, counted as a branch
        instruction
    
    The function does a stack bitmap allocation, a bitmap_and against
    pmu->global_ctrl, indirect calls to pmc_event_is_allowed() and
    cpl_is_matched(), and an empty kvm_for_each_pmc() walk before returning
    when nothing is programmed. On guests with the PMU disabled at the per-VM
    level (<pmu state='off'/>), all of this work is dead weight on every
    traversal.
    
    Guard each call site with a check on kvm->arch.enable_pmu. The unlikely()
    hint keeps the hot path straight-line and places the (rarely-taken) call
    out-of-line for better I-cache density. Adjacent calls in
    x86_emulate_instruction() share one guard.
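    
    The guard pattern can be sketched in userspace as follows. This is only an
    illustration, not kernel code: the mock_* types and functions are
    hypothetical stand-ins, unlikely() is redefined locally on top of the
    GCC/Clang __builtin_expect() builtin, and the stub merely counts calls
    instead of doing any PMU work.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's unlikely() branch hint (GCC/Clang). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical mock types: only the field read by the guard is modelled. */
struct mock_kvm_arch { bool enable_pmu; };
struct mock_kvm { struct mock_kvm_arch arch; };
struct mock_vcpu { struct mock_kvm *kvm; unsigned long pmu_calls; };

/* Stand-in for kvm_pmu_trigger_event(): it only counts invocations. */
static void mock_pmu_trigger_event(struct mock_vcpu *vcpu)
{
	vcpu->pmu_calls++;
}

/* Mirrors the shape of a guarded call site: one flag test, then the call. */
static void mock_skip_emulated_instruction(struct mock_vcpu *vcpu)
{
	if (unlikely(vcpu->kvm->arch.enable_pmu))
		mock_pmu_trigger_event(vcpu);
}

/* Simulates n exits and returns how often the PMU stub was reached. */
static unsigned long mock_run_exits(bool enable_pmu, unsigned long n)
{
	struct mock_kvm kvm = { .arch = { .enable_pmu = enable_pmu } };
	struct mock_vcpu vcpu = { .kvm = &kvm, .pmu_calls = 0 };

	for (unsigned long i = 0; i < n; i++)
		mock_skip_emulated_instruction(&vcpu);
	return vcpu.pmu_calls;
}
```

    With the flag clear, mock_run_exits(false, n) never reaches the stub; with
    it set, the stub runs once per simulated exit, so PMU-enabled guests keep
    their existing behaviour.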
    
    The CPUID-skip site is the hottest, so most of the measured gain comes from
    there; the remaining three call sites are gated as well for consistency,
    avoiding a "one gated, three not" state.
    
    Measured on at_cpu_cpuid (-k 0, VSTOR-131579 methodology), guest PMU off,
    on top of a Nova-side default-PMU-off VM config:
    
      AMD Zen2 (epyc01, EPYC 7302):     +2.7%, ~57 cyc/VMEXIT saved
      Intel Ice Lake-SP (hw-hdd01):     +0.6%, ~26 cyc/VMEXIT saved
    
    The AMD bias is consistent with the per-call retpoline + unret + Safe-RET
    tax on Zen2/Zen3 with default mitigations: removing an entire call/return
    sequence from the AMD hot path saves more than the same removal on Intel
    with Enhanced/Auto IBRS.
    
    https://virtuozzo.atlassian.net/browse/VSTOR-132543
    https://virtuozzo.atlassian.net/browse/VSTOR-132471
    
    Feature: kvm
    Signed-off-by: Denis V. Lunev <[email protected]>
---
 arch/x86/kvm/vmx/nested.c |  3 ++-
 arch/x86/kvm/x86.c        | 11 +++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 5df2c6bffab83..76589a1664682 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3684,7 +3684,8 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
                return 1;
        }
 
-       kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.BRANCH_INSTRUCTIONS_RETIRED);
+       if (unlikely(vcpu->kvm->arch.enable_pmu))
+               kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.BRANCH_INSTRUCTIONS_RETIRED);
 
        if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
                return nested_vmx_failInvalid(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1249fbbe26951..8a872cbb18ad7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8957,7 +8957,8 @@ int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
        if (unlikely(!r))
                return 0;
 
-       kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.INSTRUCTIONS_RETIRED);
+       if (unlikely(vcpu->kvm->arch.enable_pmu))
+               kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.INSTRUCTIONS_RETIRED);
 
        /*
         * rflags is the old, "raw" value of the flags.  The new value has
@@ -9283,9 +9284,11 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
                 */
                if (!ctxt->have_exception ||
                    exception_type(ctxt->exception.vector) == EXCPT_TRAP) {
-                       kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.INSTRUCTIONS_RETIRED);
-                       if (ctxt->is_branch)
-                               kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.BRANCH_INSTRUCTIONS_RETIRED);
+                       if (unlikely(vcpu->kvm->arch.enable_pmu)) {
+                               kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.INSTRUCTIONS_RETIRED);
+                               if (ctxt->is_branch)
+                                       kvm_pmu_trigger_event(vcpu, kvm_pmu_eventsel.BRANCH_INSTRUCTIONS_RETIRED);
+                       }
                        kvm_rip_write(vcpu, ctxt->eip);
                        if (r && (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
                                r = kvm_vcpu_do_singlestep(vcpu);