Manali Shukla <[email protected]> writes:

> From: Manali Shukla <[email protected]>
>
> The hypervisor can intercept the HLT instruction by setting the
> HLT-Intercept Bit in VMCB, causing a VMEXIT. This can be wasteful if
> there are pending V_INTR and V_NMI events, as the hypervisor must then
> initiate a VMRUN to handle them.
>
> If the HLT-Intercept Bit is cleared and the vCPU executes HLT while
> there are pending V_INTR and V_NMI events, the hypervisor won’t detect
> them, potentially causing indefinite suspension of the vCPU. This poses
> a problem for enlightened guests that wish to handle these events
> securely.
>
> For Secure AVIC scenarios, if a guest does a HLT while an interrupt is
> pending (in IRR), the hypervisor does not have a way to figure out
> whether the guest needs to be re-entered, as it cannot read the guest
> backing page.  The Idle HLT intercept feature allows the hypervisor to
> intercept HLT execution only if there are no pending V_INTR and V_NMI
> events.
>
> There are two use cases for the Idle HLT intercept feature:
> - Secure VMs that wish to handle pending events securely without exiting
>   to the hypervisor on HLT (Secure AVIC).
> - Optimization for all the VMs to avoid a wasteful VMEXIT during HLT
>   when there are pending events.
>
> On discovering the Idle HLT Intercept feature, the KVM hypervisor sets
> the Idle HLT Intercept bit (bit 6, offset 0x14) in the VMCB. When the
> Idle HLT Intercept bit is set, the HLT Intercept bit (bit 0, offset
> 0xF) should be cleared.
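
For anyone cross-checking the bit/offset pairs above: the intercept
controls sit at the start of the VMCB control area as 32-bit words, so
an intercept index of word*32+bit lands at byte offset word*4. A
standalone sketch of that mapping (hypothetical helper names, not
KVM's actual vmcb_set_intercept()/vmcb_clr_intercept()):

```c
#include <assert.h>
#include <stdint.h>

/*
 * VMCB intercept controls are an array of 32-bit words at the start of
 * the control area; intercept index = word * 32 + bit, and the word
 * lives at byte offset word * 4.
 */
#define INTERCEPT_HLT_IDX       (3 * 32 + 24)  /* word at 0xC, bit 24 == byte 0xF, bit 0 */
#define INTERCEPT_IDLE_HLT_IDX  (5 * 32 + 6)   /* 166: word at 0x14, bit 6 */

static inline void set_intercept(uint32_t *intercepts, unsigned int idx)
{
	intercepts[idx / 32] |= 1u << (idx % 32);
}

static inline void clear_intercept(uint32_t *intercepts, unsigned int idx)
{
	intercepts[idx / 32] &= ~(1u << (idx % 32));
}
```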
>
> Before entering the HLT state, the HLT instruction performs checks in
> the following order:
> - The HLT intercept check: if set, it unconditionally triggers
>   SVM_EXIT_HLT (0x78).
> - The Idle HLT intercept check: if set and there are no pending V_INTR
>   or V_NMI events, it triggers SVM_EXIT_IDLE_HLT (0xA6).
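
The priority of the two checks can be modeled in a few lines; this is
just my reading of the description above, not hardware or KVM code:

```c
#include <assert.h>
#include <stdbool.h>

#define SVM_EXIT_HLT      0x78
#define SVM_EXIT_IDLE_HLT 0xa6
#define NO_EXIT           (-1)

/*
 * Model of the hardware's checks when the guest executes HLT: the plain
 * HLT intercept fires unconditionally; the idle HLT intercept fires only
 * when no V_INTR/V_NMI event is pending; otherwise the vCPU halts with
 * no VMEXIT.
 */
static int hlt_exit_code(bool hlt_icpt, bool idle_hlt_icpt,
			 bool v_intr_pending, bool v_nmi_pending)
{
	if (hlt_icpt)
		return SVM_EXIT_HLT;
	if (idle_hlt_icpt && !v_intr_pending && !v_nmi_pending)
		return SVM_EXIT_IDLE_HLT;
	return NO_EXIT;
}
```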
>
> Details about the Idle HLT intercept feature can be found in AMD APM [1].
>
> [1]: AMD64 Architecture Programmer's Manual Pub. 24593, April
>      2024, Vol 2, 15.9 Instruction Intercepts (Table 15-7: IDLE_HLT).
>      https://bugzilla.kernel.org/attachment.cgi?id=306250
>
> Signed-off-by: Manali Shukla <[email protected]>

LGTM

Reviewed-by: Nikunj A Dadhania <[email protected]>

> ---
>  arch/x86/include/asm/svm.h      |  1 +
>  arch/x86/include/uapi/asm/svm.h |  2 ++
>  arch/x86/kvm/svm/svm.c          | 13 ++++++++++---
>  3 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 2b59b9951c90..992050cb83d0 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -116,6 +116,7 @@ enum {
>       INTERCEPT_INVPCID,
>       INTERCEPT_MCOMMIT,
>       INTERCEPT_TLBSYNC,
> +     INTERCEPT_IDLE_HLT = 166,
>  };
>  
>  
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index 1814b413fd57..ec1321248dac 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -95,6 +95,7 @@
>  #define SVM_EXIT_CR14_WRITE_TRAP             0x09e
>  #define SVM_EXIT_CR15_WRITE_TRAP             0x09f
>  #define SVM_EXIT_INVPCID       0x0a2
> +#define SVM_EXIT_IDLE_HLT      0x0a6
>  #define SVM_EXIT_NPF           0x400
>  #define SVM_EXIT_AVIC_INCOMPLETE_IPI         0x401
>  #define SVM_EXIT_AVIC_UNACCELERATED_ACCESS   0x402
> @@ -224,6 +225,7 @@
>       { SVM_EXIT_CR4_WRITE_TRAP,      "write_cr4_trap" }, \
>       { SVM_EXIT_CR8_WRITE_TRAP,      "write_cr8_trap" }, \
>       { SVM_EXIT_INVPCID,     "invpcid" }, \
> +     { SVM_EXIT_IDLE_HLT,     "idle-halt" }, \
>       { SVM_EXIT_NPF,         "npf" }, \
>       { SVM_EXIT_AVIC_INCOMPLETE_IPI,         "avic_incomplete_ipi" }, \
>       { SVM_EXIT_AVIC_UNACCELERATED_ACCESS,   "avic_unaccelerated_access" }, \
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 78daedf6697b..36f307e71d5d 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1296,8 +1296,12 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
>               svm_set_intercept(svm, INTERCEPT_MWAIT);
>       }
>  
> -     if (!kvm_hlt_in_guest(vcpu->kvm))
> -             svm_set_intercept(svm, INTERCEPT_HLT);
> +     if (!kvm_hlt_in_guest(vcpu->kvm)) {
> +             if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
> +                     svm_set_intercept(svm, INTERCEPT_IDLE_HLT);
> +             else
> +                     svm_set_intercept(svm, INTERCEPT_HLT);
> +     }
>  
>       control->iopm_base_pa = iopm_base;
>       control->msrpm_base_pa = __sme_set(__pa(svm->msrpm));
> @@ -3341,6 +3345,7 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = {
>       [SVM_EXIT_CR4_WRITE_TRAP]               = cr_trap,
>       [SVM_EXIT_CR8_WRITE_TRAP]               = cr_trap,
>       [SVM_EXIT_INVPCID]                      = invpcid_interception,
> +     [SVM_EXIT_IDLE_HLT]                     = kvm_emulate_halt,
>       [SVM_EXIT_NPF]                          = npf_interception,
>       [SVM_EXIT_RSM]                          = rsm_interception,
>       [SVM_EXIT_AVIC_INCOMPLETE_IPI]          = avic_incomplete_ipi_interception,
> @@ -3503,7 +3508,7 @@ int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
>               return interrupt_window_interception(vcpu);
>       else if (exit_code == SVM_EXIT_INTR)
>               return intr_interception(vcpu);
> -     else if (exit_code == SVM_EXIT_HLT)
> +     else if (exit_code == SVM_EXIT_HLT || exit_code == SVM_EXIT_IDLE_HLT)
>               return kvm_emulate_halt(vcpu);
>       else if (exit_code == SVM_EXIT_NPF)
>               return npf_interception(vcpu);
> @@ -5224,6 +5229,8 @@ static __init void svm_set_cpu_caps(void)
>               if (vnmi)
>                       kvm_cpu_cap_set(X86_FEATURE_VNMI);
>  
> +             kvm_cpu_cap_check_and_set(X86_FEATURE_IDLE_HLT);
> +
>               /* Nested VM can receive #VMEXIT instead of triggering #GP */
>               kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>       }
> -- 
> 2.34.1
