Re: [PATCH v2 1/2] KVM: X86: Add per-VM no-HLT-exiting capability

2018-02-13 Thread Wanpeng Li
2018-02-14 0:02 GMT+08:00 Paolo Bonzini :
> On 05/02/2018 07:57, Wanpeng Li wrote:
>> From: Wanpeng Li 
>>
>> If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
>> This patch adds the per-VM non-HLT-exiting capability.
>>
>> Cc: Paolo Bonzini 
>> Cc: Radim Krčmář 
>> Signed-off-by: Wanpeng Li 
>> ---
>> v1 -> v2:
>>  * vmx_clear_hlt() around INIT handling
>>  * vmx_clear_hlt() upon SMI and implement auto halt restart
>
> Hi Wanpeng,
>
> sorry I could not answer before.
>
> We do not need to implement AutoHalt.  It sounds like a messy feature,
> but the way it actually works is much simpler: on RSM the microcode
> reads AutoHALT's bit 0 and... decrements RIP if it is 1.  All you need
> to do, however, is clear the activity state.  Guests should expect
> anyway that "CLI;HLT" can be interrupted by an NMI, and should follow
> it with a JMP.
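
The hardware behavior Paolo describes can be sketched as a tiny
stand-alone C model (names here are illustrative assumptions, not KVM's
real em_rsm()/activity-state code):

```c
#include <stdint.h>

/* Minimal model of RSM AutoHALT handling: if bit 0 of the SMRAM
 * auto-halt-restart field is set, microcode backs RIP up over the
 * one-byte HLT (0xF4) so it re-executes; in both cases the vCPU is
 * left in the "active" activity state rather than halted. */

#define ACTIVITY_ACTIVE 0
#define ACTIVITY_HLT    1

struct vcpu_model {
    uint64_t rip;
    int activity_state;
};

static void rsm_auto_halt(struct vcpu_model *v, uint16_t auto_halt_field)
{
    if (auto_halt_field & 0x1)
        v->rip -= 1;                      /* re-point RIP at the HLT */
    v->activity_state = ACTIVITY_ACTIVE;  /* never resume in HLT state */
}
```

Decrementing RIP makes the guest re-execute HLT after RSM; clearing the
activity state is the part KVM actually has to do.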

Thanks for pointing that out.

>
> Second, I would prefer to implement at the same time MWAIT and PAUSE
> passthrough, as in https://www.spinics.net/lists/kvm/msg159517.html:

Understood.

>
>> The three capabilities are more or less all doing the same thing.
>> Perhaps it would make some sense to only leave PAUSE spin loops in
>> guest, but not HLT/MWAIT; but apart from that I think users would
>> probably enable all of them.  So I think we should put in the
>> documentation that blindly passing the KVM_CHECK_EXTENSION result to
>> KVM_ENABLE_CAP is a valid thing to do when vCPUs are associated to
>> dedicated physical CPUs.
>>
>> Let's get rid of KVM_CAP_X86_GUEST_MWAIT altogether and
>> add a new capability.  But let's use just one.
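
The userspace pattern suggested in the quote above — pass the
KVM_CHECK_EXTENSION result straight through to KVM_ENABLE_CAP — might
look like the sketch below. The capability name and number are
assumptions from this patch (KVM_CAP_X86_GUEST_HLT was never merged
under that name), and the struct mirrors the layout of the kernel's
struct kvm_enable_cap:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical capability number; the real value would come from the
 * kernel's <linux/kvm.h> once the capability is merged. */
#define KVM_CAP_X86_GUEST_HLT 150

/* Mirrors the layout of the kernel's struct kvm_enable_cap. */
struct kvm_enable_cap_sketch {
    uint32_t cap;
    uint32_t flags;
    uint64_t args[4];
    uint8_t  pad[64];
};

/* Paolo's suggestion: take whatever KVM_CHECK_EXTENSION reported for
 * the capability and pass it through unchanged, enabling every
 * supported exit-avoidance bit when vCPUs are pinned to dedicated
 * physical CPUs. */
static void fill_enable_cap(struct kvm_enable_cap_sketch *c,
                            int check_extension_result)
{
    memset(c, 0, sizeof(*c));
    c->cap = KVM_CAP_X86_GUEST_HLT;
    c->args[0] = (uint64_t)check_extension_result;
}
```

With a real VM fd, the filled struct would then be handed to
ioctl(vm_fd, KVM_ENABLE_CAP, &cap).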
>
> Thanks again for your work, and sorry for slightly contradicting Radim's
> review.  I've rebased and applied patch 2.

No problem. Your and Radim's reviews are always appreciated and helpful.

Regards,
Wanpeng Li


Re: [PATCH v2 1/2] KVM: X86: Add per-VM no-HLT-exiting capability

2018-02-13 Thread Paolo Bonzini
On 05/02/2018 07:57, Wanpeng Li wrote:
> From: Wanpeng Li 
> 
> If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
> This patch adds the per-VM non-HLT-exiting capability.
> 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Signed-off-by: Wanpeng Li 
> ---
> v1 -> v2:
>  * vmx_clear_hlt() around INIT handling
>  * vmx_clear_hlt() upon SMI and implement auto halt restart 

Hi Wanpeng,

sorry I could not answer before.

We do not need to implement AutoHalt.  It sounds like a messy feature,
but the way it actually works is much simpler: on RSM the microcode
reads AutoHALT's bit 0 and... decrements RIP if it is 1.  All you need
to do, however, is clear the activity state.  Guests should expect
anyway that "CLI;HLT" can be interrupted by an NMI, and should follow
it with a JMP.

Second, I would prefer to implement at the same time MWAIT and PAUSE
passthrough, as in https://www.spinics.net/lists/kvm/msg159517.html:

> The three capabilities are more or less all doing the same thing.
> Perhaps it would make some sense to only leave PAUSE spin loops in
> guest, but not HLT/MWAIT; but apart from that I think users would
> probably enable all of them.  So I think we should put in the
> documentation that blindly passing the KVM_CHECK_EXTENSION result to
> KVM_ENABLE_CAP is a valid thing to do when vCPUs are associated to
> dedicated physical CPUs.
>
> Let's get rid of KVM_CAP_X86_GUEST_MWAIT altogether and
> add a new capability.  But let's use just one.

Thanks again for your work, and sorry for slightly contradicting Radim's
review.  I've rebased and applied patch 2.

Paolo


Re: [PATCH v2 1/2] KVM: X86: Add per-VM no-HLT-exiting capability

2018-02-12 Thread Wanpeng Li
Ping,
2018-02-05 14:57 GMT+08:00 Wanpeng Li :
> From: Wanpeng Li 
>
> If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
> This patch adds the per-VM non-HLT-exiting capability.
>
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Signed-off-by: Wanpeng Li 
> ---
> v1 -> v2:
>  * vmx_clear_hlt() around INIT handling
>  * vmx_clear_hlt() upon SMI and implement auto halt restart
>
>  Documentation/virtual/kvm/api.txt  | 11 +++
>  arch/x86/include/asm/kvm_emulate.h |  1 +
>  arch/x86/include/asm/kvm_host.h|  7 +++
>  arch/x86/kvm/emulate.c |  2 ++
>  arch/x86/kvm/vmx.c | 38 ++
>  arch/x86/kvm/x86.c | 27 +++
>  arch/x86/kvm/x86.h |  5 +
>  include/uapi/linux/kvm.h   |  1 +
>  8 files changed, 88 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 023da07..865b029 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4302,6 +4302,17 @@ enables QEMU to build error log and branch to guest kernel registered
>  machine check handling routine. Without this capability KVM will
>  branch to guests' 0x200 interrupt vector.
>
> +7.13 KVM_CAP_X86_GUEST_HLT
> +
> +Architectures: x86
> +Parameters: none
> +Returns: 0 on success
> +
> +This capability indicates that a guest using HLT to stop a virtual CPU
> +will not cause a VM exit. As such, time spent while a virtual CPU is
> +halted in this way will then be accounted for as guest running time on
> +the host. KVM_FEATURE_PV_UNHALT should be disabled in this case.
> +
>  8. Other capabilities.
>  --
>
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index b24b1c8..78cfe8ca 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -225,6 +225,7 @@ struct x86_emulate_ops {
> unsigned (*get_hflags)(struct x86_emulate_ctxt *ctxt);
> void (*set_hflags)(struct x86_emulate_ctxt *ctxt, unsigned hflags);
> int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt, u64 smbase);
> +   void (*smm_auto_halt_restart)(struct x86_emulate_ctxt *ctxt);
>
>  };
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8f0f09a..95b2c44 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -623,6 +623,11 @@ struct kvm_vcpu_arch {
> unsigned nmi_pending; /* NMI queued after currently running handler */
> bool nmi_injected;/* Trying to inject an NMI this entry */
> bool smi_pending;/* SMI queued after currently running handler */
> +   /*
> +* bit 0 is set if Value of Auto HALT Restart after Entry to SMM is true
> +* bit 1 is set if Value of Auto HALT Restart When Exiting SMM is true
> +*/
> +   int smm_auto_halt_restart;
>
> struct kvm_mtrr mtrr_state;
> u64 pat;
> @@ -806,6 +811,8 @@ struct kvm_arch {
>
> gpa_t wall_clock;
>
> +   bool hlt_in_guest;
> +
> bool ept_identity_pagetable_done;
> gpa_t ept_identity_map_addr;
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index d91eaeb..ee5bc65 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -2597,6 +2597,8 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
>
> smbase = ctxt->ops->get_smbase(ctxt);
>
> +   if (GET_SMSTATE(u16, smbase, 0x7f02) & 0x1)
> +   ctxt->ops->smm_auto_halt_restart(ctxt);
> /*
>  * Give pre_leave_smm() a chance to make ISA-specific changes to the
>  * vCPU state (e.g. enter guest mode) before loading state from the SMM
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3e71086..23789c9 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2474,6 +2474,24 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit
> return 0;
>  }
>
> +static bool vmx_need_clear_hlt(struct kvm_vcpu *vcpu)
> +{
> +   return kvm_hlt_in_guest(vcpu->kvm) &&
> +   vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT;
> +}
> +
> +static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
> +{
> +   /*
> +* Ensure that we clear the HLT state in the VMCS.  We don't need to
> +* explicitly skip the instruction because if the HLT state is set,
> +* then the instruction is already executing and RIP has already been
> +* advanced.
> +*/
> +   if (vmx_need_clear_hlt(vcpu))
> +   vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
> +}
> +
>  static void vmx_queue_exception(struct kvm_vcpu *vcpu)
>  {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -2504,6 +2522,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
> intr_info |= INTR_TYPE_HARD_EXCEPTION;

[PATCH v2 1/2] KVM: X86: Add per-VM no-HLT-exiting capability

2018-02-04 Thread Wanpeng Li
From: Wanpeng Li 

If host CPUs are dedicated to a VM, we can avoid VM exits on HLT.
This patch adds the per-VM non-HLT-exiting capability.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
v1 -> v2:
 * vmx_clear_hlt() around INIT handling
 * vmx_clear_hlt() upon SMI and implement auto halt restart 

 Documentation/virtual/kvm/api.txt  | 11 +++
 arch/x86/include/asm/kvm_emulate.h |  1 +
 arch/x86/include/asm/kvm_host.h|  7 +++
 arch/x86/kvm/emulate.c |  2 ++
 arch/x86/kvm/vmx.c | 38 ++
 arch/x86/kvm/x86.c | 27 +++
 arch/x86/kvm/x86.h |  5 +
 include/uapi/linux/kvm.h   |  1 +
 8 files changed, 88 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 023da07..865b029 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4302,6 +4302,17 @@ enables QEMU to build error log and branch to guest kernel registered
 machine check handling routine. Without this capability KVM will
 branch to guests' 0x200 interrupt vector.
 
+7.13 KVM_CAP_X86_GUEST_HLT
+
+Architectures: x86
+Parameters: none
+Returns: 0 on success
+
+This capability indicates that a guest using HLT to stop a virtual CPU
+will not cause a VM exit. As such, time spent while a virtual CPU is
+halted in this way will then be accounted for as guest running time on
+the host. KVM_FEATURE_PV_UNHALT should be disabled in this case.
+
 8. Other capabilities.
 --
 
diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index b24b1c8..78cfe8ca 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -225,6 +225,7 @@ struct x86_emulate_ops {
unsigned (*get_hflags)(struct x86_emulate_ctxt *ctxt);
void (*set_hflags)(struct x86_emulate_ctxt *ctxt, unsigned hflags);
int (*pre_leave_smm)(struct x86_emulate_ctxt *ctxt, u64 smbase);
+   void (*smm_auto_halt_restart)(struct x86_emulate_ctxt *ctxt);
 
 };
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8f0f09a..95b2c44 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -623,6 +623,11 @@ struct kvm_vcpu_arch {
unsigned nmi_pending; /* NMI queued after currently running handler */
bool nmi_injected;/* Trying to inject an NMI this entry */
bool smi_pending;/* SMI queued after currently running handler */
+   /*
+* bit 0 is set if Value of Auto HALT Restart after Entry to SMM is true
+* bit 1 is set if Value of Auto HALT Restart When Exiting SMM is true
+*/
+   int smm_auto_halt_restart;
 
struct kvm_mtrr mtrr_state;
u64 pat;
@@ -806,6 +811,8 @@ struct kvm_arch {
 
gpa_t wall_clock;
 
+   bool hlt_in_guest;
+
bool ept_identity_pagetable_done;
gpa_t ept_identity_map_addr;
 
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index d91eaeb..ee5bc65 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2597,6 +2597,8 @@ static int em_rsm(struct x86_emulate_ctxt *ctxt)
 
smbase = ctxt->ops->get_smbase(ctxt);
 
+   if (GET_SMSTATE(u16, smbase, 0x7f02) & 0x1)
+   ctxt->ops->smm_auto_halt_restart(ctxt);
/*
 * Give pre_leave_smm() a chance to make ISA-specific changes to the
 * vCPU state (e.g. enter guest mode) before loading state from the SMM
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3e71086..23789c9 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2474,6 +2474,24 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned long *exit
return 0;
 }
 
+static bool vmx_need_clear_hlt(struct kvm_vcpu *vcpu)
+{
+   return kvm_hlt_in_guest(vcpu->kvm) &&
+   vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT;
+}
+
+static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
+{
+   /*
+* Ensure that we clear the HLT state in the VMCS.  We don't need to
+* explicitly skip the instruction because if the HLT state is set,
+* then the instruction is already executing and RIP has already been
+* advanced.
+*/
+   if (vmx_need_clear_hlt(vcpu))
+   vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+}
+
 static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -2504,6 +2522,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
intr_info |= INTR_TYPE_HARD_EXCEPTION;
 
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+
+   vmx_clear_hlt(vcpu);
 }
 
 static bool vmx_rdtscp_supported(void)
@@ -5359,6 +5379,8 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
exec_control |= CPU_BASED_CR3_STORE_EXITING |