Re: [PATCH v5 2/2] kvm: nVMX: Introduce KVM_CAP_NESTED_STATE

2018-07-18 Thread Paolo Bonzini
On 18/07/2018 20:03, Jim Mattson wrote:
> On Wed, Jul 18, 2018 at 10:55 AM, Radim Krčmář  wrote:
> 
>>> + vmx->nested.nested_run_pending = 1;
>> This is not necessary.  We're only copying state and do not add anything
>> that would be lost on a nested VM exit without prior VM entry.
> If nested_run_pending is blindly set on restore, then prepare_vmcs02
> will do the wrong thing. For example, if there was an injected event
> in the vmcs12, it will get injected again, even if the vCPU has been
> in L2 for some time.
> 
> The value of nested_run_pending should always come from the saved VMX
> state (a few lines above).
> 

Yep, and there are a couple other things that need adjustment.  Stay
tuned...

Paolo


Re: [PATCH v5 2/2] kvm: nVMX: Introduce KVM_CAP_NESTED_STATE

2018-07-18 Thread Jim Mattson
On Wed, Jul 18, 2018 at 10:55 AM, Radim Krčmář  wrote:

>> + vmx->nested.nested_run_pending = 1;
>
> This is not necessary.  We're only copying state and do not add anything
> that would be lost on a nested VM exit without prior VM entry.

If nested_run_pending is blindly set on restore, then prepare_vmcs02
will do the wrong thing. For example, if there was an injected event
in the vmcs12, it will get injected again, even if the vCPU has been
in L2 for some time.

The value of nested_run_pending should always come from the saved VMX
state (a few lines above).
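
A minimal sketch of the restore logic as I'd expect it (flag name taken
from this series):

    /*
     * Derive nested_run_pending solely from the saved state; never
     * force it to 1 on restore.
     */
    vmx->nested.nested_run_pending =
            !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);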


Re: [PATCH v5 2/2] kvm: nVMX: Introduce KVM_CAP_NESTED_STATE

2018-07-18 Thread Radim Krčmář
2018-07-10 11:27+0200, KarimAllah Ahmed:
> From: Jim Mattson 
> 
> For nested virtualization, L0 KVM manages some state for L2 guests that
> cannot be captured through the currently available IOCTLs. In fact, the
> state captured through all of these IOCTLs is usually a mix of L1 and L2
> state. It is also dependent on whether the L2 guest was running when the
> process was interrupted to save its state.
> 
> With this capability, there are two new vcpu ioctls: KVM_GET_NESTED_STATE
> and KVM_SET_NESTED_STATE. These can be used for saving and restoring a VM
> that is in VMX operation.
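
For context, the intended userspace flow is roughly the sketch below;
the exact struct layout and sizing are still under discussion in this
series, so the details (state_size, the size field) are illustrative
only:

    struct kvm_nested_state *state = calloc(1, state_size);

    state->size = state_size;
    /* Snapshot the nested (vmcs12 etc.) state on the source... */
    if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, state) < 0)
            err(1, "KVM_GET_NESTED_STATE");
    /* ...and replay it on the destination after migration. */
    if (ioctl(vcpu_fd, KVM_SET_NESTED_STATE, state) < 0)
            err(1, "KVM_SET_NESTED_STATE");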
> 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: H. Peter Anvin 
> Cc: x...@kernel.org
> Cc: k...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Jim Mattson 
> [karahmed@ - rename structs and functions and make them ready for AMD and
>  address previous comments.
>- handle nested.smm state.
>- rebase & a bit of refactoring.
>- Merge 7/8 and 8/8 into one patch. ]
> Signed-off-by: KarimAllah Ahmed 
> ---
> v4 -> v5:
> - Drop the update to KVM_REQUEST_ARCH_BASE in favor of a patch to switch to
>   u64 instead.
> - Fix commit message.
> - Handle nested.smm state as well.
> - rebase
> 
> v3 -> v4:
> - Rename function to have _nested
> 
> v2 -> v3:
> - Remove the forced VMExit from L2 after reading the kvm_state. The actual
>   problem is solved.
> - Rebase again!
> - Set nested_run_pending during restore (not sure if it makes sense yet or
>   not).
> - Reduce KVM_REQUEST_ARCH_BASE to 7 instead of 8 (the other alternative is
>   to switch everything to u64)
> 
> v1 -> v2:
> - Rename structs and functions and make them ready for AMD and address
>   previous comments.
> - Rebase & a bit of refactoring.
> - Merge 7/8 and 8/8 into one patch.
> - Force a VMExit from L2 after reading the kvm_state to avoid mixed state
>   between L1 and L2 on resurrecting the instance.
> ---
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -12976,6 +12977,197 @@ static int enable_smi_window(struct kvm_vcpu *vcpu)
> +static int set_vmcs_cache(struct kvm_vcpu *vcpu,
> +   struct kvm_nested_state __user *user_kvm_nested_state,
> +   struct kvm_nested_state *kvm_state)
> +
> +{
> [...]
> +
> + if (kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING)
> + vmx->nested.nested_run_pending = 1;
> +
> + if (check_vmentry_prereqs(vcpu, vmcs12) ||
> + check_vmentry_postreqs(vcpu, vmcs12, &exit_qual))
> + return -EINVAL;
> +
> + ret = enter_vmx_non_root_mode(vcpu);
> + if (ret)
> + return ret;
> +
> + /*
> +  * The MMU is not initialized to point at the right entities yet and
> +  * "get pages" would need to read data from the guest (i.e. we will
> +  * need to perform gpa to hpa translation). So, this request will
> +  * result in a call to nested_get_vmcs12_pages before the next
> +  * VM-entry.
> +  */
> + kvm_make_request(KVM_REQ_GET_VMCS12_PAGES, vcpu);
> +
> + vmx->nested.nested_run_pending = 1;

This is not necessary.  We're only copying state and do not add anything
that would be lost on a nested VM exit without prior VM entry.

> +

Halting the VCPU should probably be done here, just like at the end of
nested_vmx_run().
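
For reference, the tail of nested_vmx_run() does roughly this (quoting
loosely from the current tree, so check the source):

    /*
     * If we're entering a halted L2 vcpu and the L2 vcpu won't be
     * woken by event injection, halt the vcpu.
     */
    if ((vmcs12->guest_activity_state == GUEST_ACTIVITY_HLT) &&
        !(vmcs12->vm_entry_intr_info_field & INTR_INFO_VALID_MASK)) {
            vmx->nested.nested_run_pending = 0;
            return kvm_vcpu_halt(vcpu);
    }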

> + return 0;
> +}
> +
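
As an aside, the KVM_REQ_GET_VMCS12_PAGES request above would then be
serviced from vcpu_enter_guest() before the next entry, presumably
along these lines:

    if (kvm_check_request(KVM_REQ_GET_VMCS12_PAGES, vcpu))
            kvm_x86_ops->get_vmcs12_pages(vcpu);

so the gpa->hpa translations only happen once the MMU is set up.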
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> @@ -963,6 +963,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_GET_MSR_FEATURES 153
>  #define KVM_CAP_HYPERV_EVENTFD 154
>  #define KVM_CAP_HYPERV_TLBFLUSH 155
> +#define KVM_CAP_STATE 156

KVM_CAP_NESTED_STATE

(good documentation makes code worse. :])


Re: [PATCH v5 2/2] kvm: nVMX: Introduce KVM_CAP_NESTED_STATE

2018-07-18 Thread Paolo Bonzini
On 10/07/2018 11:27, KarimAllah Ahmed wrote:
> @@ -11772,7 +11772,6 @@ static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu)
>   if (prepare_vmcs02(vcpu, vmcs12, &exit_qual))
>   goto fail;
>  
> - nested_get_vmcs12_pages(vcpu, vmcs12);
>  
>   r = EXIT_REASON_MSR_LOAD_FAIL;
>   msr_entry_idx = nested_vmx_load_msr(vcpu,

I think this is not enough: the MSR load should not be redone on
KVM_SET_NESTED_STATE.  This issue is preexisting and happens for SMM
exit as well.  SMM exit in fact also needs to defer
nested_get_vmcs12_pages, or the pages could be read from SMRAM.
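
A hypothetical shape for that (the from_vmentry parameter name is mine;
the point is that the vmcs12 MSR-load list is only processed when
emulating an actual VM entry):

    static int enter_vmx_non_root_mode(struct kvm_vcpu *vcpu,
                                       bool from_vmentry)
    {
            ...
            if (from_vmentry) {
                    msr_entry_idx = nested_vmx_load_msr(vcpu,
                            vmcs12->vm_entry_msr_load_addr,
                            vmcs12->vm_entry_msr_load_count);
                    if (msr_entry_idx)
                            goto fail;
            }
            ...
    }

KVM_SET_NESTED_STATE (and the SMM-exit path) would pass from_vmentry ==
false, so the MSR list is neither re-read from guest memory nor
re-applied.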

I'll send a v6 patch and a testcase, and in the meantime I have applied
patch 1.

Paolo
