Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-21 Thread Wanpeng Li
2017-06-21 0:12 GMT+08:00 Radim Krčmář :
> 2017-06-20 05:47+0800, Wanpeng Li:
>> 2017-06-19 22:51 GMT+08:00 Radim Krčmář :
>> > 2017-06-17 13:52+0800, Wanpeng Li:
>> >> 2017-06-16 23:38 GMT+08:00 Radim Krčmář :
>> >> > 2017-06-16 22:24+0800, Wanpeng Li:
>> >> >> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
>> >> >> > 2017-06-14 19:26-0700, Wanpeng Li:
>> >> >> >> From: Wanpeng Li 
>> >> >> >>
>> >> >> >> Add an async_page_fault field to vcpu->arch.exception to identify an
>> >> >> >> async page fault, and construct the expected vm-exit information
>> >> >> >> fields. Force a nested VM exit from nested_vmx_check_exception() if
>> >> >> >> the injected #PF is an async page fault.
>> >> >> >>
>> >> >> >> Cc: Paolo Bonzini 
>> >> >> >> Cc: Radim Krčmář 
>> >> >> >> Signed-off-by: Wanpeng Li 
>> >> >> >> ---
>> >> >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >> >> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>> >> >> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>> >> >> >>  {
>> >> >> >>   ++vcpu->stat.pf_guest;
>> >> >> >> - vcpu->arch.cr2 = fault->address;
>> >> >> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>> >> >> >
>> >> >> > I think we need to act as if arch.exception.async_page_fault was not
>> >> >> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
>> >> >> > migrate with pending async_page_fault exception, we'd inject it as a
>> >> >> > normal #PF, which could confuse/kill the nested guest.
>> >> >> >
>> >> >> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
>> >> >> > sanity as well.
>> >> >>
>> >> >> Do you mean we should add a field like async_page_fault to
>> >> >> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
>> >> >> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
>> >> >> restore events->exception.async_page_fault to
>> >> >> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
>> >> >
>> >> > No, I thought we could get away with a disgusting hack of hiding the
>> >> > exception from userspace, which would work for migration, but not if
>> >> > local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
>> >> >
>> >> > Extending the userspace interface would work, but I'd do it as a last
>> >> > resort, after all conservative solutions have failed.
>> >> > async_pf migration is very crude, so exposing the exception is just an
>> >> > ugly workaround for the local case.  Adding the flag would also require
>> >> > userspace configuration of async_pf features for the guest to keep
>> >> > compatibility.
>> >> >
>> >> > I see two options that might be simpler than adding the userspace flag:
>> >> >
>> >> >  1) do the nested VM exit sooner, at the place where we now queue #PF,
>> >> >  2) queue the #PF later, save the async_pf in some intermediate
>> >> > structure and consume it at the place where you proposed the nested
>> >> > VM exit.
>> >>
>> >> How about something like this to not report exception events when
>> >> "is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
>> >> vcpu->arch.exception.async_page_fault" holds, since losing a reschedule
>> >> optimization is not that important in L1.
>> >>
>> >> @@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
>> >>  					       struct kvm_vcpu_events *events)
>> >>  {
>> >>  	process_nmi(vcpu);
>> >> -	events->exception.injected =
>> >> -		vcpu->arch.exception.pending &&
>> >> -		!kvm_exception_is_soft(vcpu->arch.exception.nr);
>> >> -	events->exception.nr = vcpu->arch.exception.nr;
>> >> -	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
>> >> -	events->exception.pad = 0;
>> >> -	events->exception.error_code = vcpu->arch.exception.error_code;
>> >> +	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
>> >> +	      vcpu->arch.exception.async_page_fault)) {
>> >> +		events->exception.injected =
>> >> +			vcpu->arch.exception.pending &&
>> >> +			!kvm_exception_is_soft(vcpu->arch.exception.nr);
>> >> +		events->exception.nr = vcpu->arch.exception.nr;
>> >> +		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
>> >> +		events->exception.pad = 0;
>> >> +		events->exception.error_code = vcpu->arch.exception.error_code;
>> >> +	}
>> >
>> > This adds a bug when userspace does KVM_GET_VCPU_EVENTS and
>> > KVM_SET_VCPU_EVENTS without migration -- KVM would drop the async_pf and
>> > an L1 process gets stuck as a result.
>> >
>> > We'd need to add a similar condition to
>> > kvm_vcpu_ioctl_x86_set_vcpu_events(), so userspace SET doesn't drop it,
>> > but that is far beyond the realm of acceptable code.
>>
>> Do you mean the current status of patchset v2 can be accepted?
>> Otherwise, what should be done next?

Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-20 Thread Radim Krčmář
2017-06-20 05:47+0800, Wanpeng Li:
> 2017-06-19 22:51 GMT+08:00 Radim Krčmář :
> > 2017-06-17 13:52+0800, Wanpeng Li:
> >> 2017-06-16 23:38 GMT+08:00 Radim Krčmář :
> >> > 2017-06-16 22:24+0800, Wanpeng Li:
> >> >> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
> >> >> > 2017-06-14 19:26-0700, Wanpeng Li:
> >> >> >> From: Wanpeng Li 
> >> >> >>
> >> >> >> Add an async_page_fault field to vcpu->arch.exception to identify an
> >> >> >> async page fault, and construct the expected vm-exit information
> >> >> >> fields. Force a nested VM exit from nested_vmx_check_exception() if
> >> >> >> the injected #PF is an async page fault.
> >> >> >>
> >> >> >> Cc: Paolo Bonzini 
> >> >> >> Cc: Radim Krčmář 
> >> >> >> Signed-off-by: Wanpeng Li 
> >> >> >> ---
> >> >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> >> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
> >> >> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
> >> >> >>  {
> >> >> >>   ++vcpu->stat.pf_guest;
> >> >> >> - vcpu->arch.cr2 = fault->address;
> >> >> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
> >> >> >
> >> >> > I think we need to act as if arch.exception.async_page_fault was not
> >> >> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
> >> >> > migrate with pending async_page_fault exception, we'd inject it as a
> >> >> > normal #PF, which could confuse/kill the nested guest.
> >> >> >
> >> >> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
> >> >> > sanity as well.
> >> >>
> >> >> Do you mean we should add a field like async_page_fault to
> >> >> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
> >> >> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
> >> >> restore events->exception.async_page_fault to
> >> >> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
> >> >
> >> > No, I thought we could get away with a disgusting hack of hiding the
> >> > exception from userspace, which would work for migration, but not if
> >> > local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
> >> >
> >> > Extending the userspace interface would work, but I'd do it as a last
> >> > resort, after all conservative solutions have failed.
> >> > async_pf migration is very crude, so exposing the exception is just an
> >> > ugly workaround for the local case.  Adding the flag would also require
> >> > userspace configuration of async_pf features for the guest to keep
> >> > compatibility.
> >> >
> >> > I see two options that might be simpler than adding the userspace flag:
> >> >
> >> >  1) do the nested VM exit sooner, at the place where we now queue #PF,
> >> >  2) queue the #PF later, save the async_pf in some intermediate
> >> > structure and consume it at the place where you proposed the nested
> >> > VM exit.
> >>
> >> How about something like this to not report exception events when
> >> "is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
> >> vcpu->arch.exception.async_page_fault" holds, since losing a reschedule
> >> optimization is not that important in L1.
> >>
> >> @@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
> >>  					       struct kvm_vcpu_events *events)
> >>  {
> >>  	process_nmi(vcpu);
> >> -	events->exception.injected =
> >> -		vcpu->arch.exception.pending &&
> >> -		!kvm_exception_is_soft(vcpu->arch.exception.nr);
> >> -	events->exception.nr = vcpu->arch.exception.nr;
> >> -	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
> >> -	events->exception.pad = 0;
> >> -	events->exception.error_code = vcpu->arch.exception.error_code;
> >> +	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
> >> +	      vcpu->arch.exception.async_page_fault)) {
> >> +		events->exception.injected =
> >> +			vcpu->arch.exception.pending &&
> >> +			!kvm_exception_is_soft(vcpu->arch.exception.nr);
> >> +		events->exception.nr = vcpu->arch.exception.nr;
> >> +		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
> >> +		events->exception.pad = 0;
> >> +		events->exception.error_code = vcpu->arch.exception.error_code;
> >> +	}
> >
> > This adds a bug when userspace does KVM_GET_VCPU_EVENTS and
> > KVM_SET_VCPU_EVENTS without migration -- KVM would drop the async_pf and
> > an L1 process gets stuck as a result.
> >
> > We'd need to add a similar condition to
> > kvm_vcpu_ioctl_x86_set_vcpu_events(), so userspace SET doesn't drop it,
> > but that is far beyond the realm of acceptable code.
> 
> Do you mean the current status of patchset v2 can be accepted?
> Otherwise, what should be done next?

No, sorry, that one has the migration bug (the async_page_fault gets
dropped on 

Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-19 Thread Wanpeng Li
2017-06-19 22:51 GMT+08:00 Radim Krčmář :
> 2017-06-17 13:52+0800, Wanpeng Li:
>> 2017-06-16 23:38 GMT+08:00 Radim Krčmář :
>> > 2017-06-16 22:24+0800, Wanpeng Li:
>> >> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
>> >> > 2017-06-14 19:26-0700, Wanpeng Li:
>> >> >> From: Wanpeng Li 
>> >> >>
>> >> >> Add an async_page_fault field to vcpu->arch.exception to identify an
>> >> >> async page fault, and construct the expected vm-exit information
>> >> >> fields. Force a nested VM exit from nested_vmx_check_exception() if
>> >> >> the injected #PF is an async page fault.
>> >> >>
>> >> >> Cc: Paolo Bonzini 
>> >> >> Cc: Radim Krčmář 
>> >> >> Signed-off-by: Wanpeng Li 
>> >> >> ---
>> >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>> >> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>> >> >>  {
>> >> >>   ++vcpu->stat.pf_guest;
>> >> >> - vcpu->arch.cr2 = fault->address;
>> >> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>> >> >
>> >> > I think we need to act as if arch.exception.async_page_fault was not
>> >> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
>> >> > migrate with pending async_page_fault exception, we'd inject it as a
>> >> > normal #PF, which could confuse/kill the nested guest.
>> >> >
>> >> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
>> >> > sanity as well.
>> >>
>> >> Do you mean we should add a field like async_page_fault to
>> >> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
>> >> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
>> >> restore events->exception.async_page_fault to
>> >> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
>> >
>> > No, I thought we could get away with a disgusting hack of hiding the
>> > exception from userspace, which would work for migration, but not if
>> > local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
>> >
>> > Extending the userspace interface would work, but I'd do it as a last
>> > resort, after all conservative solutions have failed.
>> > async_pf migration is very crude, so exposing the exception is just an
>> > ugly workaround for the local case.  Adding the flag would also require
>> > userspace configuration of async_pf features for the guest to keep
>> > compatibility.
>> >
>> > I see two options that might be simpler than adding the userspace flag:
>> >
>> >  1) do the nested VM exit sooner, at the place where we now queue #PF,
>> >  2) queue the #PF later, save the async_pf in some intermediate
>> > structure and consume it at the place where you proposed the nested
>> > VM exit.
>>
>> How about something like this to not report exception events when
>> "is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
>> vcpu->arch.exception.async_page_fault" holds, since losing a reschedule
>> optimization is not that important in L1.
>>
>> @@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
>>  					       struct kvm_vcpu_events *events)
>>  {
>>  	process_nmi(vcpu);
>> -	events->exception.injected =
>> -		vcpu->arch.exception.pending &&
>> -		!kvm_exception_is_soft(vcpu->arch.exception.nr);
>> -	events->exception.nr = vcpu->arch.exception.nr;
>> -	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
>> -	events->exception.pad = 0;
>> -	events->exception.error_code = vcpu->arch.exception.error_code;
>> +	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
>> +	      vcpu->arch.exception.async_page_fault)) {
>> +		events->exception.injected =
>> +			vcpu->arch.exception.pending &&
>> +			!kvm_exception_is_soft(vcpu->arch.exception.nr);
>> +		events->exception.nr = vcpu->arch.exception.nr;
>> +		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
>> +		events->exception.pad = 0;
>> +		events->exception.error_code = vcpu->arch.exception.error_code;
>> +	}
>
> This adds a bug when userspace does KVM_GET_VCPU_EVENTS and
> KVM_SET_VCPU_EVENTS without migration -- KVM would drop the async_pf and
> an L1 process gets stuck as a result.
>
> We'd need to add a similar condition to
> kvm_vcpu_ioctl_x86_set_vcpu_events(), so userspace SET doesn't drop it,
> but that is far beyond the realm of acceptable code.

Do you mean the current status of patchset v2 can be accepted?
Otherwise, what should be done next?

Regards,
Wanpeng Li


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-19 Thread Radim Krčmář
2017-06-17 13:52+0800, Wanpeng Li:
> 2017-06-16 23:38 GMT+08:00 Radim Krčmář :
> > 2017-06-16 22:24+0800, Wanpeng Li:
> >> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
> >> > 2017-06-14 19:26-0700, Wanpeng Li:
> >> >> From: Wanpeng Li 
> >> >>
> >> >> Add an async_page_fault field to vcpu->arch.exception to identify an
> >> >> async page fault, and construct the expected vm-exit information
> >> >> fields. Force a nested VM exit from nested_vmx_check_exception() if
> >> >> the injected #PF is an async page fault.
> >> >>
> >> >> Cc: Paolo Bonzini 
> >> >> Cc: Radim Krčmář 
> >> >> Signed-off-by: Wanpeng Li 
> >> >> ---
> >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
> >> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
> >> >>  {
> >> >>   ++vcpu->stat.pf_guest;
> >> >> - vcpu->arch.cr2 = fault->address;
> >> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
> >> >
> >> > I think we need to act as if arch.exception.async_page_fault was not
> >> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
> >> > migrate with pending async_page_fault exception, we'd inject it as a
> >> > normal #PF, which could confuse/kill the nested guest.
> >> >
> >> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
> >> > sanity as well.
> >>
> >> Do you mean we should add a field like async_page_fault to
> >> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
> >> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
> >> restore events->exception.async_page_fault to
> >> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
> >
> > No, I thought we could get away with a disgusting hack of hiding the
> > exception from userspace, which would work for migration, but not if
> > local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
> >
> > Extending the userspace interface would work, but I'd do it as a last
> > resort, after all conservative solutions have failed.
> > async_pf migration is very crude, so exposing the exception is just an
> > ugly workaround for the local case.  Adding the flag would also require
> > userspace configuration of async_pf features for the guest to keep
> > compatibility.
> >
> > I see two options that might be simpler than adding the userspace flag:
> >
> >  1) do the nested VM exit sooner, at the place where we now queue #PF,
> >  2) queue the #PF later, save the async_pf in some intermediate
> > structure and consume it at the place where you proposed the nested
> > VM exit.
> 
> How about something like this to not report exception events when
> "is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
> vcpu->arch.exception.async_page_fault" holds, since losing a reschedule
> optimization is not that important in L1.
> 
> @@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
>  					       struct kvm_vcpu_events *events)
>  {
>  	process_nmi(vcpu);
> -	events->exception.injected =
> -		vcpu->arch.exception.pending &&
> -		!kvm_exception_is_soft(vcpu->arch.exception.nr);
> -	events->exception.nr = vcpu->arch.exception.nr;
> -	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
> -	events->exception.pad = 0;
> -	events->exception.error_code = vcpu->arch.exception.error_code;
> +	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
> +	      vcpu->arch.exception.async_page_fault)) {
> +		events->exception.injected =
> +			vcpu->arch.exception.pending &&
> +			!kvm_exception_is_soft(vcpu->arch.exception.nr);
> +		events->exception.nr = vcpu->arch.exception.nr;
> +		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
> +		events->exception.pad = 0;
> +		events->exception.error_code = vcpu->arch.exception.error_code;
> +	}

This adds a bug when userspace does KVM_GET_VCPU_EVENTS and
KVM_SET_VCPU_EVENTS without migration -- KVM would drop the async_pf and
an L1 process gets stuck as a result.

We'd need to add a similar condition to
kvm_vcpu_ioctl_x86_set_vcpu_events(), so userspace SET doesn't drop it,
but that is far beyond the realm of acceptable code.
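For concreteness, the problematic round trip looks like this from
userspace (a minimal sketch; vcpu fd setup and error handling are
elided, and this is an illustration, not code from any VMM):

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* A purely local save/restore of vCPU events -- no migration involved.
 * If KVM_GET_VCPU_EVENTS hides a pending async_pf #PF, this
 * KVM_SET_VCPU_EVENTS writes back a state without it, and the
 * exception is silently lost. */
static int events_round_trip(int vcpu_fd)
{
	struct kvm_vcpu_events events;

	memset(&events, 0, sizeof(events));
	if (ioctl(vcpu_fd, KVM_GET_VCPU_EVENTS, &events) < 0)
		return -1;
	return ioctl(vcpu_fd, KVM_SET_VCPU_EVENTS, &events);
}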

I realized this bug only after the first mail, sorry for the confusing
paragraph.


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-16 Thread Wanpeng Li
2017-06-16 23:38 GMT+08:00 Radim Krčmář :
> 2017-06-16 22:24+0800, Wanpeng Li:
>> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
>> > 2017-06-14 19:26-0700, Wanpeng Li:
>> >> From: Wanpeng Li 
>> >>
>> >> Add an async_page_fault field to vcpu->arch.exception to identify an async
>> >> page fault, and construct the expected vm-exit information fields. Force
>> >> a nested VM exit from nested_vmx_check_exception() if the injected #PF
>> >> is an async page fault.
>> >>
>> >> Cc: Paolo Bonzini 
>> >> Cc: Radim Krčmář 
>> >> Signed-off-by: Wanpeng Li 
>> >> ---
>> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>> >>  {
>> >>   ++vcpu->stat.pf_guest;
>> >> - vcpu->arch.cr2 = fault->address;
>> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>> >
>> > I think we need to act as if arch.exception.async_page_fault was not
>> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
>> > migrate with pending async_page_fault exception, we'd inject it as a
>> > normal #PF, which could confuse/kill the nested guest.
>> >
>> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
>> > sanity as well.
>>
>> Do you mean we should add a field like async_page_fault to
>> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
>> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
>> restore events->exception.async_page_fault to
>> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
>
> No, I thought we could get away with a disgusting hack of hiding the
> exception from userspace, which would work for migration, but not if
> local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
>
> Extending the userspace interface would work, but I'd do it as a last
> resort, after all conservative solutions have failed.
> async_pf migration is very crude, so exposing the exception is just an
> ugly workaround for the local case.  Adding the flag would also require
> userspace configuration of async_pf features for the guest to keep
> compatibility.
>
> I see two options that might be simpler than adding the userspace flag:
>
>  1) do the nested VM exit sooner, at the place where we now queue #PF,
>  2) queue the #PF later, save the async_pf in some intermediate
> structure and consume it at the place where you proposed the nested
> VM exit.

How about something like this to not report exception events when
"is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
vcpu->arch.exception.async_page_fault" holds, since losing a reschedule
optimization is not that important in L1.

@@ -3072,13 +3074,16 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 					       struct kvm_vcpu_events *events)
 {
 	process_nmi(vcpu);
-	events->exception.injected =
-		vcpu->arch.exception.pending &&
-		!kvm_exception_is_soft(vcpu->arch.exception.nr);
-	events->exception.nr = vcpu->arch.exception.nr;
-	events->exception.has_error_code = vcpu->arch.exception.has_error_code;
-	events->exception.pad = 0;
-	events->exception.error_code = vcpu->arch.exception.error_code;
+	if (!(is_guest_mode(vcpu) && vcpu->arch.exception.nr == PF_VECTOR &&
+	      vcpu->arch.exception.async_page_fault)) {
+		events->exception.injected =
+			vcpu->arch.exception.pending &&
+			!kvm_exception_is_soft(vcpu->arch.exception.nr);
+		events->exception.nr = vcpu->arch.exception.nr;
+		events->exception.has_error_code = vcpu->arch.exception.has_error_code;
+		events->exception.pad = 0;
+		events->exception.error_code = vcpu->arch.exception.error_code;
+	}

Regards,
Wanpeng Li


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-16 Thread Wanpeng Li
2017-06-16 23:38 GMT+08:00 Radim Krčmář :
> 2017-06-16 22:24+0800, Wanpeng Li:
>> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
>> > 2017-06-14 19:26-0700, Wanpeng Li:
>> >> From: Wanpeng Li 
>> >>
>> >> Add an async_page_fault field to vcpu->arch.exception to identify an async
>> >> page fault, and construct the expected vm-exit information fields. Force
>> >> a nested VM exit from nested_vmx_check_exception() if the injected #PF
>> >> is an async page fault.
>> >>
>> >> Cc: Paolo Bonzini 
>> >> Cc: Radim Krčmář 
>> >> Signed-off-by: Wanpeng Li 
>> >> ---
>> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>> >>  {
>> >>   ++vcpu->stat.pf_guest;
>> >> - vcpu->arch.cr2 = fault->address;
>> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>> >
>> > I think we need to act as if arch.exception.async_page_fault was not
>> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
>> > migrate with pending async_page_fault exception, we'd inject it as a
>> > normal #PF, which could confuse/kill the nested guest.
>> >
>> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
>> > sanity as well.
>>
>> Do you mean we should add a field like async_page_fault to
>> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
>> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
>> restore events->exception.async_page_fault to
>> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
>
> No, I thought we could get away with a disgusting hack of hiding the
> exception from userspace, which would work for migration, but not if
> local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...
>
> Extending the userspace interface would work, but I'd do it as a last
> resort, after all conservative solutions have failed.
> async_pf migration is very crude, so exposing the exception is just an
> ugly workaround for the local case.  Adding the flag would also require
> userspace configuration of async_pf features for the guest to keep
> compatibility.
>
> I see two options that might be simpler than adding the userspace flag:
>
>  1) do the nested VM exit sooner, at the place where we now queue #PF,
>  2) queue the #PF later, save the async_pf in some intermediate
> structure and consume it at the place where you proposed the nested
> VM exit.

Yeah, hiding it looks rational to me: even if we lose a
PAGE_NOTREADY async_pf to L1, that is just a hint for the L1 reschedule
optimization, and the PAGE_READY async_pf will be guaranteed by the
wakeup-all after live migration.
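(For reference, this weighting comes from the guest's async #PF
dispatch; the following is a simplified paraphrase of the era's
arch/x86/kernel/kvm.c handler, a sketch rather than the verbatim code:)

switch (kvm_read_and_reset_pf_reason()) {
case KVM_PV_REASON_PAGE_NOT_PRESENT:
	/* Host is still paging the memory in: park the task.
	 * Losing this event only costs a reschedule opportunity. */
	kvm_async_pf_task_wait((u32)read_cr2());
	break;
case KVM_PV_REASON_PAGE_READY:
	/* The page arrived: wake the parked task.  This event must not
	 * be lost; after migration a wakeup-all covers stragglers. */
	kvm_async_pf_task_wake((u32)read_cr2());
	break;
default:
	/* Not an async #PF: handle it as a normal page fault. */
	break;
}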

Regards,
Wanpeng Li


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-16 Thread Radim Krčmář
2017-06-16 22:24+0800, Wanpeng Li:
> 2017-06-16 21:37 GMT+08:00 Radim Krčmář :
> > 2017-06-14 19:26-0700, Wanpeng Li:
> >> From: Wanpeng Li 
> >>
> >> Add an async_page_fault field to vcpu->arch.exception to identify an async
> >> page fault, and construct the expected vm-exit information fields. Force
> >> a nested VM exit from nested_vmx_check_exception() if the injected #PF
> >> is an async page fault.
> >>
> >> Cc: Paolo Bonzini 
> >> Cc: Radim Krčmář 
> >> Signed-off-by: Wanpeng Li 
> >> ---
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
> >>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
> >>  {
> >>   ++vcpu->stat.pf_guest;
> >> - vcpu->arch.cr2 = fault->address;
> >> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
> >
> > I think we need to act as if arch.exception.async_page_fault was not
> > pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
> > migrate with pending async_page_fault exception, we'd inject it as a
> > normal #PF, which could confuse/kill the nested guest.
> >
> > And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
> > sanity as well.
> 
> Do you mean we should add a field like async_page_fault to
> kvm_vcpu_events::exception, then save arch.exception.async_page_fault
> to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
> restore events->exception.async_page_fault to
> arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?

No, I thought we could get away with a disgusting hack of hiding the
exception from userspace, which would work for migration, but not if
local userspace did KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS ...

Extending the userspace interface would work, but I'd do it as a last
resort, after all conservative solutions have failed.
async_pf migration is very crude, so exposing the exception is just an
ugly workaround for the local case.  Adding the flag would also require
userspace configuration of async_pf features for the guest to keep
compatibility.

I see two options that might be simpler than adding the userspace flag:

 1) do the nested VM exit sooner, at the place where we now queue #PF,
 2) queue the #PF later, save the async_pf in some intermediate
structure and consume it at the place where you proposed the nested
VM exit.
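A minimal sketch of what option 2 could look like -- every name below
(queued_async_pf, park_async_pf, consume_async_pf) is illustrative,
not existing KVM code:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holding slot for an async_pf #PF.  It deliberately
 * lives outside arch.exception, so KVM_GET/SET_VCPU_EVENTS never
 * see (or drop) it. */
struct queued_async_pf {
	bool     pending;
	uint32_t error_code;
	uint64_t token;      /* delivered to the guest via CR2 */
};

/* At the point where the #PF is queued today: park it instead. */
static void park_async_pf(struct queued_async_pf *apf,
			  uint32_t error_code, uint64_t token)
{
	apf->pending = true;
	apf->error_code = error_code;
	apf->token = token;
}

/* At injection time: consume it, choosing between a plain #PF for L1
 * and a forced nested VM exit while L2 runs. */
static bool consume_async_pf(struct queued_async_pf *apf, bool guest_mode)
{
	if (!apf->pending)
		return false;           /* nothing parked */
	apf->pending = false;
	return guest_mode;              /* true: nested VM exit; false: #PF */
}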


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-16 Thread Wanpeng Li
2017-06-16 21:37 GMT+08:00 Radim Krčmář :
> 2017-06-14 19:26-0700, Wanpeng Li:
>> From: Wanpeng Li 
>>
>> Add an async_page_fault field to vcpu->arch.exception to identify an async
>> page fault, and construct the expected vm-exit information fields. Force
>> a nested VM exit from nested_vmx_check_exception() if the injected #PF
>> is an async page fault.
>>
>> Cc: Paolo Bonzini 
>> Cc: Radim Krčmář 
>> Signed-off-by: Wanpeng Li 
>> ---
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>>  {
>>   ++vcpu->stat.pf_guest;
>> - vcpu->arch.cr2 = fault->address;
>> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;
>
> I think we need to act as if arch.exception.async_page_fault was not
> pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
> migrate with pending async_page_fault exception, we'd inject it as a
> normal #PF, which could confuse/kill the nested guest.
>
> And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
> sanity as well.

Do you mean we should add a field like async_page_fault to
kvm_vcpu_events::exception, then save arch.exception.async_page_fault
to events->exception.async_page_fault through KVM_GET_VCPU_EVENTS and
restore events->exception.async_page_fault to
arch.exception.async_page_fault through KVM_SET_VCPU_EVENTS?
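
Something like the hypothetical layout below, reusing the existing pad
byte of the uapi struct (illustration only; this layout was not merged):

/* Hypothetical uapi extension so KVM_GET/SET_VCPU_EVENTS can
 * round-trip the async #PF flag. */
struct kvm_vcpu_events {
        struct {
                __u8 injected;
                __u8 nr;
                __u8 has_error_code;
                __u8 async_page_fault;  /* new: repurposes the pad byte */
                __u32 error_code;
        } exception;
        /* interrupt, nmi, sipi_vector, flags, ... unchanged */
};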

Regards,
Wanpeng Li


Re: [PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-16 Thread Radim Krčmář
2017-06-14 19:26-0700, Wanpeng Li:
> From: Wanpeng Li 
> 
> Add an async_page_fault field to vcpu->arch.exception to identify an async 
> page fault, and constructs the expected vm-exit information fields. Force 
> a nested VM exit from nested_vmx_check_exception() if the injected #PF 
> is async page fault.
> 
> Cc: Paolo Bonzini 
> Cc: Radim Krčmář 
> Signed-off-by: Wanpeng Li 
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
>  void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
>  {
>   ++vcpu->stat.pf_guest;
> - vcpu->arch.cr2 = fault->address;
> + vcpu->arch.exception.async_page_fault = fault->async_page_fault;

I think we need to act as if arch.exception.async_page_fault was not
pending in kvm_vcpu_ioctl_x86_get_vcpu_events().  Otherwise, if we
migrate with pending async_page_fault exception, we'd inject it as a
normal #PF, which could confuse/kill the nested guest.

And kvm_vcpu_ioctl_x86_set_vcpu_events() should clean the flag for
sanity as well.
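
In code, that amounts to something like the fragments below (a rough
sketch of the idea, not a posted patch):

/* In kvm_vcpu_ioctl_x86_get_vcpu_events(): pretend no exception is
 * pending when it is an async #PF, so it never migrates as a #PF. */
events->exception.injected =
        vcpu->arch.exception.pending &&
        !kvm_exception_is_soft(vcpu->arch.exception.nr) &&
        !vcpu->arch.exception.async_page_fault;

/* In kvm_vcpu_ioctl_x86_set_vcpu_events(): clear the flag, since
 * userspace cannot legitimately set it. */
vcpu->arch.exception.async_page_fault = false;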

Thanks.


[PATCH v2 3/4] KVM: async_pf: Force a nested vmexit if the injected #PF is async_pf

2017-06-14 Thread Wanpeng Li
From: Wanpeng Li 

Add an async_page_fault field to vcpu->arch.exception to identify an async
page fault, and construct the expected vm-exit information fields. Force
a nested VM exit from nested_vmx_check_exception() if the injected #PF
is an async page fault.

Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Signed-off-by: Wanpeng Li 
---
 arch/x86/include/asm/kvm_emulate.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/kvm/vmx.c                 | 17 ++++++++++++++---
 arch/x86/kvm/x86.c                 |  8 +++++++-
 4 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 0559626..b5bcad9 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -23,6 +23,7 @@ struct x86_exception {
         u16 error_code;
         bool nested_page_fault;
         u64 address; /* cr2 or nested page fault gpa */
+        bool async_page_fault;
 };
 
 /*
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1f01bfb..100ad9a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -545,6 +545,7 @@ struct kvm_vcpu_arch {
                 bool reinject;
                 u8 nr;
                 u32 error_code;
+                bool async_page_fault;
         } exception;
 
         struct kvm_queued_interrupt {
@@ -645,6 +646,7 @@ struct kvm_vcpu_arch {
                 u64 msr_val;
                 u32 id;
                 bool send_user_only;
+                unsigned long nested_apf_token;
         } apf;
 
         /* OSVW MSRs (AMD only) */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index f533cc1..e7b9844 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2419,13 +2419,24 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
  * KVM wants to inject page-faults which it got to the guest. This function
  * checks whether in a nested guest, we need to inject them to L1 or L2.
  */
-static int nested_vmx_check_exception(struct kvm_vcpu *vcpu, unsigned nr)
+static int nested_vmx_check_exception(struct kvm_vcpu *vcpu)
 {
         struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+        unsigned int nr = vcpu->arch.exception.nr;
 
-        if (!(vmcs12->exception_bitmap & (1u << nr)))
+        if (!((vmcs12->exception_bitmap & (1u << nr)) ||
+                (nr == PF_VECTOR && vcpu->arch.exception.async_page_fault)))
                 return 0;
 
+        if (vcpu->arch.exception.async_page_fault) {
+                vmcs_write32(VM_EXIT_INTR_ERROR_CODE,
+                             vcpu->arch.exception.error_code);
+                nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
+                        PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
+                        INTR_INFO_DELIVER_CODE_MASK | INTR_INFO_VALID_MASK,
+                        vcpu->arch.apf.nested_apf_token);
+                return 1;
+        }
+
         nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
                           vmcs_read32(VM_EXIT_INTR_INFO),
                           vmcs_readl(EXIT_QUALIFICATION));
@@ -2442,7 +2453,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
         u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
         if (!reinject && is_guest_mode(vcpu) &&
-            nested_vmx_check_exception(vcpu, nr))
+            nested_vmx_check_exception(vcpu))
                 return;
 
         if (has_error_code) {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1b28a31..5931ce7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -452,7 +452,11 @@ EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 {
         ++vcpu->stat.pf_guest;
-        vcpu->arch.cr2 = fault->address;
+        vcpu->arch.exception.async_page_fault = fault->async_page_fault;
+        if (is_guest_mode(vcpu) && vcpu->arch.exception.async_page_fault)
+                vcpu->arch.apf.nested_apf_token = fault->address;
+        else
+                vcpu->arch.cr2 = fault->address;
         kvm_queue_exception_e(vcpu, PF_VECTOR, fault->error_code);
 }
 EXPORT_SYMBOL_GPL(kvm_inject_page_fault);
@@ -8571,6 +8575,7 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
                 fault.error_code = 0;
                 fault.nested_page_fault = false;
                 fault.address = work->arch.token;
+                fault.async_page_fault = true;
                 kvm_inject_page_fault(vcpu, &fault);
         }
 }
@@ -8593,6 +8598,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                 fault.error_code = 0;
                 fault.nested_page_fault = false;
                 fault.address = work->arch.token;
+                fault.async_page_fault = true;
                 kvm_inject_page_fault(vcpu, &fault);
         }
         vcpu->arch.apf.halted = false;
-- 
2.7.4


