Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping

2019-08-14 Thread Paolo Bonzini
On 14/08/19 14:36, Nicusor CITU wrote:
> Thank you for signaling this. This piece of code is a leftover from the
> initial attempt to get single-stepping working.
> Based on the latest results, we do not actually need to change
> interruptibility during single-stepping. It is enough to enable the MTF
> and simply suppress any pending interrupt injection before letting the
> vCPU enter the guest.
> 

This is exactly what testcases are for...

Paolo
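The approach described in the quoted message (arm MTF, hold back any pending interrupt injection until the single-step completes) can be modeled in miniature. All struct and helper names below are illustrative stand-ins, not the actual KVM/KVMI code:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the described flow: before VM-entry, arm the monitor
 * trap flag (MTF) and defer any queued interrupt injection; once the
 * single-step completes, disarm MTF and re-queue the injection.
 * These types and helpers are hypothetical, not KVM's.
 */
struct mock_vcpu {
	bool mtf_enabled;          /* single-step via MTF is armed */
	bool interrupt_pending;    /* an injection was queued */
	bool interrupt_deferred;   /* held back until single-step ends */
};

void prepare_single_step_entry(struct mock_vcpu *vcpu)
{
	vcpu->mtf_enabled = true;

	/* Do not inject while single-stepping; defer it instead. */
	if (vcpu->interrupt_pending) {
		vcpu->interrupt_pending = false;
		vcpu->interrupt_deferred = true;
	}
}

void finish_single_step(struct mock_vcpu *vcpu)
{
	vcpu->mtf_enabled = false;

	/* Restore the injection that was suppressed, if any. */
	if (vcpu->interrupt_deferred) {
		vcpu->interrupt_deferred = false;
		vcpu->interrupt_pending = true;
	}
}
```

The point of the model is only the ordering: the injection is never lost, merely postponed past the single-stepped instruction.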
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping

2019-08-13 Thread Adalbert Lazăr
On Mon, 12 Aug 2019 13:50:39 -0700, Sean Christopherson wrote:
> On Fri, Aug 09, 2019 at 07:00:19PM +0300, Adalbert Lazăr wrote:
> > From: Nicușor Cîțu 
> > 
> > This would be used either if the introspection tool requests it as a
> > reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
> > be handled by the x86 emulator during the handling of a VMEXIT. In
> > these situations, all other vCPUs are kicked and held, the EPT-based
> > protection is removed and the guest is single-stepped by the vCPU that
> > triggered the initial VMEXIT. Upon completion, the EPT-based protection
> > is reinstalled and all vCPUs are allowed to return to the guest.
> > 
> > This is a rather slow workaround that kicks in occasionally. In the
> > future, the most frequently single-stepped instructions should be added
> > to the emulator (usually, stores to and from memory - SSE/AVX).
> > 
> > For the moment it works only on Intel.
> > 
> > CC: Jim Mattson 
> > CC: Sean Christopherson 
> > CC: Joerg Roedel 
> > Signed-off-by: Nicușor Cîțu 
> > Co-developed-by: Mihai Donțu 
> > Signed-off-by: Mihai Donțu 
> > Co-developed-by: Adalbert Lazăr 
> > Signed-off-by: Adalbert Lazăr 
> > ---
> >  arch/x86/include/asm/kvm_host.h |   3 +
> >  arch/x86/kvm/kvmi.c |  47 ++-
> >  arch/x86/kvm/svm.c  |   5 ++
> >  arch/x86/kvm/vmx/vmx.c  |  17 
> >  arch/x86/kvm/x86.c  |  19 +
> >  include/linux/kvmi.h|   4 +
> >  virt/kvm/kvmi.c | 145 +++-
> >  virt/kvm/kvmi_int.h |  16 
> >  8 files changed, 253 insertions(+), 3 deletions(-)
> > 

[...] Will do.

> > diff --git a/virt/kvm/kvmi_int.h b/virt/kvm/kvmi_int.h
> > index d7f9858d3e97..1550fe33ed48 100644
> > --- a/virt/kvm/kvmi_int.h
> > +++ b/virt/kvm/kvmi_int.h
> > @@ -126,6 +126,9 @@ struct kvmi_vcpu {
> > DECLARE_BITMAP(high, KVMI_NUM_MSR);
> > } msr_mask;
> >  
> > +   bool ss_owner;
> 
> Why is single-stepping mutually exclusive across all vCPUs?  Does that
> always have to be the case?

I never thought of single-stepping multiple vCPUs at the same time.

If one vCPU relaxes the access to a guest page while a second one,
finishing single-stepping, restores the 'r--' flags, the first one
will get another page fault and relax the page access again. It might
be doable, but before starting to single-step a vCPU we might replace
guest memory (as requested by the introspection tool) and we would have
to use a lock for this.

However, we would like to use alternate EPT views with single-stepping,
so we might replace this patch.

> > +   bool ss_requested;
> > +
> > struct list_head job_list;
> > spinlock_t job_lock;
> >  
> > @@ -151,6 +154,15 @@ struct kvmi {
> > DECLARE_BITMAP(event_allow_mask, KVMI_NUM_EVENTS);
> > DECLARE_BITMAP(vm_ev_mask, KVMI_NUM_EVENTS);
> >  
> > +#define SINGLE_STEP_MAX_DEPTH 8
> > +   struct {
> > +   gfn_t gfn;
> > +   u8 old_access;
> > +   u32 old_write_bitmap;
> > +   } ss_context[SINGLE_STEP_MAX_DEPTH];
> > +   u8 ss_level;
> > +   atomic_t ss_active;
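The ss_context fields quoted above amount to a small fixed-depth stack of pages whose protections were relaxed during the single-step, restored in reverse order afterwards. A rough userspace model of that bookkeeping (hypothetical helper names, not the patch code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SINGLE_STEP_MAX_DEPTH 8

/* Modeled after the quoted ss_context fields. */
struct ss_ctx {
	uint64_t gfn;            /* guest frame whose access was relaxed */
	uint8_t old_access;      /* protection to restore afterwards */
	uint32_t old_write_bitmap;
};

struct ss_state {
	struct ss_ctx ctx[SINGLE_STEP_MAX_DEPTH];
	uint8_t level;           /* current stack depth */
};

/* Record a page whose protection was relaxed for this single-step. */
bool ss_push(struct ss_state *s, uint64_t gfn, uint8_t access,
	     uint32_t bitmap)
{
	if (s->level >= SINGLE_STEP_MAX_DEPTH)
		return false;	/* too many pages touched; caller bails out */

	s->ctx[s->level++] = (struct ss_ctx){ gfn, access, bitmap };
	return true;
}

/* Restore entries LIFO once single-stepping completes. */
bool ss_pop(struct ss_state *s, struct ss_ctx *out)
{
	if (!s->level)
		return false;

	*out = s->ctx[--s->level];
	return true;
}
```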

Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping

2019-08-12 Thread Sean Christopherson
On Fri, Aug 09, 2019 at 07:00:19PM +0300, Adalbert Lazăr wrote:
> From: Nicușor Cîțu 
> 
> This would be used either if the introspection tool requests it as a
> reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
> be handled by the x86 emulator during the handling of a VMEXIT. In
> these situations, all other vCPUs are kicked and held, the EPT-based
> protection is removed and the guest is single-stepped by the vCPU that
> triggered the initial VMEXIT. Upon completion, the EPT-based protection
> is reinstalled and all vCPUs are allowed to return to the guest.
> 
> This is a rather slow workaround that kicks in occasionally. In the
> future, the most frequently single-stepped instructions should be added
> to the emulator (usually, stores to and from memory - SSE/AVX).
> 
> For the moment it works only on Intel.
> 
> CC: Jim Mattson 
> CC: Sean Christopherson 
> CC: Joerg Roedel 
> Signed-off-by: Nicușor Cîțu 
> Co-developed-by: Mihai Donțu 
> Signed-off-by: Mihai Donțu 
> Co-developed-by: Adalbert Lazăr 
> Signed-off-by: Adalbert Lazăr 
> ---
>  arch/x86/include/asm/kvm_host.h |   3 +
>  arch/x86/kvm/kvmi.c |  47 ++-
>  arch/x86/kvm/svm.c  |   5 ++
>  arch/x86/kvm/vmx/vmx.c  |  17 
>  arch/x86/kvm/x86.c  |  19 +
>  include/linux/kvmi.h|   4 +
>  virt/kvm/kvmi.c | 145 +++-
>  virt/kvm/kvmi_int.h |  16 
>  8 files changed, 253 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index ad36a5fc2048..60e2c298d469 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1016,6 +1016,7 @@ struct kvm_x86_ops {
>   void (*msr_intercept)(struct kvm_vcpu *vcpu, unsigned int msr,
>   bool enable);
>   bool (*desc_intercept)(struct kvm_vcpu *vcpu, bool enable);
> + void (*set_mtf)(struct kvm_vcpu *vcpu, bool enable);

MTF is a VMX-specific implementation of single-stepping; this should be
enable_single_step() or something along those lines.  For example, I assume
SVM could implement something that is mostly functional via RFLAGS.TF.
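A minimal sketch of what such an SVM fallback might look like, assuming RFLAGS.TF (bit 8) stands in for MTF. The helper name is hypothetical, and real code would also need to intercept #DB and preserve a guest-owned TF:

```c
#include <assert.h>
#include <stdint.h>

/* RFLAGS.TF (trap flag), bit 8; architectural, per the SDM/APM. */
#define X86_EFLAGS_TF (1UL << 8)

/*
 * Hypothetical enable_single_step() body for a vendor without MTF:
 * set TF so the CPU raises #DB after the next instruction, clear it
 * to stop.  Pure bit manipulation on a saved RFLAGS value.
 */
uint64_t set_single_step(uint64_t rflags, int enable)
{
	return enable ? (rflags | X86_EFLAGS_TF)
		      : (rflags & ~X86_EFLAGS_TF);
}
```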

>   void (*cr3_write_exiting)(struct kvm_vcpu *vcpu, bool enable);
>   bool (*nested_pagefault)(struct kvm_vcpu *vcpu);
>   bool (*spt_fault)(struct kvm_vcpu *vcpu);
> @@ -1628,6 +1629,8 @@ void kvm_arch_msr_intercept(struct kvm_vcpu *vcpu, 
> unsigned int msr,
>   bool enable);
>  bool kvm_mmu_nested_pagefault(struct kvm_vcpu *vcpu);
>  bool kvm_spt_fault(struct kvm_vcpu *vcpu);
> +void kvm_set_mtf(struct kvm_vcpu *vcpu, bool enable);
> +void kvm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);
>  void kvm_control_cr3_write_exiting(struct kvm_vcpu *vcpu, bool enable);
>  
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c
> index 04cac5b8a4d0..f0ab4bd9eb37 100644
> --- a/arch/x86/kvm/kvmi.c
> +++ b/arch/x86/kvm/kvmi.c
> @@ -520,7 +520,6 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
> gva_t gva,
>   u32 ctx_size;
>   u64 ctx_addr;
>   u32 action;
> - bool singlestep_ignored;
>   bool ret = false;
>  
>   if (!kvm_spt_fault(vcpu))
> @@ -533,7 +532,7 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
> gva_t gva,
>   if (ivcpu->effective_rep_complete)
>   return true;
>  
> - action = kvmi_msg_send_pf(vcpu, gpa, gva, access, _ignored,
> + action = kvmi_msg_send_pf(vcpu, gpa, gva, access, >ss_requested,
> >rep_complete, _addr,
> ivcpu->ctx_data, _size);
>  
> @@ -547,6 +546,8 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
> gva_t gva,
>   ret = true;
>   break;
>   case KVMI_EVENT_ACTION_RETRY:
> + if (ivcpu->ss_requested && !kvmi_start_ss(vcpu, gpa, access))
> + ret = true;
>   break;
>   default:
>   kvmi_handle_common_event_actions(vcpu, action, "PF");
> @@ -758,6 +759,48 @@ int kvmi_arch_cmd_control_cr(struct kvm_vcpu *vcpu,
>   return 0;
>  }
>  
> +void kvmi_arch_start_single_step(struct kvm_vcpu *vcpu)
> +{
> + kvm_set_mtf(vcpu, true);
> +
> + /*
> +  * Set block by STI only if the RFLAGS.IF = 1.
> +  * Blocking by both STI and MOV/POP SS is not possible.
> +  */
> + if (kvm_arch_interrupt_allowed(vcpu))
> + kvm_set_interrupt_shadow(vcpu, KVM_X86_SHADOW_INT_STI);

This is wrong, the STI shadow only exists if interrupts were unblocked
prior to STI.  I'm guessing this is a hack to work around
kvmi_arch_stop_single_step() not properly handling the clearing case.

> +
> +}
> +
> +void kvmi_arch_stop_single_step(struct kvm_vcpu *vcpu)
> +{
> + kvm_set_mtf(vcpu, false);
> + /*
> +  * The blocking by STI is cleared after the guest

[RFC PATCH v6 64/92] kvm: introspection: add single-stepping

2019-08-09 Thread Adalbert Lazăr
From: Nicușor Cîțu 

This would be used either if the introspection tool requests it as a
reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
be handled by the x86 emulator during the handling of a VMEXIT. In
these situations, all other vCPUs are kicked and held, the EPT-based
protection is removed and the guest is single-stepped by the vCPU that
triggered the initial VMEXIT. Upon completion, the EPT-based protection
is reinstalled and all vCPUs are allowed to return to the guest.

This is a rather slow workaround that kicks in occasionally. In the
future, the most frequently single-stepped instructions should be added
to the emulator (usually, stores to and from memory - SSE/AVX).

For the moment it works only on Intel.

CC: Jim Mattson 
CC: Sean Christopherson 
CC: Joerg Roedel 
Signed-off-by: Nicușor Cîțu 
Co-developed-by: Mihai Donțu 
Signed-off-by: Mihai Donțu 
Co-developed-by: Adalbert Lazăr 
Signed-off-by: Adalbert Lazăr 
---
 arch/x86/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/kvmi.c |  47 ++-
 arch/x86/kvm/svm.c  |   5 ++
 arch/x86/kvm/vmx/vmx.c  |  17 
 arch/x86/kvm/x86.c  |  19 +
 include/linux/kvmi.h|   4 +
 virt/kvm/kvmi.c | 145 +++-
 virt/kvm/kvmi_int.h |  16 
 8 files changed, 253 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ad36a5fc2048..60e2c298d469 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1016,6 +1016,7 @@ struct kvm_x86_ops {
void (*msr_intercept)(struct kvm_vcpu *vcpu, unsigned int msr,
bool enable);
bool (*desc_intercept)(struct kvm_vcpu *vcpu, bool enable);
+   void (*set_mtf)(struct kvm_vcpu *vcpu, bool enable);
void (*cr3_write_exiting)(struct kvm_vcpu *vcpu, bool enable);
bool (*nested_pagefault)(struct kvm_vcpu *vcpu);
bool (*spt_fault)(struct kvm_vcpu *vcpu);
@@ -1628,6 +1629,8 @@ void kvm_arch_msr_intercept(struct kvm_vcpu *vcpu, 
unsigned int msr,
bool enable);
 bool kvm_mmu_nested_pagefault(struct kvm_vcpu *vcpu);
 bool kvm_spt_fault(struct kvm_vcpu *vcpu);
+void kvm_set_mtf(struct kvm_vcpu *vcpu, bool enable);
+void kvm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);
 void kvm_control_cr3_write_exiting(struct kvm_vcpu *vcpu, bool enable);
 
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c
index 04cac5b8a4d0..f0ab4bd9eb37 100644
--- a/arch/x86/kvm/kvmi.c
+++ b/arch/x86/kvm/kvmi.c
@@ -520,7 +520,6 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
gva_t gva,
u32 ctx_size;
u64 ctx_addr;
u32 action;
-   bool singlestep_ignored;
bool ret = false;
 
if (!kvm_spt_fault(vcpu))
@@ -533,7 +532,7 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
gva_t gva,
if (ivcpu->effective_rep_complete)
return true;
 
-   action = kvmi_msg_send_pf(vcpu, gpa, gva, access, _ignored,
+   action = kvmi_msg_send_pf(vcpu, gpa, gva, access, >ss_requested,
  >rep_complete, _addr,
  ivcpu->ctx_data, _size);
 
@@ -547,6 +546,8 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, 
gva_t gva,
ret = true;
break;
case KVMI_EVENT_ACTION_RETRY:
+   if (ivcpu->ss_requested && !kvmi_start_ss(vcpu, gpa, access))
+   ret = true;
break;
default:
kvmi_handle_common_event_actions(vcpu, action, "PF");
@@ -758,6 +759,48 @@ int kvmi_arch_cmd_control_cr(struct kvm_vcpu *vcpu,
return 0;
 }
 
+void kvmi_arch_start_single_step(struct kvm_vcpu *vcpu)
+{
+   kvm_set_mtf(vcpu, true);
+
+   /*
+* Set blocking by STI only if RFLAGS.IF = 1.
+* Blocking by both STI and MOV/POP SS is not possible.
+*/
+   if (kvm_arch_interrupt_allowed(vcpu))
+   kvm_set_interrupt_shadow(vcpu, KVM_X86_SHADOW_INT_STI);
+
+}
+
+void kvmi_arch_stop_single_step(struct kvm_vcpu *vcpu)
+{
+   kvm_set_mtf(vcpu, false);
+   /*
+* The blocking by STI is cleared after the guest
+* executes one instruction or incurs an exception.
+* However, we might stop single-stepping before entering the
+* guest, so make sure the STI blocking is cleared.
+*/
+   kvm_set_interrupt_shadow(vcpu, 0);
+}
+
+u8 kvmi_arch_relax_page_access(u8 old, u8 new)
+{
+   u8 ret = old | new;
+
+   /*
+* An SPTE entry with just the -wx bits set can trigger a
+* misconfiguration error from the hardware, as is the case
+* on x86, where this access mode is used to mark I/O memory.
+* Thus, we make sure that -wx accesses are translated to rwx.
+*/
+
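The archived message is cut off here. Judging from the comment, the rest of the function presumably promotes a -wx result to rwx before returning. A standalone sketch of that logic, with assumed (illustrative) bit values, might look like:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout for illustration; not necessarily KVMI's values. */
#define KVMI_PAGE_ACCESS_R 0x4
#define KVMI_PAGE_ACCESS_W 0x2
#define KVMI_PAGE_ACCESS_X 0x1

/*
 * Union of the old and new access bits, with -wx promoted to rwx:
 * a write+exec mapping without read is the combination reserved for
 * I/O memory on x86 and can trigger an EPT misconfiguration.
 */
uint8_t relax_page_access(uint8_t old, uint8_t new)
{
	uint8_t ret = old | new;

	if ((ret & KVMI_PAGE_ACCESS_W) && (ret & KVMI_PAGE_ACCESS_X) &&
	    !(ret & KVMI_PAGE_ACCESS_R))
		ret |= KVMI_PAGE_ACCESS_R;

	return ret;
}
```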