Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping
On 14/08/19 14:36, Nicusor CITU wrote:
> Thank you for signaling this. This piece of code is leftover from the
> initial attempt to make single step running.
> Based on latest results, we do not actually need to change
> interruptibility during the singlestep. It is enough to enable the MTF
> and just suppress any interrupt injection (if any) before leaving the
> vcpu entering in guest.

This is exactly what testcases are for...

Paolo

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
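The approach Nicusor describes — arm MTF and simply hold back any pending interrupt injection until the step completes — can be sketched as a tiny model. All names here (`vcpu_model`, `prepare_vmentry`, etc.) are hypothetical stand-ins, not KVM code; this only illustrates the decision made before each guest entry.

```c
#include <stdbool.h>

/* Simplified model of the vcpu-entry decision described above: while
 * single-stepping with MTF, a pending interrupt is left pending rather
 * than injected, so the trap fires on the next guest instruction. */
struct vcpu_model {
	bool mtf_enabled;  /* monitor trap flag armed */
	bool irq_pending;  /* an interrupt is waiting */
	bool irq_injected; /* injected on this entry  */
};

static void begin_single_step(struct vcpu_model *v)
{
	v->mtf_enabled = true;
}

static void end_single_step(struct vcpu_model *v)
{
	v->mtf_enabled = false;
}

/* Called just before re-entering the guest: suppress injection while
 * single-stepping; the interrupt stays pending and is delivered on a
 * later entry, after the step completes. */
static void prepare_vmentry(struct vcpu_model *v)
{
	v->irq_injected = v->irq_pending && !v->mtf_enabled;
	if (v->irq_injected)
		v->irq_pending = false;
}
```

The point of the model is that no interruptibility state needs to be touched: deferring injection alone keeps the MTF exit attributable to exactly one guest instruction.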
Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping
On Mon, 12 Aug 2019 13:50:39 -0700, Sean Christopherson wrote:
> On Fri, Aug 09, 2019 at 07:00:19PM +0300, Adalbert Lazăr wrote:
> > From: Nicușor Cîțu
> >
> > This would be used either if the introspection tool requests it as a
> > reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
> > be handled by the x86 emulator during the handling of a VMEXIT. In
> > these situations, all other vCPU-s are kicked and held, the EPT-based
> > protection is removed and the guest is single-stepped by the vCPU that
> > triggered the initial VMEXIT. Upon completion, the EPT-based protection
> > is reinstalled and all vCPU-s are allowed to return to the guest.
> >
> > This is a rather slow workaround that kicks in occasionally. In the
> > future, the most frequently single-stepped instructions should be added
> > to the emulator (usually, stores to and from memory - SSE/AVX).
> >
> > For the moment it works only on Intel.
> >
> > CC: Jim Mattson
> > CC: Sean Christopherson
> > CC: Joerg Roedel
> > Signed-off-by: Nicușor Cîțu
> > Co-developed-by: Mihai Donțu
> > Signed-off-by: Mihai Donțu
> > Co-developed-by: Adalbert Lazăr
> > Signed-off-by: Adalbert Lazăr
> > ---
> >  arch/x86/include/asm/kvm_host.h |   3 +
> >  arch/x86/kvm/kvmi.c             |  47 ++-
> >  arch/x86/kvm/svm.c              |   5 ++
> >  arch/x86/kvm/vmx/vmx.c          |  17
> >  arch/x86/kvm/x86.c              |  19 +
> >  include/linux/kvmi.h            |   4 +
> >  virt/kvm/kvmi.c                 | 145 +++-
> >  virt/kvm/kvmi_int.h             |  16
> >  8 files changed, 253 insertions(+), 3 deletions(-)
> >
> > [...]

We'll do.

> > diff --git a/virt/kvm/kvmi_int.h b/virt/kvm/kvmi_int.h
> > index d7f9858d3e97..1550fe33ed48 100644
> > --- a/virt/kvm/kvmi_int.h
> > +++ b/virt/kvm/kvmi_int.h
> > @@ -126,6 +126,9 @@ struct kvmi_vcpu {
> >  		DECLARE_BITMAP(high, KVMI_NUM_MSR);
> >  	} msr_mask;
> >
> > +	bool ss_owner;
>
> Why is single-stepping mutually exclusive across all vCPUs? Does that
> always have to be the case?

I never thought to single-step multiple vCPUs at the same time.

If one vCPU relaxes the access to a guest page while a second one,
finishing single-stepping, restores the 'r--' flags, the first one will
get another page fault and relax the page access again.

It might be doable, but before starting to single-step a vCPU we might
replace guest memory (as requested by the introspection tool) and we
would have to use a lock for this.

However, we would like to use alternate EPT views with single-step, so
we might replace this patch.

> > +	bool ss_requested;
> > +
> >  	struct list_head job_list;
> >  	spinlock_t job_lock;
> >
> > @@ -151,6 +154,15 @@ struct kvmi {
> >  	DECLARE_BITMAP(event_allow_mask, KVMI_NUM_EVENTS);
> >  	DECLARE_BITMAP(vm_ev_mask, KVMI_NUM_EVENTS);
> >
> > +#define SINGLE_STEP_MAX_DEPTH 8
> > +	struct {
> > +		gfn_t gfn;
> > +		u8 old_access;
> > +		u32 old_write_bitmap;
> > +	} ss_context[SINGLE_STEP_MAX_DEPTH];
> > +	u8 ss_level;
> > +	atomic_t ss_active;
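The `ss_context[]` array quoted above is essentially a small fixed-depth stack that records a page's original access bits before single-stepping relaxes them, so they can be restored afterwards. A hedged sketch of that idea, with simplified stand-in names and types (`ss_stack`, `uint64_t` for `gfn_t`, no `old_write_bitmap`), not the patch's actual code:

```c
#include <stdint.h>

/* Fixed-depth save/restore stack for per-page access bits, modeled on
 * the quoted ss_context[]/ss_level fields. Hypothetical sketch. */
#define SINGLE_STEP_MAX_DEPTH 8

struct ss_ctx_entry {
	uint64_t gfn;        /* guest frame whose access was relaxed */
	uint8_t  old_access; /* access bits to restore afterwards    */
};

struct ss_stack {
	struct ss_ctx_entry ctx[SINGLE_STEP_MAX_DEPTH];
	uint8_t level;
};

/* Record a page's original access before relaxing it.
 * Returns 0 on success, -1 when the depth limit is hit. */
static int ss_push(struct ss_stack *s, uint64_t gfn, uint8_t old_access)
{
	if (s->level >= SINGLE_STEP_MAX_DEPTH)
		return -1;
	s->ctx[s->level].gfn = gfn;
	s->ctx[s->level].old_access = old_access;
	s->level++;
	return 0;
}

/* Pop the most recent entry so its access can be restored.
 * Returns 0 on success, -1 if the stack is empty. */
static int ss_pop(struct ss_stack *s, uint64_t *gfn, uint8_t *old_access)
{
	if (s->level == 0)
		return -1;
	s->level--;
	*gfn = s->ctx[s->level].gfn;
	*old_access = s->ctx[s->level].old_access;
	return 0;
}
```

The race Adalbert describes follows directly from this structure: if two vCPUs pushed and popped entries for the same gfn concurrently, one vCPU's restore would undo the other's relaxation, hence the single-owner (`ss_owner`) restriction.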
Re: [RFC PATCH v6 64/92] kvm: introspection: add single-stepping
On Fri, Aug 09, 2019 at 07:00:19PM +0300, Adalbert Lazăr wrote:
> From: Nicușor Cîțu
>
> This would be used either if the introspection tool requests it as a
> reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
> be handled by the x86 emulator during the handling of a VMEXIT. In
> these situations, all other vCPU-s are kicked and held, the EPT-based
> protection is removed and the guest is single-stepped by the vCPU that
> triggered the initial VMEXIT. Upon completion, the EPT-based protection
> is reinstalled and all vCPU-s are allowed to return to the guest.
>
> This is a rather slow workaround that kicks in occasionally. In the
> future, the most frequently single-stepped instructions should be added
> to the emulator (usually, stores to and from memory - SSE/AVX).
>
> For the moment it works only on Intel.
>
> CC: Jim Mattson
> CC: Sean Christopherson
> CC: Joerg Roedel
> Signed-off-by: Nicușor Cîțu
> Co-developed-by: Mihai Donțu
> Signed-off-by: Mihai Donțu
> Co-developed-by: Adalbert Lazăr
> Signed-off-by: Adalbert Lazăr
> ---
>  arch/x86/include/asm/kvm_host.h |   3 +
>  arch/x86/kvm/kvmi.c             |  47 ++-
>  arch/x86/kvm/svm.c              |   5 ++
>  arch/x86/kvm/vmx/vmx.c          |  17
>  arch/x86/kvm/x86.c              |  19 +
>  include/linux/kvmi.h            |   4 +
>  virt/kvm/kvmi.c                 | 145 +++-
>  virt/kvm/kvmi_int.h             |  16
>  8 files changed, 253 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index ad36a5fc2048..60e2c298d469 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1016,6 +1016,7 @@ struct kvm_x86_ops {
>  	void (*msr_intercept)(struct kvm_vcpu *vcpu, unsigned int msr,
>  			      bool enable);
>  	bool (*desc_intercept)(struct kvm_vcpu *vcpu, bool enable);
> +	void (*set_mtf)(struct kvm_vcpu *vcpu, bool enable);

MTF is a VMX specific implementation of single-stepping, this should be
enable_single_step() or something along those lines.  For example, I
assume SVM could implement something that is mostly functional via
RFLAGS.TF.

>  	void (*cr3_write_exiting)(struct kvm_vcpu *vcpu, bool enable);
>  	bool (*nested_pagefault)(struct kvm_vcpu *vcpu);
>  	bool (*spt_fault)(struct kvm_vcpu *vcpu);
> @@ -1628,6 +1629,8 @@ void kvm_arch_msr_intercept(struct kvm_vcpu *vcpu, unsigned int msr,
>  			    bool enable);
>  bool kvm_mmu_nested_pagefault(struct kvm_vcpu *vcpu);
>  bool kvm_spt_fault(struct kvm_vcpu *vcpu);
> +void kvm_set_mtf(struct kvm_vcpu *vcpu, bool enable);
> +void kvm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);
>  void kvm_control_cr3_write_exiting(struct kvm_vcpu *vcpu, bool enable);
>
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c
> index 04cac5b8a4d0..f0ab4bd9eb37 100644
> --- a/arch/x86/kvm/kvmi.c
> +++ b/arch/x86/kvm/kvmi.c
> @@ -520,7 +520,6 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
>  	u32 ctx_size;
>  	u64 ctx_addr;
>  	u32 action;
> -	bool singlestep_ignored;
>  	bool ret = false;
>
>  	if (!kvm_spt_fault(vcpu))
> @@ -533,7 +532,7 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
>  	if (ivcpu->effective_rep_complete)
>  		return true;
>
> -	action = kvmi_msg_send_pf(vcpu, gpa, gva, access, &singlestep_ignored,
> +	action = kvmi_msg_send_pf(vcpu, gpa, gva, access, &ivcpu->ss_requested,
>  				  &ivcpu->rep_complete, &ctx_addr,
>  				  ivcpu->ctx_data, &ctx_size);
>
> @@ -547,6 +546,8 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
>  		ret = true;
>  		break;
>  	case KVMI_EVENT_ACTION_RETRY:
> +		if (ivcpu->ss_requested && !kvmi_start_ss(vcpu, gpa, access))
> +			ret = true;
>  		break;
>  	default:
>  		kvmi_handle_common_event_actions(vcpu, action, "PF");
> @@ -758,6 +759,48 @@ int kvmi_arch_cmd_control_cr(struct kvm_vcpu *vcpu,
>  	return 0;
>  }
>
> +void kvmi_arch_start_single_step(struct kvm_vcpu *vcpu)
> +{
> +	kvm_set_mtf(vcpu, true);
> +
> +	/*
> +	 * Set block by STI only if the RFLAGS.IF = 1.
> +	 * Blocking by both STI and MOV/POP SS is not possible.
> +	 */
> +	if (kvm_arch_interrupt_allowed(vcpu))
> +		kvm_set_interrupt_shadow(vcpu, KVM_X86_SHADOW_INT_STI);

This is wrong, the STI shadow only exists if interrupts were unblocked
prior to STI.  I'm guessing this is a hack to workaround
kvmi_arch_stop_single_step() not properly handling the clearing case.

> +
> +}
> +
> +void kvmi_arch_stop_single_step(struct kvm_vcpu *vcpu)
> +{
> +	kvm_set_mtf(vcpu, false);
> +	/*
> +	 * The blocking by STI is cleared after the guest
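Sean's objection rests on the architectural rule that an STI shadow arises only when STI actually transitions RFLAGS.IF from 0 to 1, and lasts for exactly one instruction. A toy model of that rule (hypothetical names, not KVM's implementation, and ignoring the MOV/POP SS shadow entirely):

```c
#include <stdbool.h>

/* Toy model of STI interrupt-shadow semantics: the shadow arms only
 * when STI flips IF from 0 to 1, and expires after one instruction. */
struct cpu_model {
	bool rflags_if;  /* interrupts enabled      */
	bool sti_shadow; /* blocking-by-STI active  */
};

static void exec_sti(struct cpu_model *c)
{
	/* STI with IF already set creates no shadow. */
	c->sti_shadow = !c->rflags_if;
	c->rflags_if = true;
}

/* Retiring the next instruction clears the shadow. */
static void retire_insn(struct cpu_model *c)
{
	c->sti_shadow = false;
}
```

Under this model, the patch's `if (kvm_arch_interrupt_allowed(vcpu))` guard sets a shadow precisely in the state (IF already 1) where hardware would never create one, which is why it reads as a workaround rather than a faithful use of the interruptibility field.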
[RFC PATCH v6 64/92] kvm: introspection: add single-stepping
From: Nicușor Cîțu

This would be used either if the introspection tool requests it as a
reply to a KVMI_EVENT_PF event or to cope with instructions that cannot
be handled by the x86 emulator during the handling of a VMEXIT. In
these situations, all other vCPU-s are kicked and held, the EPT-based
protection is removed and the guest is single-stepped by the vCPU that
triggered the initial VMEXIT. Upon completion, the EPT-based protection
is reinstalled and all vCPU-s are allowed to return to the guest.

This is a rather slow workaround that kicks in occasionally. In the
future, the most frequently single-stepped instructions should be added
to the emulator (usually, stores to and from memory - SSE/AVX).

For the moment it works only on Intel.

CC: Jim Mattson
CC: Sean Christopherson
CC: Joerg Roedel
Signed-off-by: Nicușor Cîțu
Co-developed-by: Mihai Donțu
Signed-off-by: Mihai Donțu
Co-developed-by: Adalbert Lazăr
Signed-off-by: Adalbert Lazăr
---
 arch/x86/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/kvmi.c             |  47 ++-
 arch/x86/kvm/svm.c              |   5 ++
 arch/x86/kvm/vmx/vmx.c          |  17
 arch/x86/kvm/x86.c              |  19 +
 include/linux/kvmi.h            |   4 +
 virt/kvm/kvmi.c                 | 145 +++-
 virt/kvm/kvmi_int.h             |  16
 8 files changed, 253 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ad36a5fc2048..60e2c298d469 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1016,6 +1016,7 @@ struct kvm_x86_ops {
 	void (*msr_intercept)(struct kvm_vcpu *vcpu, unsigned int msr,
 			      bool enable);
 	bool (*desc_intercept)(struct kvm_vcpu *vcpu, bool enable);
+	void (*set_mtf)(struct kvm_vcpu *vcpu, bool enable);
 	void (*cr3_write_exiting)(struct kvm_vcpu *vcpu, bool enable);
 	bool (*nested_pagefault)(struct kvm_vcpu *vcpu);
 	bool (*spt_fault)(struct kvm_vcpu *vcpu);
@@ -1628,6 +1629,8 @@ void kvm_arch_msr_intercept(struct kvm_vcpu *vcpu, unsigned int msr,
 			    bool enable);
 bool kvm_mmu_nested_pagefault(struct kvm_vcpu *vcpu);
 bool kvm_spt_fault(struct kvm_vcpu *vcpu);
+void kvm_set_mtf(struct kvm_vcpu *vcpu, bool enable);
+void kvm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);
 void kvm_control_cr3_write_exiting(struct kvm_vcpu *vcpu, bool enable);
 
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/kvmi.c b/arch/x86/kvm/kvmi.c
index 04cac5b8a4d0..f0ab4bd9eb37 100644
--- a/arch/x86/kvm/kvmi.c
+++ b/arch/x86/kvm/kvmi.c
@@ -520,7 +520,6 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
 	u32 ctx_size;
 	u64 ctx_addr;
 	u32 action;
-	bool singlestep_ignored;
 	bool ret = false;
 
 	if (!kvm_spt_fault(vcpu))
@@ -533,7 +532,7 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
 	if (ivcpu->effective_rep_complete)
 		return true;
 
-	action = kvmi_msg_send_pf(vcpu, gpa, gva, access, &singlestep_ignored,
+	action = kvmi_msg_send_pf(vcpu, gpa, gva, access, &ivcpu->ss_requested,
 				  &ivcpu->rep_complete, &ctx_addr,
 				  ivcpu->ctx_data, &ctx_size);
 
@@ -547,6 +546,8 @@ bool kvmi_arch_pf_event(struct kvm_vcpu *vcpu, gpa_t gpa, gva_t gva,
 		ret = true;
 		break;
 	case KVMI_EVENT_ACTION_RETRY:
+		if (ivcpu->ss_requested && !kvmi_start_ss(vcpu, gpa, access))
+			ret = true;
 		break;
 	default:
 		kvmi_handle_common_event_actions(vcpu, action, "PF");
@@ -758,6 +759,48 @@ int kvmi_arch_cmd_control_cr(struct kvm_vcpu *vcpu,
 	return 0;
 }
 
+void kvmi_arch_start_single_step(struct kvm_vcpu *vcpu)
+{
+	kvm_set_mtf(vcpu, true);
+
+	/*
+	 * Set block by STI only if the RFLAGS.IF = 1.
+	 * Blocking by both STI and MOV/POP SS is not possible.
+	 */
+	if (kvm_arch_interrupt_allowed(vcpu))
+		kvm_set_interrupt_shadow(vcpu, KVM_X86_SHADOW_INT_STI);
+
+}
+
+void kvmi_arch_stop_single_step(struct kvm_vcpu *vcpu)
+{
+	kvm_set_mtf(vcpu, false);
+	/*
+	 * The blocking by STI is cleared after the guest
+	 * executes one instruction or incurs an exception.
+	 * However we might stop the SS before entering the guest,
+	 * so be sure we are clearing the STI blocking.
+	 */
+	kvm_set_interrupt_shadow(vcpu, 0);
+}
+
+u8 kvmi_arch_relax_page_access(u8 old, u8 new)
+{
+	u8 ret = old | new;
+
+	/*
+	 * An SPTE entry with just the -wx bits set can trigger a
+	 * misconfiguration error from the hardware, as it's the case
+	 * for x86 where this access mode is used to mark I/O memory.
+	 * Thus, we make sure that -wx accesses are translated to rwx.
+	 */
+
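The body of `kvmi_arch_relax_page_access()` is truncated above, but the comment fully states the intended rule: merge the old and new access bits, then promote any combination lacking read but having write or execute to rwx. A standalone sketch of that rule, with illustrative bit values (the real KVMI access-flag encodings may differ):

```c
#include <stdint.h>

/* Illustrative access-bit encodings; assumptions, not KVM's values. */
#define PAGE_ACCESS_R 0x1
#define PAGE_ACCESS_W 0x2
#define PAGE_ACCESS_X 0x4

/* Merge two access masks, promoting read-less results to include read,
 * since an SPTE with write or exec set but read clear can trigger an
 * EPT misconfiguration (per the truncated comment above). */
static uint8_t relax_page_access(uint8_t old, uint8_t new)
{
	uint8_t ret = old | new;

	if (ret & (PAGE_ACCESS_W | PAGE_ACCESS_X))
		ret |= PAGE_ACCESS_R;

	return ret;
}
```

Note this sketch adds read whenever write or execute is present, which is slightly broader than the literal "-wx" case the comment names; the safe direction is the same either way, since relaxation only ever grants access.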