Re: kvm_read_guest_page() missing kvm->srcu read lock?
On 10/05/2018 19:41, Andre Przywara wrote:
> Hi,
>
> Jan posted a lockdep splat complaining about a suspicious
> rcu_dereference_check:
> https://lists.cs.columbia.edu/pipermail/kvmarm/2018-May/031116.html
>
> The gist of that is:
> ...
> [ 1025.695517] dump_stack+0x9c/0xd4
> [ 1025.695524] lockdep_rcu_suspicious+0xcc/0x118
> [ 1025.695537] gfn_to_memslot+0x174/0x190
> [ 1025.695546] kvm_read_guest+0x50/0xb0
> [ 1025.695553] vgic_its_check_id.isra.0+0x114/0x148
> ...
> I chased that down and wonder if kvm_read_guest{,_page} is supposed to
> be called inside a kvm->srcu critical section?
>
> We have a check that suggests that eventually someone needs to enter the
> SRCU critical section:
>
> static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm,
>                                                   int as_id)
> {
>         return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu,
>                         lockdep_is_held(&kvm->slots_lock) ||
>                         !refcount_read(&kvm->users_count));
> }
>
> If I get this correctly, this means that for accessing kvm->memslots we
> either need to be inside an SRCU critical section or hold the
> kvm->slots_lock (for updates only).
>
> If I am not mistaken, it is not necessary for *callers* of
> kvm_read_guest_page() to do this, as this could be entirely contained
> inside this function - since we only use the reference to the memslot
> entry while doing the copy_from_user(), and the data is safe afterwards
> from an RCU point of view because it has been *copied*.

Yes, it's the caller's responsibility. srcu_read_lock/unlock is pretty
expensive, so KVM assumes that the topmost callers do it.

Paolo
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
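[Editorial aside] Paolo's point, that the *topmost caller* brackets guest-memory accesses with the SRCU read lock and that the copied data stays valid after unlock, can be sketched in plain userspace C. Everything below is a mock: the srcu_* helpers and kvm_read_guest() are trivial stand-ins for the kernel primitives, and "guest memory" is just a buffer.

```c
#include <assert.h>
#include <string.h>

static int srcu_depth;                 /* mock: tracks read-side nesting */

static int srcu_read_lock(void)        /* stand-in for kernel srcu_read_lock() */
{
    return srcu_depth++;
}

static void srcu_read_unlock(int idx)  /* stand-in for srcu_read_unlock() */
{
    (void)idx;
    srcu_depth--;
}

static char guest_mem[16] = "guest-data";   /* pretend memslot backing */

/* Mock kvm_read_guest(): only legal inside an SRCU read-side section. */
static int kvm_read_guest(void *dst, size_t len)
{
    assert(srcu_depth > 0);            /* the lockdep splat, in miniature */
    memcpy(dst, guest_mem, len);       /* the copy happens under the lock */
    return 0;
}

/* The caller-side pattern Paolo describes: lock, copy, unlock. */
int demo(void)
{
    char buf[16];
    int idx = srcu_read_lock();        /* topmost caller takes the lock... */
    kvm_read_guest(buf, sizeof(buf));
    srcu_read_unlock(idx);             /* ...and drops it after the copy */

    /* The *copied* data stays valid after unlock, as Andre notes. */
    return strcmp(buf, "guest-data");
}
```

In the real kernel the equivalent caller-side shape is `idx = srcu_read_lock(&kvm->srcu); ... kvm_read_guest(...); srcu_read_unlock(&kvm->srcu, idx);`.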
kvm_read_guest_page() missing kvm->srcu read lock?
Hi,

Jan posted a lockdep splat complaining about a suspicious
rcu_dereference_check:
https://lists.cs.columbia.edu/pipermail/kvmarm/2018-May/031116.html

The gist of that is:
...
[ 1025.695517] dump_stack+0x9c/0xd4
[ 1025.695524] lockdep_rcu_suspicious+0xcc/0x118
[ 1025.695537] gfn_to_memslot+0x174/0x190
[ 1025.695546] kvm_read_guest+0x50/0xb0
[ 1025.695553] vgic_its_check_id.isra.0+0x114/0x148
...
I chased that down and wonder if kvm_read_guest{,_page} is supposed to
be called inside a kvm->srcu critical section?

We have a check that suggests that eventually someone needs to enter the
SRCU critical section:

static inline struct kvm_memslots *__kvm_memslots(struct kvm *kvm,
                                                  int as_id)
{
        return srcu_dereference_check(kvm->memslots[as_id], &kvm->srcu,
                        lockdep_is_held(&kvm->slots_lock) ||
                        !refcount_read(&kvm->users_count));
}

If I get this correctly, this means that for accessing kvm->memslots we
either need to be inside an SRCU critical section or hold the
kvm->slots_lock (for updates only).

If I am not mistaken, it is not necessary for *callers* of
kvm_read_guest_page() to do this, as this could be entirely contained
inside this function - since we only use the reference to the memslot
entry while doing the copy_from_user(), and the data is safe afterwards
from an RCU point of view because it has been *copied*.

Does that sound correct? Has anyone run a lockdep enabled kernel with
PROVE_RCU before and stumbled upon this message? I am a bit doubtful
that this is valid, because this is generic KVM code and should trigger
with every kvm_read_guest() user.

With my limited understanding I would say that we just need to have a
srcu_read_lock() call at the beginning of kvm_read_guest_page() and the
corresponding srcu_read_unlock() call at the end of that function. Is
that enough?

Cheers,
Andre.
Re: [PATCH v7 13/16] KVM: arm64: Remove eager host SVE state saving
On Wed, May 09, 2018 at 05:13:02PM +0100, Dave P Martin wrote:
> Now that the host SVE context can be saved on demand from Hyp,
> there is no longer any need to save this state in advance before
> entering the guest.
>
> This patch removes the relevant call to
> kvm_fpsimd_flush_cpu_state().
>
> Since the problem that function was intended to solve now no longer
> exists, the function and its dependencies are also deleted.
>
> Signed-off-by: Dave Martin
> Acked-by: Christoffer Dall
> Acked-by: Marc Zyngier

Acked-by: Catalin Marinas
Re: [PATCH v7 12/16] KVM: arm64: Save host SVE context as appropriate
On Wed, May 09, 2018 at 05:13:01PM +0100, Dave P Martin wrote:
> This patch adds SVE context saving to the hyp FPSIMD context switch
> path. This means that it is no longer necessary to save the host
> SVE state in advance of entering the guest, when in use.
>
> In order to avoid adding pointless complexity to the code, VHE is
> assumed if SVE is in use. VHE is an architectural prerequisite for
> SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in
> kernels that support both SVE and KVM.
>
> Historically, software models exist that can expose the
> architecturally invalid configuration of SVE without VHE, so if
> this situation is detected at kvm_init() time then KVM will be
> disabled.
>
> Signed-off-by: Dave Martin
> Reviewed-by: Christoffer Dall
> Acked-by: Marc Zyngier
> ---
>  arch/arm64/Kconfig          |  7 +++
>  arch/arm64/kvm/fpsimd.c     |  1 -
>  arch/arm64/kvm/hyp/switch.c | 20 +++-
>  virt/kvm/arm/arm.c          | 18 ++
>  4 files changed, 44 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index eb2cf49..b0d3820 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1130,6 +1130,7 @@ endmenu
>  config ARM64_SVE
>  	bool "ARM Scalable Vector Extension support"
>  	default y
> +	depends on !KVM || ARM64_VHE
>  	help
>  	  The Scalable Vector Extension (SVE) is an extension to the AArch64
>  	  execution state which complements and extends the SIMD functionality
> @@ -1155,6 +1156,12 @@ config ARM64_SVE
>  	  booting the kernel. If unsure and you are not observing these
>  	  symptoms, you should assume that it is safe to say Y.
>
> +	  CPUs that support SVE are architecturally required to support the
> +	  Virtualization Host Extensions (VHE), so the kernel makes no
> +	  provision for supporting SVE alongside KVM without VHE enabled.
> +	  Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
> +	  KVM in the same kernel image.

Acked-by: Catalin Marinas
Re: [PATCH v7 08/16] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Wed, May 09, 2018 at 05:12:57PM +0100, Dave P Martin wrote:
> This patch refactors KVM to align the host and guest FPSIMD
> save/restore logic with each other for arm64. This reduces the
> number of redundant save/restore operations that must occur, and
> reduces the common-case IRQ blackout time during guest exit storms
> by saving the host state lazily and optimising away the need to
> restore the host state before returning to the run loop.
>
> Four hooks are defined in order to enable this:
>
> * kvm_arch_vcpu_run_map_fp():
>   Called on PID change to map necessary bits of current to Hyp.
>
> * kvm_arch_vcpu_load_fp():
>   Set up FP/SIMD for entering the KVM run loop (parse as
>   "vcpu_load fp").
>
> * kvm_arch_vcpu_ctxsync_fp():
>   Get FP/SIMD into a safe state for re-enabling interrupts after a
>   guest exit back to the run loop.
>
>   For arm64 specifically, this involves updating the host kernel's
>   FPSIMD context tracking metadata so that kernel-mode NEON use
>   will cause the vcpu's FPSIMD state to be saved back correctly
>   into the vcpu struct. This must be done before re-enabling
>   interrupts because kernel-mode NEON may be used by softirqs.
>
> * kvm_arch_vcpu_put_fp():
>   Save guest FP/SIMD state back to memory and dissociate from the
>   CPU ("vcpu_put fp").
>
> Also, the arm64 FPSIMD context switch code is updated to enable it
> to save back FPSIMD state for a vcpu, not just current. A few
> helpers drive this:
>
> * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp):
>   mark this CPU as having context fp (which may belong to a vcpu)
>   currently loaded in its registers. This is the non-task
>   equivalent of the static function fpsimd_bind_to_cpu() in
>   fpsimd.c.
>
> * task_fpsimd_save():
>   exported to allow KVM to save the guest's FPSIMD state back to
>   memory on exit from the run loop.
>
> * fpsimd_flush_state():
>   invalidate any context's FPSIMD state that is currently loaded.
>   Used to disassociate the vcpu from the CPU regs on run loop exit.
>
> These changes allow the run loop to enable interrupts (and thus
> softirqs that may use kernel-mode NEON) without having to save the
> guest's FPSIMD state eagerly.
>
> Some new vcpu_arch fields are added to make all this work. Because
> host FPSIMD state can now be saved back directly into current's
> thread_struct as appropriate, host_cpu_context is no longer used
> for preserving the FPSIMD state. However, it is still needed for
> preserving other things such as the host's system registers. To
> avoid ABI churn, the redundant storage space in host_cpu_context is
> not removed for now.
>
> arch/arm is not addressed by this patch and continues to use its
> current save/restore logic. It could provide implementations of
> the helpers later if desired.
>
> Signed-off-by: Dave Martin

Acked-by: Catalin Marinas
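[Editorial aside] As a rough intuition for the lazy scheme the hooks above implement, here is a self-contained userspace mock. The struct and function names are invented analogues, not kernel code: the point is that guest FP state is written back at most once per run-loop stint (at "vcpu_put" time, and only if the guest actually used FP/SIMD), instead of eagerly on every guest exit.

```c
#include <assert.h>
#include <stdbool.h>

struct mock_vcpu {
    bool fp_loaded;      /* guest FP state currently live in "registers" */
    int  saves;          /* how many times we wrote state back to memory */
};

static void vcpu_load_fp(struct mock_vcpu *v)
{
    v->fp_loaded = false;            /* nothing loaded until the guest traps */
}

static void guest_uses_fp(struct mock_vcpu *v)
{
    v->fp_loaded = true;             /* trap handler loads guest FP state */
}

static void vcpu_ctxsync_fp(struct mock_vcpu *v)
{
    /* After a guest exit: merely note that the live state belongs to
     * the vcpu, so a later kernel-mode NEON user would save it first.
     * Crucially, no copy happens here. */
    (void)v;
}

static void vcpu_put_fp(struct mock_vcpu *v)
{
    if (v->fp_loaded) {              /* save lazily: only if actually used */
        v->saves++;
        v->fp_loaded = false;
    }
}

/* Run `total` guest-exit iterations, the first `fp_using` of which
 * touch FP/SIMD; return how many save operations were needed. */
int run_iterations(struct mock_vcpu *v, int total, int fp_using)
{
    vcpu_load_fp(v);
    for (int i = 0; i < total; i++) {
        if (i < fp_using)
            guest_uses_fp(v);
        vcpu_ctxsync_fp(v);          /* safe point to re-enable interrupts */
    }
    vcpu_put_fp(v);
    return v->saves;
}
```

An eager scheme would pay one save per FP-using exit; the lazy one pays at most one per load/put cycle.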
Re: [PATCH] kvm: Change return type to vm_fault_t
On 18/04/2018 21:19, Souptick Joarder wrote:
> Use new return type vm_fault_t for fault handler. For
> now, this is just documenting that the function returns
> a VM_FAULT value rather than an errno. Once all instances
> are converted, vm_fault_t will become a distinct type.
>
> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
>
> Signed-off-by: Souptick Joarder
> Reviewed-by: Matthew Wilcox
> ---
>  arch/mips/kvm/mips.c       | 2 +-
>  arch/powerpc/kvm/powerpc.c | 2 +-
>  arch/s390/kvm/kvm-s390.c   | 2 +-
>  arch/x86/kvm/x86.c         | 2 +-
>  include/linux/kvm_host.h   | 2 +-
>  virt/kvm/arm/arm.c         | 2 +-
>  virt/kvm/kvm_main.c        | 2 +-
>  7 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index 2549fdd..03e0e0f 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -1076,7 +1076,7 @@ int kvm_arch_vcpu_ioctl_set_fpu(struct kvm_vcpu *vcpu, struct kvm_fpu *fpu)
>  	return -ENOIOCTLCMD;
>  }
>
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>  {
>  	return VM_FAULT_SIGBUS;
>  }
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 403e642..3099dee 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -1825,7 +1825,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	return r;
>  }
>
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>  {
>  	return VM_FAULT_SIGBUS;
>  }
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index ba4c709..24af487 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3941,7 +3941,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	return r;
>  }
>
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>  {
>  #ifdef CONFIG_KVM_S390_UCONTROL
>  	if ((vmf->pgoff == KVM_S390_SIE_PAGE_OFFSET)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c8a0b54..95d8102 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3827,7 +3827,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  	return r;
>  }
>
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>  {
>  	return VM_FAULT_SIGBUS;
>  }
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index ac0062b..8eeb062 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -736,7 +736,7 @@ long kvm_arch_dev_ioctl(struct file *filp,
>  			unsigned int ioctl, unsigned long arg);
>  long kvm_arch_vcpu_ioctl(struct file *filp,
>  			 unsigned int ioctl, unsigned long arg);
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf);
>
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
>
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 86941f6..6c8cc31 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -163,7 +163,7 @@ int kvm_arch_create_vcpu_debugfs(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>
> -int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
> +vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>  {
>  	return VM_FAULT_SIGBUS;
>  }
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 4501e65..45eb54b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2341,7 +2341,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_on_spin);
>
> -static int kvm_vcpu_fault(struct vm_fault *vmf)
> +static vm_fault_t kvm_vcpu_fault(struct vm_fault *vmf)
>  {
>  	struct kvm_vcpu *vcpu = vmf->vma->vm_file->private_data;
>  	struct page *page;
> --
> 1.9.1

Applied, thanks.

Paolo
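[Editorial aside] The motivation in the commit message (making the return type visibly a fault code rather than an errno, and eventually a distinct type that static checkers can enforce) can be illustrated with a minimal stand-alone sketch. The typedef and value below merely mirror the kernel's; they are local stand-ins, not the mm-header definitions.

```c
/* Stand-in for the kernel typedef: today just documentation, later a
 * distinct (bitwise) type that tools like sparse can check. */
typedef unsigned int vm_fault_t;

#define VM_FAULT_SIGBUS 0x0002u      /* value mirrors the kernel's */

/* A fault handler now visibly returns a fault code, not an errno:
 * the converted signature makes "return -EFAULT;" stand out as wrong. */
static vm_fault_t mock_vcpu_fault(void)
{
    return VM_FAULT_SIGBUS;
}
```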
Re: [PATCH v7 06/16] arm64/sve: Refactor user SVE trap maintenance for external use
On Wed, May 09, 2018 at 05:12:55PM +0100, Dave P Martin wrote:
> In preparation for optimising the way KVM manages switching the
> guest and host FPSIMD state, it is necessary to provide a means for
> code outside arch/arm64/kernel/fpsimd.c to restore the user trap
> configuration for SVE correctly for the current task.
>
> Rather than requiring external code to duplicate the maintenance
> explicitly, this patch moves the trap maintenance to
> fpsimd_bind_to_cpu(), since it is logically part of the work of
> associating the current task with the cpu.
>
> Because fpsimd_bind_to_cpu() is rather a cryptic name to publish
> alongside fpsimd_bind_state_to_cpu(), the former function is
> renamed to fpsimd_bind_task_to_cpu() to make its purpose more
> explicit.
>
> This patch makes appropriate changes to ensure that
> fpsimd_bind_task_to_cpu() is always called alongside
> task_fpsimd_load(), so that the trap maintenance continues to be
> done in every situation where it was done prior to this patch.
>
> As a side-effect, the metadata updates done by
> fpsimd_bind_task_to_cpu() now change from conditional to
> unconditional in the "already bound" case of sigreturn. This is
> harmless, and a couple of extra stores on this slow path will not
> impact performance. I consider this a reasonable price to pay for
> a slightly cleaner interface.
>
> Signed-off-by: Dave Martin

Acked-by: Catalin Marinas
Re: [PATCH v7 05/16] arm64: fpsimd: Generalise context saving for non-task contexts
On Wed, May 09, 2018 at 05:12:54PM +0100, Dave P Martin wrote:
> In preparation for allowing non-task (i.e., KVM vcpu) FPSIMD
> contexts to be handled by the fpsimd common code, this patch adapts
> task_fpsimd_save() to save back the currently loaded context,
> removing the explicit dependency on current.
>
> The relevant storage to write back to in memory is now found by
> examining the fpsimd_last_state percpu struct.
>
> fpsimd_save() does nothing unless TIF_FOREIGN_FPSTATE is clear, and
> fpsimd_last_state is updated under local_bh_disable() or
> local_irq_disable() everywhere that TIF_FOREIGN_FPSTATE is cleared:
> thus, fpsimd_save() will write back to the correct storage for the
> loaded context.
>
> No functional change.
>
> Signed-off-by: Dave Martin
> Acked-by: Marc Zyngier

Acked-by: Catalin Marinas
Re: [PATCH v7 02/16] arm64: Use update{,_tsk}_thread_flag()
On Wed, May 09, 2018 at 05:12:51PM +0100, Dave P Martin wrote:
> This patch uses the new update_thread_flag() helpers to simplify a
> couple of if () set; else clear; constructs.
>
> No functional change.
>
> Signed-off-by: Dave Martin
> Acked-by: Marc Zyngier
> Cc: Catalin Marinas
> Cc: Will Deacon

Acked-by: Catalin Marinas
Re: [PATCH v7 01/16] thread_info: Add update_thread_flag() helpers
On Wed, May 09, 2018 at 05:12:50PM +0100, Dave P Martin wrote:
> There are a number of bits of code sprinkled around the kernel to
> set a thread flag if a certain condition is true, and clear it
> otherwise.
>
> To help make those call sites terser and less cumbersome, this
> patch adds a new family of thread flag manipulators
>
> 	update*_thread_flag([...,] flag, cond)
>
> which do the equivalent of:
>
> 	if (cond)
> 		set*_thread_flag([...,] flag);
> 	else
> 		clear*_thread_flag([...,] flag);
>
> Signed-off-by: Dave Martin
> Acked-by: Steven Rostedt (VMware)
> Acked-by: Marc Zyngier
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Oleg Nesterov

FWIW,

Acked-by: Catalin Marinas
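[Editorial aside] The helper family described above is easy to model outside the kernel. The sketch below is a plain-C stand-in: the real kernel helpers operate on per-task thread-info flags with atomic bitops, whereas this mock just toggles bits in a plain word to show the set-if-cond-else-clear contract.

```c
#include <assert.h>
#include <stdbool.h>

static unsigned long thread_flags;   /* mock per-thread flag word */

static void set_flag(int flag)   { thread_flags |=  (1UL << flag); }
static void clear_flag(int flag) { thread_flags &= ~(1UL << flag); }

/* The helper: equivalent to
 *   if (cond) set_flag(flag); else clear_flag(flag);
 * collapsed into one call at each site. */
static void update_flag(int flag, bool cond)
{
    if (cond)
        set_flag(flag);
    else
        clear_flag(flag);
}
```

Call sites then shrink from a four-line conditional to `update_flag(TIF_FOO, cond);`, which is exactly the cleanup patch 02 applies on arm64.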
Re: [PATCH] kvm: Change return type to vm_fault_t
On Thu, Apr 19, 2018 at 7:26 PM, Cornelia Huck wrote:
> On Thu, 19 Apr 2018 00:49:58 +0530
> Souptick Joarder wrote:
>
>> Use new return type vm_fault_t for fault handler. For
>> now, this is just documenting that the function returns
>> a VM_FAULT value rather than an errno. Once all instances
>> are converted, vm_fault_t will become a distinct type.
>>
>> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
>>
>> Signed-off-by: Souptick Joarder
>> Reviewed-by: Matthew Wilcox
>> ---
>>  arch/mips/kvm/mips.c       | 2 +-
>>  arch/powerpc/kvm/powerpc.c | 2 +-
>>  arch/s390/kvm/kvm-s390.c   | 2 +-
>>  arch/x86/kvm/x86.c         | 2 +-
>>  include/linux/kvm_host.h   | 2 +-
>>  virt/kvm/arm/arm.c         | 2 +-
>>  virt/kvm/kvm_main.c        | 2 +-
>>  7 files changed, 7 insertions(+), 7 deletions(-)
>
> Reviewed-by: Cornelia Huck

If no further comment, we would like to get this patch in queue for 4.18.
[PATCHv2] ARM64: KVM: use lm_alias() for kvm_ksym_ref()
For historical reasons, we open-code lm_alias() in kvm_ksym_ref().

Let's use lm_alias() to avoid duplication and make things clearer.

As we have to pull this from <asm/memory.h> (which is not safe for
inclusion in assembly), we may as well move the kvm_ksym_ref()
definition into the existing !__ASSEMBLY__ block.

Signed-off-by: Mark Rutland
Cc: Christoffer Dall
Cc: Marc Zyngier
Cc: kvmarm@lists.cs.columbia.edu
---
 arch/arm64/include/asm/kvm_asm.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

Since v1 [1]:
* Rebase to v4.17-rc4
* Fix typo in commit message

Mark.

[1] https://lkml.kernel.org/r/20180406151909.57197-1-mark.rutl...@arm.com

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index f6648a3e4152..a9ceeec5a76f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -33,16 +33,19 @@
 #define KVM_ARM64_DEBUG_DIRTY_SHIFT	0
 #define KVM_ARM64_DEBUG_DIRTY		(1 << KVM_ARM64_DEBUG_DIRTY_SHIFT)
 
+#ifndef __ASSEMBLY__
+
+#include <asm/memory.h>
+
 /* Translate a kernel address of @sym into its equivalent linear mapping */
 #define kvm_ksym_ref(sym)						\
 	({								\
 		void *val = &sym;					\
 		if (!is_kernel_in_hyp_mode())				\
-			val = phys_to_virt((u64)&sym - kimage_voffset);	\
+			val = lm_alias(&sym);				\
 		val;							\
 	 })
 
-#ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
-- 
2.11.0
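[Editorial aside] What lm_alias() computes can be sketched in userspace arithmetic: subtract the image-to-linear offset (the kimage_voffset in the open-coded form above) from a kernel-image address to get a physical address, then rebase that into the linear map. All constants below are invented for illustration; only the arithmetic mirrors the expression being replaced.

```c
#include <stdint.h>

#define MOCK_PAGE_OFFSET    0xffff000000000000ull  /* linear-map base (invented) */
#define MOCK_PHYS_OFFSET    0x40000000ull          /* start of RAM (invented) */
#define MOCK_KIMAGE_VOFFSET 0xffff200000000000ull  /* image virt - phys (invented) */

static uint64_t mock_kimage_to_phys(uint64_t kaddr)
{
    return kaddr - MOCK_KIMAGE_VOFFSET;            /* image alias -> phys */
}

static uint64_t mock_phys_to_virt(uint64_t pa)
{
    return (pa - MOCK_PHYS_OFFSET) + MOCK_PAGE_OFFSET;  /* phys -> linear map */
}

/* The lm_alias() pattern: image-symbol address -> linear-map address. */
uint64_t mock_lm_alias(uint64_t kaddr)
{
    return mock_phys_to_virt(mock_kimage_to_phys(kaddr));
}
```

The patch's point is exactly that this two-step translation already exists as a named helper, so kvm_ksym_ref() need not spell it out.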