Re: [RFC PATCH v2 3/3] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Sat, Apr 07, 2018 at 10:54:22AM +0100, Marc Zyngier wrote: > On Fri, 06 Apr 2018 16:01:04 +0100, > Dave Martin wrote: > > Hi Dave, > > > > > This patch refactors KVM to align the host and guest FPSIMD > > save/restore logic with each other for arm64. This reduces the > > number of redundant save/restore operations that must occur, and > > reduces the common-case IRQ blackout time during guest exit storms > > by saving the host state lazily and optimising away the need to > > restore the host state before returning to the run loop. > > > > Four hooks are defined in order to enable this: > > > > * kvm_arch_vcpu_run_map_fp(): > >Called on PID change to map necessary bits of current to Hyp. > > > > * kvm_arch_vcpu_load_fp(): > >Set up FP/SIMD for entering the KVM run loop (parse as > >"vcpu_load fp"). > > > > * kvm_arch_vcpu_park_fp(): > >Get FP/SIMD into a safe state for re-enabling interrupts after a > >guest exit back to the run loop. > > It would be good if you could outline what this hook does, and what > "safe state" means in this context. [...] > > diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c > > index d3af3f4..cf0f457 100644 > > --- a/virt/kvm/arm/arm.c > > +++ b/virt/kvm/arm/arm.c > > @@ -362,10 +362,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int > > cpu) > > kvm_arm_set_running_vcpu(vcpu); > > kvm_vgic_load(vcpu); > > kvm_timer_vcpu_load(vcpu); > > + kvm_arch_vcpu_load_fp(vcpu); > > Can't find this function. [...] > > @@ -772,6 +774,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, > > struct kvm_run *run) > > if (static_branch_unlikely(_irqchip_in_use)) > > kvm_timer_sync_hwstate(vcpu); > > > > + kvm_arch_vcpu_park_fp(vcpu); > > This isn't defined either. I have the feeling that you've missed a > "git add" at some point. > > > + > > /* > > * We may have taken a host interrupt in HYP mode (ie > > * while executing the guest). This interrupt is still > > @@ -816,16 +820,6 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, > > struct kvm_run *run) > > return ret; > > } [...] > The structure seems quite sensible, and I'm looking forward to seeing > the missing bits. Also, it looks like this was done on top of > 4.16. I'm afraid 4.17 is going to bring a number of conflicts, but > nothing that should have any material impact. Hmmm, looks like I dropped an added file when flattening my draft series. See separate repost -- sorry about that. I've kept it on v4.16 just while things stabilise. Cheers ---Dave ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH v2 3/3] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Mon, Apr 09, 2018 at 11:48:18AM +0200, Christoffer Dall wrote: [...] > > diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c > > index 8605e04..797b259 100644 > > --- a/arch/arm64/kvm/hyp/switch.c > > +++ b/arch/arm64/kvm/hyp/switch.c > > @@ -27,6 +27,7 @@ > > #include > > #include > > #include > > +#include > > > > static bool __hyp_text __fpsimd_enabled_nvhe(void) > > { > > @@ -47,24 +48,40 @@ bool __hyp_text __fpsimd_enabled(void) > > return __fpsimd_is_enabled()(); > > } > > > > -static void __hyp_text __activate_traps_vhe(void) > > +static bool update_fp_enabled(struct kvm_vcpu *vcpu) > > +{ > > + if (vcpu->arch.host_thread_info->flags & _TIF_FOREIGN_FPSTATE) { > > + vcpu->arch.host_fpsimd_state = NULL; > > I can't see where host_fpsimd_state gets set to anything else than NULL, > what am I missing? It's set in kvm_arch_vcpu_load_fp(). Once we enter the run loop the pointer can only be changed to NULL, because the host state only needs to be saved once. Cheers ---Dave ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH v2 3/3] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Fri, Apr 06, 2018 at 04:01:04PM +0100, Dave Martin wrote: > This patch refactors KVM to align the host and guest FPSIMD > save/restore logic with each other for arm64. This reduces the > number of redundant save/restore operations that must occur, and > reduces the common-case IRQ blackout time during guest exit storms > by saving the host state lazily and optimising away the need to > restore the host state before returning to the run loop. > > Four hooks are defined in order to enable this: > > * kvm_arch_vcpu_run_map_fp(): >Called on PID change to map necessary bits of current to Hyp. > > * kvm_arch_vcpu_load_fp(): >Set up FP/SIMD for entering the KVM run loop (parse as >"vcpu_load fp"). > > * kvm_arch_vcpu_park_fp(): >Get FP/SIMD into a safe state for re-enabling interrupts after a >guest exit back to the run loop. > > * kvm_arch_vcpu_put_fp(): >Save guest FP/SIMD state back to memory and dissociate from the >CPU ("vcpu_put fp"). > > Also, the arm64 FPSIMD context switch code is updated to enable it > to save back FPSIMD state for a vcpu, not just current. A few > helpers drive this: > > * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): >mark this CPU as having context fp (which may belong to a vcpu) >currently loaded in its registers. This is the non-task >equivalent of the static function fpsimd_bind_to_cpu() in >fpsimd.c. > > * task_fpsimd_save(): >exported to allow KVM to save the guest's FPSIMD state back to >memory on exit from the run loop. > > * fpsimd_flush_state(): >invalidate any context's FPSIMD state that is currently loaded. >Used to disassociate the vcpu from the CPU regs on run loop exit. > > These changes allow the run loop to enable interrupts (and thus > softirqs that may use kernel-mode NEON) without having to save the > guest's FPSIMD state eagerly. > > Some new vcpu_arch fields are added to make all this work. Because > host FPSIMD state can now be saved back directly into current's > thread_struct as appropriate, host_cpu_context is no longer used > for preserving the FPSIMD state. However, it is still needed for > preserving other things such as the host's system registers. To > avoid ABI churn, the redundant storage space in host_cpu_context is > not removed for now. > > arch/arm is not addressed by this patch and continues to use its > current save/restore logic. It could provide implementations of > the helpers later if desired. > > Signed-off-by: Dave Martin> --- > arch/arm/include/asm/kvm_host.h | 8 +++ > arch/arm64/include/asm/fpsimd.h | 5 + > arch/arm64/include/asm/kvm_host.h | 18 +++ > arch/arm64/kernel/fpsimd.c| 31 -- > arch/arm64/kvm/Makefile | 2 +- > arch/arm64/kvm/hyp/switch.c | 46 > --- > virt/kvm/arm/arm.c| 14 > 7 files changed, 89 insertions(+), 35 deletions(-) > > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > index 248b930..11cd64a3 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > struct kvm_device_attr *attr); > > +/* > + * VFP/NEON switching is all done by the hyp switch code, so no need to > + * coordinate with host context handling for this state: > + */ > +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_park_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {} > + > /* All host FP/SIMD state is restored on guest exit, so nothing to save: */ > static inline void kvm_fpsimd_flush_cpu_state(void) {} > > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h > index 1bfc920..dbe7340 100644 > --- a/arch/arm64/include/asm/fpsimd.h > +++ b/arch/arm64/include/asm/fpsimd.h > @@ -40,6 +40,8 @@ struct task_struct; > extern void fpsimd_save_state(struct user_fpsimd_state *state); > extern void fpsimd_load_state(struct user_fpsimd_state *state); > > +extern void task_fpsimd_save(void); > + > extern void fpsimd_thread_switch(struct task_struct *next); > extern void fpsimd_flush_thread(void); > > @@ -48,7 +50,10 @@ extern void fpsimd_preserve_current_state(void); > extern void fpsimd_restore_current_state(void); > extern void fpsimd_update_current_state(struct user_fpsimd_state const > *state); > > +extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state); > + > extern void fpsimd_flush_task_state(struct task_struct *target); > +extern void fpsimd_flush_cpu_state(void); > extern void sve_flush_cpu_state(void); > > /* Maximum VL that SVE VL-agnostic software can transparently support */ > diff --git
Re: [RFC PATCH v2 3/3] KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
On Fri, 06 Apr 2018 16:01:04 +0100, Dave Martin wrote: Hi Dave, > > This patch refactors KVM to align the host and guest FPSIMD > save/restore logic with each other for arm64. This reduces the > number of redundant save/restore operations that must occur, and > reduces the common-case IRQ blackout time during guest exit storms > by saving the host state lazily and optimising away the need to > restore the host state before returning to the run loop. > > Four hooks are defined in order to enable this: > > * kvm_arch_vcpu_run_map_fp(): >Called on PID change to map necessary bits of current to Hyp. > > * kvm_arch_vcpu_load_fp(): >Set up FP/SIMD for entering the KVM run loop (parse as >"vcpu_load fp"). > > * kvm_arch_vcpu_park_fp(): >Get FP/SIMD into a safe state for re-enabling interrupts after a >guest exit back to the run loop. It would be good if you could outline what this hook does, and what "safe state" means in this context. > > * kvm_arch_vcpu_put_fp(): >Save guest FP/SIMD state back to memory and dissociate from the >CPU ("vcpu_put fp"). > > Also, the arm64 FPSIMD context switch code is updated to enable it > to save back FPSIMD state for a vcpu, not just current. A few > helpers drive this: > > * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): >mark this CPU as having context fp (which may belong to a vcpu) >currently loaded in its registers. This is the non-task >equivalent of the static function fpsimd_bind_to_cpu() in >fpsimd.c. > > * task_fpsimd_save(): >exported to allow KVM to save the guest's FPSIMD state back to >memory on exit from the run loop. > > * fpsimd_flush_state(): >invalidate any context's FPSIMD state that is currently loaded. >Used to disassociate the vcpu from the CPU regs on run loop exit. > > These changes allow the run loop to enable interrupts (and thus > softirqs that may use kernel-mode NEON) without having to save the > guest's FPSIMD state eagerly. > > Some new vcpu_arch fields are added to make all this work. Because > host FPSIMD state can now be saved back directly into current's > thread_struct as appropriate, host_cpu_context is no longer used > for preserving the FPSIMD state. However, it is still needed for > preserving other things such as the host's system registers. To > avoid ABI churn, the redundant storage space in host_cpu_context is > not removed for now. > > arch/arm is not addressed by this patch and continues to use its > current save/restore logic. It could provide implementations of > the helpers later if desired. > > Signed-off-by: Dave Martin> --- > arch/arm/include/asm/kvm_host.h | 8 +++ > arch/arm64/include/asm/fpsimd.h | 5 + > arch/arm64/include/asm/kvm_host.h | 18 +++ > arch/arm64/kernel/fpsimd.c| 31 -- > arch/arm64/kvm/Makefile | 2 +- > arch/arm64/kvm/hyp/switch.c | 46 > --- > virt/kvm/arm/arm.c| 14 > 7 files changed, 89 insertions(+), 35 deletions(-) > > diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h > index 248b930..11cd64a3 100644 > --- a/arch/arm/include/asm/kvm_host.h > +++ b/arch/arm/include/asm/kvm_host.h > @@ -303,6 +303,14 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu, > int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu, > struct kvm_device_attr *attr); > > +/* > + * VFP/NEON switching is all done by the hyp switch code, so no need to > + * coordinate with host context handling for this state: > + */ > +static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_park_fp(struct kvm_vcpu *vcpu) {} > +static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {} > + > /* All host FP/SIMD state is restored on guest exit, so nothing to save: */ > static inline void kvm_fpsimd_flush_cpu_state(void) {} > > diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h > index 1bfc920..dbe7340 100644 > --- a/arch/arm64/include/asm/fpsimd.h > +++ b/arch/arm64/include/asm/fpsimd.h > @@ -40,6 +40,8 @@ struct task_struct; > extern void fpsimd_save_state(struct user_fpsimd_state *state); > extern void fpsimd_load_state(struct user_fpsimd_state *state); > > +extern void task_fpsimd_save(void); > + > extern void fpsimd_thread_switch(struct task_struct *next); > extern void fpsimd_flush_thread(void); > > @@ -48,7 +50,10 @@ extern void fpsimd_preserve_current_state(void); > extern void fpsimd_restore_current_state(void); > extern void fpsimd_update_current_state(struct user_fpsimd_state const > *state); > > +extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state); > + > extern void fpsimd_flush_task_state(struct task_struct *target); > +extern void fpsimd_flush_cpu_state(void); > extern void