Re: [PATCH 2/4] KVM: selftests: Increase UCALL_MAX_ARGS to 7

2022-06-20 Thread Andrew Jones
On Wed, Jun 15, 2022 at 07:31:14PM +, Colton Lewis wrote:
> Increase UCALL_MAX_ARGS to 7 to allow GUEST_ASSERT_4 to pass 3 builtin
> ucall arguments specified in guest_assert_builtin_args plus 4
> user-specified arguments.
> 
> Signed-off-by: Colton Lewis 
> ---
>  tools/testing/selftests/kvm/include/ucall_common.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h
> index dbe872870b83..568c562f14cd 100644
> --- a/tools/testing/selftests/kvm/include/ucall_common.h
> +++ b/tools/testing/selftests/kvm/include/ucall_common.h
> @@ -16,7 +16,7 @@ enum {
>   UCALL_UNHANDLED,
>  };
>  
> -#define UCALL_MAX_ARGS 6
> +#define UCALL_MAX_ARGS 7
>  
>  struct ucall {
>   uint64_t cmd;
> -- 
> 2.36.1.476.g0c4daa206d-goog
>

Reviewed-by: Andrew Jones 

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 2/4] KVM: selftests: Increase UCALL_MAX_ARGS to 7

2022-06-20 Thread Andrew Jones
On Mon, Jun 20, 2022 at 09:21:11AM +0200, Andrew Jones wrote:
> On Sat, Jun 18, 2022 at 12:09:11AM +, Sean Christopherson wrote:
> > On Fri, Jun 17, 2022, Colton Lewis wrote:
> > > On Thu, Jun 16, 2022 at 02:10:06PM +0200, Andrew Jones wrote:
> > > > We probably want to ensure all architectures are good with this. afaict,
> > > > riscv only expects 6 args and uses UCALL_MAX_ARGS to cap the ucall inputs,
> > > > for example.
> > > 
> > > All architectures use UCALL_MAX_ARGS for that. Are you saying there
> > > might be limitations beyond the value of the macro? If so, who should
> > > verify whether this is ok?
> > 
> > I thought there were architectural limitations too, but I believe I was
> > thinking of vcpu_args_set(), where the number of params is limited by the
> > function call ABI, e.g. the number of registers.
> > 
> > Unless there's something really, really subtle going on, all architectures
> > pass the actual ucall struct purely through memory.  Actually, that code is
> > ripe for deduplication, and amazingly it doesn't conflict with Colton's
> > series.  Patches incoming...
> >
> 
> RISC-V uses sbi_ecall() for their implementation of ucall(). CC'ing Anup
> for confirmation, but if I understand the SBI spec correctly, then inputs
> are limited to registers a0-a5.

Ah, never mind... I see SBI is limited to 6 registers, but of course it
only needs one register to pass the uc address... We can make
UCALL_MAX_ARGS whatever we want.

Thanks,
drew



[PATCH v2 7/7] arm64/sve: Don't zero non-FPSIMD register state on syscall by default

2022-06-20 Thread Mark Brown
The documented syscall ABI specifies that the SVE state not shared with
FPSIMD is undefined after a syscall. Currently we implement this by
always flushing this register state to zero, ensuring consistent
behaviour but introducing some overhead in the case where we can return
directly to userspace without otherwise needing to update the register
state. Take advantage of the flexibility offered by the documented ABI
and instead leave the SVE registers untouched in the case where we can
return directly to userspace.

Since this is a user visible change a new sysctl abi.sve_syscall_clear_regs
is provided which will restore the current behaviour of flushing the
unshared register state unconditionally when enabled. This can be
enabled for testing or to work around problems with applications that
have been relying on the current flushing behaviour.

The sysctl is disabled by default since it is anticipated that the risk
of disruption to userspace is low. As well as being within the
documented ABI this new behaviour mirrors the standard function call ABI
for SVE in the AAPCS which should mean that compiler generated code is
unlikely to rely on the current behaviour, the main risk is from hand
coded assembly which directly invokes syscalls. The new behaviour is
also what is currently implemented by qemu user mode emulation.

Signed-off-by: Mark Brown 
---
 arch/arm64/kernel/syscall.c | 36 +++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 69b4c06f2e39..29ef3d65cf12 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -158,6 +158,40 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
syscall_trace_exit(regs);
 }
 
+
+static unsigned int sve_syscall_regs_clear;
+
+#ifdef CONFIG_ARM64_SVE
+/*
+ * Global sysctl to control if we force the SVE register state not
+ * shared with FPSIMD to be cleared on every syscall. If this is not
+ * enabled then we will leave the state unchanged unless we need to
+ * reload from memory (eg, after a context switch).
+ */
+
+static struct ctl_table sve_syscall_sysctl_table[] = {
+   {
+   .procname   = "sve_syscall_clear_regs",
+   .mode   = 0644,
   .data   = &sve_syscall_regs_clear,
+   .maxlen = sizeof(int),
+   .proc_handler   = proc_dointvec_minmax,
+   .extra1 = SYSCTL_ZERO,
+   .extra2 = SYSCTL_ONE,
+   },
+   { }
+};
+
+static int __init sve_syscall_sysctl_init(void)
+{
+   if (!register_sysctl("abi", sve_syscall_sysctl_table))
+   return -EINVAL;
+   return 0;
+}
+
+core_initcall(sve_syscall_sysctl_init);
+#endif /* CONFIG_ARM64_SVE */
+
 /*
  * As per the ABI exit SME streaming mode and clear the SVE state not
  * shared with FPSIMD on syscall entry.
@@ -183,7 +217,7 @@ static inline void fp_user_discard(void)
if (!system_supports_sve())
return;
 
-   if (test_thread_flag(TIF_SVE)) {
+   if (sve_syscall_regs_clear && test_thread_flag(TIF_SVE)) {
unsigned int sve_vq_minus_one;
 
sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
-- 
2.30.2



[PATCH v2 6/7] arm64/sve: Leave SVE enabled on syscall if we don't context switch

2022-06-20 Thread Mark Brown
The syscall ABI says that the SVE register state not shared with FPSIMD
may not be preserved on syscall, and this is the only mechanism we have
in the ABI to stop tracking the extra SVE state for a process. Currently
we do this unconditionally by means of disabling SVE for the process on
syscall, causing userspace to take a trap to EL1 if it uses SVE again.
These extra traps result in a noticeable overhead for using SVE instead
of FPSIMD in some workloads, especially for simple syscalls where we can
return directly to userspace and would not otherwise need to update the
floating point registers. Tests with fp-pidbench show an approximately
70% overhead on a range of implementations when SVE is in use - while
this is an extreme and entirely artificial benchmark it is clear that
there is some useful room for improvement here.

Now that we have the ability to track the decision about what to save
separately to TIF_SVE we can improve things by leaving TIF_SVE enabled on
syscall but only saving the FPSIMD registers if we are in a syscall.
This means that if we need to restore the register state from memory
(eg, after a context switch or kernel mode NEON) we will drop TIF_SVE
and reenable traps for userspace but if we can just return to userspace
then traps will remain disabled.

Since our current implementation has the effect of zeroing all the SVE
register state not shared with FPSIMD on syscall we replace the
disabling of TIF_SVE with a flush of the non-shared register state, this
means that there is still some overhead for syscalls when SVE is in use
but it is much reduced.

Signed-off-by: Mark Brown 
---
 arch/arm64/kernel/fpsimd.c  |  8 +++-
 arch/arm64/kernel/syscall.c | 19 +--
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index f14452b7a629..5ec13c8bf98b 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -480,7 +480,13 @@ static void fpsimd_save(void)
if (test_thread_flag(TIF_FOREIGN_FPSTATE))
return;
 
-   if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE)) ||
+   /*
+* If a task is in a syscall the ABI allows us to only
+* preserve the state shared with FPSIMD so don't bother
+* saving the full SVE state in that case.
+*/
+   if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE) &&
+!in_syscall(current_pt_regs())) ||
last->to_save == FP_STATE_SVE) {
save_sve_regs = true;
save_ffr = true;
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 733451fe7e41..69b4c06f2e39 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -183,21 +183,12 @@ static inline void fp_user_discard(void)
if (!system_supports_sve())
return;
 
-   /*
-* If SME is not active then disable SVE, the registers will
-* be cleared when userspace next attempts to access them and
-* we do not need to track the SVE register state until then.
-*/
-   clear_thread_flag(TIF_SVE);
+   if (test_thread_flag(TIF_SVE)) {
+   unsigned int sve_vq_minus_one;
 
-   /*
-* task_fpsimd_load() won't be called to update CPACR_EL1 in
-* ret_to_user unless TIF_FOREIGN_FPSTATE is still set, which only
-* happens if a context switch or kernel_neon_begin() or context
-* modification (sigreturn, ptrace) intervenes.
-* So, ensure that CPACR_EL1 is already correct for the fast-path case.
-*/
-   sve_user_disable();
+   sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
+   sve_flush_live(true, sve_vq_minus_one);
+   }
 }
 
 void do_el0_svc(struct pt_regs *regs)
-- 
2.30.2



[PATCH v2 5/7] arm64/fpsimd: Load FP state based on recorded data type

2022-06-20 Thread Mark Brown
Now that we are recording the type of floating point register state we
are saving when we save it we can use that information when we load to
decide which register state is required and bring the TIF_SVE state into
sync with the loaded register state.

The SME state details are already recorded directly in the saved
SVCR and handled based on the information there.

Since we are not changing any of the save paths there should be no
functional change from this patch, further patches will make use of this
to optimise and clarify the code.

Signed-off-by: Mark Brown 
---
 arch/arm64/kernel/fpsimd.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index ebe66d8c66e8..f14452b7a629 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -391,11 +391,38 @@ static void task_fpsimd_load(void)
WARN_ON(!system_supports_fpsimd());
WARN_ON(!have_cpu_fpsimd_context());
 
-   /* Check if we should restore SVE first */
-   if (IS_ENABLED(CONFIG_ARM64_SVE) && test_thread_flag(TIF_SVE)) {
-   sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
-   restore_sve_regs = true;
-   restore_ffr = true;
+   if (system_supports_sve()) {
+   switch (current->thread.fp_type) {
+   case FP_STATE_FPSIMD:
+   /* Stop tracking SVE for this task until next use. */
+   if (test_and_clear_thread_flag(TIF_SVE))
+   sve_user_disable();
+   break;
+   case FP_STATE_SVE:
+   /*
+* A thread with SVE state should either be in
+* streaming mode or already have SVE enabled.
+*/
+   if (!thread_sm_enabled(&current->thread) &&
+   !WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE)))
+   sve_user_enable();
+
+   sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
+   restore_sve_regs = true;
+   restore_ffr = true;
+   break;
+   default:
+   /*
+* This should never happen, we should always
+* record what we saved when we save. We
+* always at least have the memory allocated
+* for FPSIMD registers so try that and hope
+* for the best.
+*/
+   WARN_ON_ONCE(1);
+   clear_thread_flag(TIF_SVE);
+   break;
+   }
}
 
/* Restore SME, override SVE register configuration if needed */
-- 
2.30.2



[PATCH v2 4/7] arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM

2022-06-20 Thread Mark Brown
Now that we are explicitly telling the host FP code which register state
it needs to save we can remove the manipulation of TIF_SVE from the KVM
code, simplifying it and allowing us to optimise our handling of normal
tasks. Remove the manipulation of TIF_SVE from KVM and instead rely on
to_save to ensure we save the correct data for it.

Signed-off-by: Mark Brown 
---
 arch/arm64/kernel/fpsimd.c | 22 --
 arch/arm64/kvm/fpsimd.c|  3 ---
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 95c95411bd42..ebe66d8c66e8 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -435,8 +435,8 @@ static void task_fpsimd_load(void)
  * last, if KVM is involved this may be the guest VM context rather
  * than the host thread for the VM pointed to by current. This means
  * that we must always reference the state storage via last rather
- * than via current, other than the TIF_ flags which KVM will
- * carefully maintain for us.
+ * than via current, if we are saving KVM state then it will have
+ * ensured that the type of registers to save is set in last->to_save.
  */
 static void fpsimd_save(void)
 {
@@ -453,27 +453,13 @@ static void fpsimd_save(void)
if (test_thread_flag(TIF_FOREIGN_FPSTATE))
return;
 
-   if (test_thread_flag(TIF_SVE)) {
+   if ((last->to_save == FP_STATE_TASK && test_thread_flag(TIF_SVE)) ||
+   last->to_save == FP_STATE_SVE) {
save_sve_regs = true;
save_ffr = true;
vl = last->sve_vl;
}
 
-   /*
-* For now we're just validating that the requested state is
-* consistent with what we'd otherwise work out.
-*/
-   switch (last->to_save) {
-   case FP_STATE_TASK:
-   break;
-   case FP_STATE_FPSIMD:
-   WARN_ON_ONCE(save_sve_regs);
-   break;
-   case FP_STATE_SVE:
-   WARN_ON_ONCE(!save_sve_regs);
-   break;
-   }
-
if (system_supports_sme()) {
u64 *svcr = last->svcr;
*svcr = read_sysreg_s(SYS_SVCR);
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 542c71b16451..54131a57130e 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -150,7 +150,6 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 &vcpu->arch.fp_type, fp_type);
 
clear_thread_flag(TIF_FOREIGN_FPSTATE);
-   update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu));
}
 }
 
@@ -207,7 +206,5 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
sysreg_clear_set(CPACR_EL1, CPACR_EL1_ZEN_EL0EN, 0);
}
 
-   update_thread_flag(TIF_SVE, 0);
-
local_irq_restore(flags);
 }
-- 
2.30.2



[PATCH v2 3/7] arm64/fpsimd: Have KVM explicitly say which FP registers to save

2022-06-20 Thread Mark Brown
In order to avoid needlessly saving and restoring the guest registers KVM
relies on the host FPSIMD code to save the guest registers when we context
switch away from the guest. This is done by binding the KVM guest state to
the CPU on top of the task state that was originally there, then carefully
managing the TIF_SVE flag for the task to cause the host to save the full
SVE state when needed regardless of the needs of the host task. This works
well enough but isn't terribly direct about what is going on and makes it
much more complicated to try to optimise what we're doing with the SVE
register state.

Let's instead have KVM pass in the register state it wants saving when it
binds to the CPU. We introduce a new FP_TYPE_TASK for use during normal
task binding to indicate that we should base our decisions on the current
task. In order to ease any future debugging that might be required this
patch does not actually update any of the decision making about what to
save, it merely starts tracking the new information and warns if the
requested state is not what we would otherwise have decided to save.

Signed-off-by: Mark Brown 
---
 arch/arm64/include/asm/fpsimd.h|  3 ++-
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/kernel/fpsimd.c | 20 +++-
 arch/arm64/kvm/fpsimd.c|  9 -
 4 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 5762419fdcc0..e008965719a4 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -61,7 +61,8 @@ extern void fpsimd_kvm_prepare(void);
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 void *sve_state, unsigned int sve_vl,
 void *za_state, unsigned int sme_vl,
-u64 *svcr, enum fp_state *type);
+u64 *svcr, enum fp_state *type,
+enum fp_state to_save);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_save_and_flush_cpu_state(void);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 192986509a8e..7d9f0c95b352 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -123,6 +123,7 @@ enum vec_type {
 };
 
 enum fp_state {
+   FP_STATE_TASK,  /* Save based on current, invalid as fp_type */
FP_STATE_FPSIMD,
FP_STATE_SVE,
 };
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index fdb2925becdf..95c95411bd42 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -126,6 +126,7 @@ struct fpsimd_last_state_struct {
unsigned int sve_vl;
unsigned int sme_vl;
enum fp_state *type;
+   enum fp_state to_save;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -458,6 +459,21 @@ static void fpsimd_save(void)
vl = last->sve_vl;
}
 
+   /*
+* For now we're just validating that the requested state is
+* consistent with what we'd otherwise work out.
+*/
+   switch (last->to_save) {
+   case FP_STATE_TASK:
+   break;
+   case FP_STATE_FPSIMD:
+   WARN_ON_ONCE(save_sve_regs);
+   break;
+   case FP_STATE_SVE:
+   WARN_ON_ONCE(!save_sve_regs);
+   break;
+   }
+
if (system_supports_sme()) {
u64 *svcr = last->svcr;
*svcr = read_sysreg_s(SYS_SVCR);
@@ -1702,6 +1718,7 @@ static void fpsimd_bind_task_to_cpu(void)
last->sme_vl = task_get_sme_vl(current);
last->svcr = &current->thread.svcr;
last->type = &current->thread.fp_type;
+   last->to_save = FP_STATE_TASK;
current->thread.fpsimd_cpu = smp_processor_id();
 
/*
@@ -1726,7 +1743,7 @@ static void fpsimd_bind_task_to_cpu(void)
 void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
  unsigned int sve_vl, void *za_state,
  unsigned int sme_vl, u64 *svcr,
- enum fp_state *type)
+ enum fp_state *type, enum fp_state to_save)
 {
struct fpsimd_last_state_struct *last =
this_cpu_ptr(&fpsimd_last_state);
@@ -1741,6 +1758,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
last->sve_vl = sve_vl;
last->sme_vl = sme_vl;
last->type = type;
+   last->to_save = to_save;
 }
 
 /*
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index be3ddb214ab1..542c71b16451 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -129,9 +129,16 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu)
  */
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 {
+

[PATCH v2 2/7] arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE

2022-06-20 Thread Mark Brown
When we save the state for the floating point registers this can be done
in the form visible through either the FPSIMD V registers or the SVE Z and
P registers. At present we track which format is currently used based on
TIF_SVE and the SME streaming mode state but particularly in the SVE case
this limits our options for optimising things, especially around syscalls.
Introduce a new enum in thread_struct which explicitly states which format
is active and keep it up to date when we change it.

At present we do not use this state except to verify that it has the
expected value when loading the state, future patches will introduce
functional changes.

Signed-off-by: Mark Brown 
---
 arch/arm64/include/asm/fpsimd.h|  2 +-
 arch/arm64/include/asm/kvm_host.h  |  1 +
 arch/arm64/include/asm/processor.h |  6 
 arch/arm64/kernel/fpsimd.c | 57 ++
 arch/arm64/kernel/process.c|  2 ++
 arch/arm64/kernel/ptrace.c |  6 ++--
 arch/arm64/kernel/signal.c |  3 ++
 arch/arm64/kvm/fpsimd.c|  3 +-
 8 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 847fd119cdb8..5762419fdcc0 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -61,7 +61,7 @@ extern void fpsimd_kvm_prepare(void);
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 void *sve_state, unsigned int sve_vl,
 void *za_state, unsigned int sme_vl,
-u64 *svcr);
+u64 *svcr, enum fp_state *type);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void fpsimd_save_and_flush_cpu_state(void);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index de32152cea04..250e23f221c4 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -310,6 +310,7 @@ struct kvm_vcpu_arch {
void *sve_state;
unsigned int sve_max_vl;
u64 svcr;
+   enum fp_state fp_type;
 
/* Stage 2 paging state used by the hardware on next switch */
struct kvm_s2_mmu *hw_mmu;
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 9e58749db21d..192986509a8e 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -122,6 +122,11 @@ enum vec_type {
ARM64_VEC_MAX,
 };
 
+enum fp_state {
+   FP_STATE_FPSIMD,
+   FP_STATE_SVE,
+};
+
 struct cpu_context {
unsigned long x19;
unsigned long x20;
@@ -152,6 +157,7 @@ struct thread_struct {
struct user_fpsimd_state fpsimd_state;
} uw;
 
+   enum fp_state   fp_type;/* registers FPSIMD or SVE? */
unsigned intfpsimd_cpu;
void*sve_state; /* SVE registers, if any */
void*za_state;  /* ZA register, if any */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index d67e658f1e24..fdb2925becdf 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -125,6 +125,7 @@ struct fpsimd_last_state_struct {
u64 *svcr;
unsigned int sve_vl;
unsigned int sme_vl;
+   enum fp_state *type;
 };
 
 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state);
@@ -330,15 +331,6 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *The task can execute SVE instructions while in userspace without
  *trapping to the kernel.
  *
- *When stored, Z0-Z31 (incorporating Vn in bits[127:0] or the
- *corresponding Zn), P0-P15 and FFR are encoded in
- *task->thread.sve_state, formatted appropriately for vector
- *length task->thread.sve_vl or, if SVCR.SM is set,
- *task->thread.sme_vl.
- *
- *task->thread.sve_state must point to a valid buffer at least
- *sve_state_size(task) bytes in size.
- *
  *During any syscall, the kernel may optionally clear TIF_SVE and
  *discard the vector state except for the FPSIMD subset.
  *
@@ -348,7 +340,15 @@ void task_set_vl_onexec(struct task_struct *task, enum vec_type type,
  *do_sve_acc() to be called, which does some preparation and then
  *sets TIF_SVE.
  *
- *When stored, FPSIMD registers V0-V31 are encoded in
+ * During any syscall, the kernel may optionally clear TIF_SVE and
+ * discard the vector state except for the FPSIMD subset.
+ *
+ * The data will be stored in one of two formats:
+ *
+ *  * FPSIMD only - FP_STATE_FPSIMD:
+ *
+ *When the FPSIMD only state is stored, task->thread.fp_type is set to
+ *FP_STATE_FPSIMD and the FPSIMD registers V0-V31 are encoded in
  *task->thread.uw.fpsimd_state; bits [max : 128] for each of Z0-Z31 are
  *logically zero but not stored anywhere; P0-P15 and 

[PATCH v2 1/7] KVM: arm64: Discard any SVE state when entering KVM guests

2022-06-20 Thread Mark Brown
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving)
KVM has not tracked the host SVE state, relying on the fact that we
currently disable SVE whenever we perform a syscall. This may not be true
in future since performance optimisation may result in us keeping SVE
enabled in order to avoid needing to take access traps to reenable it.
Handle this by clearing TIF_SVE and converting the stored task state to
FPSIMD format when preparing to run the guest.  This is done with a new
call fpsimd_kvm_prepare() to keep the direct state manipulation
functions internal to fpsimd.c.

Signed-off-by: Mark Brown 
---
 arch/arm64/include/asm/fpsimd.h |  1 +
 arch/arm64/kernel/fpsimd.c  | 23 +++
 arch/arm64/kvm/fpsimd.c |  3 ++-
 3 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 9bb1873f5295..847fd119cdb8 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -56,6 +56,7 @@ extern void fpsimd_signal_preserve_current_state(void);
 extern void fpsimd_preserve_current_state(void);
 extern void fpsimd_restore_current_state(void);
 extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
+extern void fpsimd_kvm_prepare(void);
 
 extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state,
 void *sve_state, unsigned int sve_vl,
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index aecf3071efdd..d67e658f1e24 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1637,6 +1637,29 @@ void fpsimd_signal_preserve_current_state(void)
sve_to_fpsimd(current);
 }
 
+/*
+ * Called by KVM when entering the guest.
+ */
+void fpsimd_kvm_prepare(void)
+{
+   if (!system_supports_sve())
+   return;
+
+   /*
+* KVM does not save host SVE state since we can only enter
+* the guest from a syscall so the ABI means that only the
+* non-saved SVE state needs to be saved.  If we have left
+* SVE enabled for performance reasons then update the task
+* state to be FPSIMD only.
+*/
+   get_cpu_fpsimd_context();
+
+   if (test_and_clear_thread_flag(TIF_SVE))
+   sve_to_fpsimd(current);
+
+   put_cpu_fpsimd_context();
+}
+
 /*
  * Associate current's FPSIMD context with this cpu
  * The caller must have ownership of the cpu FPSIMD context before calling
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 6012b08ecb14..a433ee8da232 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -75,7 +75,8 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
 {
BUG_ON(!current->mm);
-   BUG_ON(test_thread_flag(TIF_SVE));
+
+   fpsimd_kvm_prepare();
 
vcpu->arch.flags &= ~KVM_ARM64_FP_ENABLED;
vcpu->arch.flags |= KVM_ARM64_FP_HOST;
-- 
2.30.2



[PATCH v2 0/7] arm64/sve: Clean up KVM integration and optimise syscalls

2022-06-20 Thread Mark Brown
This patch series attempts to clarify the tracking of which set of
floating point registers we save on systems supporting SVE, particularly
with reference to KVM, and then uses the results of this clarification
to improve the performance of simple syscalls where we return directly
to userspace in cases where userspace is using SVE.

At present we track which register state is active by using the TIF_SVE
flag for the current task which also controls if userspace is able to
use SVE, this is reasonably straightforward if limiting but for KVM it
gets a bit hairy since we may have guest state loaded in registers. This
results in KVM modifying TIF_SVE for the VMM task while the guest is
running which doesn't entirely help make things easy to follow. To help
make things clearer the series changes things so that in addition to
TIF_SVE we explicitly track both the type of registers that are
currently saved in the task struct and the type of registers that we
should save when we do so. TIF_SVE then solely controls if userspace
can use SVE without trapping, it has no function for KVM guests and we
can remove the code for managing it from KVM.

The refactoring to add the separate tracking is initially done by adding
the new state together with checks that the state corresponds to
expectations when we look at it before subsequent patches make use of
the separated state, the goal being to both split out the more repetitive
bits of that change and make it easier to debug any problems that might
arise.

With the state tracked separately we then start to optimise the
performance of syscalls when the process is using SVE. Currently every
syscall disables SVE for userspace which means that we need to trap to
EL1 again on the next SVE instruction, flush the SVE registers, and
reenable SVE for EL0, creating overhead for tasks that mix SVE and
syscalls. We build on the above refactoring to eliminate this overhead
for simple syscalls which return directly to userspace by:

 - Keeping SVE enabled.
 - Not flushing the SVE state.

The removal of flushing is within our currently documented ABI but is a
change in our effective ABI so a sysctl is provided to revert to current
behaviour in case of problems or to allow testing of userspace.  If we
don't want to do this I think we should tighten our ABI documentation,
previously Catalin had indicated that he didn't want to tighten it.

v2:
 - Rebase onto v5.19-rc3.
 - Don't warn when restoring streaming mode SVE without TIF_SVE.

Mark Brown (7):
  KVM: arm64: Discard any SVE state when entering KVM guests
  arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE
  arm64/fpsimd: Have KVM explicitly say which FP registers to save
  arm64/fpsimd: Stop using TIF_SVE to manage register saving in KVM
  arm64/fpsimd: Load FP state based on recorded data type
  arm64/sve: Leave SVE enabled on syscall if we don't context switch
  arm64/sve: Don't zero non-FPSIMD register state on syscall by default

 arch/arm64/include/asm/fpsimd.h|   4 +-
 arch/arm64/include/asm/kvm_host.h  |   1 +
 arch/arm64/include/asm/processor.h |   7 ++
 arch/arm64/kernel/fpsimd.c | 131 -
 arch/arm64/kernel/process.c|   2 +
 arch/arm64/kernel/ptrace.c |   6 +-
 arch/arm64/kernel/signal.c |   3 +
 arch/arm64/kernel/syscall.c|  53 +---
 arch/arm64/kvm/fpsimd.c|  16 ++--
 9 files changed, 179 insertions(+), 44 deletions(-)


base-commit: a111daf0c53ae91e71fd2bfe7497862d14132e3e
-- 
2.30.2



Re: [PATCH 4/4] KVM: selftests: Fix filename reporting in guest asserts

2022-06-20 Thread Paolo Bonzini

On 6/16/22 14:45, Andrew Jones wrote:

+#define __GUEST_ASSERT(_condition, _condstr, _nargs, _args...) do {\
+   if (!(_condition))  \
+   ucall(UCALL_ABORT, GUEST_ASSERT_BUILTIN_NARGS + _nargs, \
+ "Failed guest assert: " _condstr,   \
+ __FILE__, \
+ __LINE__, \
+ ##_args); \

We don't need another level of indentation nor the ## operator on _args.



The ## is needed to drop the comma if there are no _args.

Paolo



Re: [PATCH 0/3] KVM: selftests: Consolidate ucall code

2022-06-20 Thread Paolo Bonzini

On 6/18/22 02:16, Sean Christopherson wrote:

Consolidate the code for making and getting ucalls.  All architectures pass
the ucall struct via memory, so filling and copying the struct is 100%
generic.  The only per-arch code is sending and receiving the address of
said struct.

Tested on x86 and arm, compile tested on s390 and RISC-V.


I'm not sure about doing this yet.  The SEV tests added multiple 
implementations of the ucalls in one architecture.  I have rebased those 
recently (not the SEV part) to get more familiar with the new kvm_vcpu 
API for selftests, and was going to look at your old review next...


Paolo



Re: [PATCH 2/4] KVM: selftests: Increase UCALL_MAX_ARGS to 7

2022-06-20 Thread Anup Patel
On Mon, Jun 20, 2022 at 12:51 PM Andrew Jones  wrote:
>
> On Sat, Jun 18, 2022 at 12:09:11AM +, Sean Christopherson wrote:
> > On Fri, Jun 17, 2022, Colton Lewis wrote:
> > > On Thu, Jun 16, 2022 at 02:10:06PM +0200, Andrew Jones wrote:
> > > > We probably want to ensure all architectures are good with this. afaict,
> > > > riscv only expects 6 args and uses UCALL_MAX_ARGS to cap the ucall
> > > > inputs, for example.
> > >
> > > All architectures use UCALL_MAX_ARGS for that. Are you saying there
> > > might be limitations beyond the value of the macro? If so, who should
> > > verify whether this is ok?
> >
> > I thought there were architectural limitations too, but I believe I was
> > thinking of vcpu_args_set(), where the number of params is limited by the
> > function call ABI, e.g. the number of registers.
> >
> > Unless there's something really, really subtle going on, all architectures
> > pass the actual ucall struct purely through memory.  Actually, that code
> > is ripe for deduplication, and amazingly it doesn't conflict with Colton's
> > series.  Patches incoming...
> >
>
> RISC-V uses sbi_ecall() for their implementation of ucall(). CC'ing Anup
> for confirmation, but if I understand the SBI spec correctly, then inputs
> are limited to registers a0-a5.

Yes, we only have 6 parameters in ucall() since it is based on SBI spec.

Actually, a6 and a7 are used to identify the type of SBI call (i.e. extension
and function number) whereas a0-a5 are function parameters.

Regards,
Anup

>
> Thanks,
> drew
>


Re: [PATCH 1/3] KVM: selftests: Consolidate common code for populating ucall struct

2022-06-20 Thread Christian Borntraeger




Am 18.06.22 um 02:16 schrieb Sean Christopherson:

Make ucall() a common helper that populates struct ucall, and only calls
into arch code to make the actual call out to userspace.

Rename all arch-specific helpers to make it clear they're arch-specific,
and to avoid collisions with common helpers (one more on its way...)

No functional change intended.

Cc: Colton Lewis 
Cc: Andrew Jones 
Signed-off-by: Sean Christopherson 


seems to work on s390.
Tested-by: Christian Borntraeger 


---
  tools/testing/selftests/kvm/Makefile  |  1 +
  .../selftests/kvm/include/ucall_common.h  | 23 ---
  .../testing/selftests/kvm/lib/aarch64/ucall.c | 23 ---
  tools/testing/selftests/kvm/lib/riscv/ucall.c | 23 ---
  tools/testing/selftests/kvm/lib/s390x/ucall.c | 23 ---
  .../testing/selftests/kvm/lib/ucall_common.c  | 20 
  .../testing/selftests/kvm/lib/x86_64/ucall.c  | 23 ---
  7 files changed, 61 insertions(+), 75 deletions(-)
  create mode 100644 tools/testing/selftests/kvm/lib/ucall_common.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index b52c130f7b2f..bc2aee2af66c 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -46,6 +46,7 @@ LIBKVM += lib/perf_test_util.c
  LIBKVM += lib/rbtree.c
  LIBKVM += lib/sparsebit.c
  LIBKVM += lib/test_util.c
+LIBKVM += lib/ucall_common.c
  
  LIBKVM_x86_64 += lib/x86_64/apic.c
  LIBKVM_x86_64 += lib/x86_64/handlers.S
diff --git a/tools/testing/selftests/kvm/include/ucall_common.h b/tools/testing/selftests/kvm/include/ucall_common.h
index 98562f685151..c6a4fd7fe443 100644
--- a/tools/testing/selftests/kvm/include/ucall_common.h
+++ b/tools/testing/selftests/kvm/include/ucall_common.h
@@ -23,10 +23,27 @@ struct ucall {
uint64_t args[UCALL_MAX_ARGS];
  };
  
-void ucall_init(struct kvm_vm *vm, void *arg);
-void ucall_uninit(struct kvm_vm *vm);
+void ucall_arch_init(struct kvm_vm *vm, void *arg);
+void ucall_arch_uninit(struct kvm_vm *vm);
+void ucall_arch_do_ucall(vm_vaddr_t uc);
+uint64_t ucall_arch_get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc);
+
  void ucall(uint64_t cmd, int nargs, ...);
-uint64_t get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc);
+
+static inline void ucall_init(struct kvm_vm *vm, void *arg)
+{
+   ucall_arch_init(vm, arg);
+}
+
+static inline void ucall_uninit(struct kvm_vm *vm)
+{
+   ucall_arch_uninit(vm);
+}
+
+static inline uint64_t get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc)
+{
+   return ucall_arch_get_ucall(vcpu, uc);
+}
  
  #define GUEST_SYNC_ARGS(stage, arg1, arg2, arg3, arg4)	\
  	ucall(UCALL_SYNC, 6, "hello", stage, arg1, arg2, arg3, arg4)
diff --git a/tools/testing/selftests/kvm/lib/aarch64/ucall.c b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
index 0b949ee06b5e..2de9fdd34159 100644
--- a/tools/testing/selftests/kvm/lib/aarch64/ucall.c
+++ b/tools/testing/selftests/kvm/lib/aarch64/ucall.c
@@ -21,7 +21,7 @@ static bool ucall_mmio_init(struct kvm_vm *vm, vm_paddr_t gpa)
return true;
  }
  
-void ucall_init(struct kvm_vm *vm, void *arg)
+void ucall_arch_init(struct kvm_vm *vm, void *arg)
  {
vm_paddr_t gpa, start, end, step, offset;
unsigned int bits;
@@ -64,31 +64,18 @@ void ucall_init(struct kvm_vm *vm, void *arg)
TEST_FAIL("Can't find a ucall mmio address");
  }
  
-void ucall_uninit(struct kvm_vm *vm)
+void ucall_arch_uninit(struct kvm_vm *vm)
  {
ucall_exit_mmio_addr = 0;
sync_global_to_guest(vm, ucall_exit_mmio_addr);
  }
  
-void ucall(uint64_t cmd, int nargs, ...)
+void ucall_arch_do_ucall(vm_vaddr_t uc)
  {
-   struct ucall uc = {
-   .cmd = cmd,
-   };
-   va_list va;
-   int i;
-
-   nargs = min(nargs, UCALL_MAX_ARGS);
-
-   va_start(va, nargs);
-   for (i = 0; i < nargs; ++i)
-   uc.args[i] = va_arg(va, uint64_t);
-   va_end(va);
-
-   *ucall_exit_mmio_addr = (vm_vaddr_t)&uc;
+   *ucall_exit_mmio_addr = uc;
  }
  
-uint64_t get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc)
+uint64_t ucall_arch_get_ucall(struct kvm_vcpu *vcpu, struct ucall *uc)
  {
struct kvm_run *run = vcpu->run;
struct ucall ucall = {};
diff --git a/tools/testing/selftests/kvm/lib/riscv/ucall.c b/tools/testing/selftests/kvm/lib/riscv/ucall.c
index 087b9740bc8f..b1598f418c1f 100644
--- a/tools/testing/selftests/kvm/lib/riscv/ucall.c
+++ b/tools/testing/selftests/kvm/lib/riscv/ucall.c
@@ -10,11 +10,11 @@
  #include "kvm_util.h"
  #include "processor.h"
  
-void ucall_init(struct kvm_vm *vm, void *arg)
+void ucall_arch_init(struct kvm_vm *vm, void *arg)
  {
  }
  
-void ucall_uninit(struct kvm_vm *vm)
+void ucall_arch_uninit(struct kvm_vm *vm)
  {
  }
  
@@ -44,27 +44,14 @@ struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0,

return ret;
  }
  
-void 

Re: [PATCH 0/3] KVM: selftests: Consolidate ucall code

2022-06-20 Thread Andrew Jones
On Sat, Jun 18, 2022 at 12:16:15AM +, Sean Christopherson wrote:
> Consolidate the code for making and getting ucalls.  All architectures pass
> the ucall struct via memory, so filling and copying the struct is 100%
> generic.  The only per-arch code is sending and receiving the address of
> said struct.
> 
> Tested on x86 and arm, compile tested on s390 and RISC-V.

For the series

Reviewed-by: Andrew Jones 

Thanks,
drew



Re: [PATCH 2/4] KVM: selftests: Increase UCALL_MAX_ARGS to 7

2022-06-20 Thread Andrew Jones
On Sat, Jun 18, 2022 at 12:09:11AM +, Sean Christopherson wrote:
> On Fri, Jun 17, 2022, Colton Lewis wrote:
> > On Thu, Jun 16, 2022 at 02:10:06PM +0200, Andrew Jones wrote:
> > > We probably want to ensure all architectures are good with this. afaict,
> > > riscv only expects 6 args and uses UCALL_MAX_ARGS to cap the ucall inputs,
> > > for example.
> > 
> > All architectures use UCALL_MAX_ARGS for that. Are you saying there
> > might be limitations beyond the value of the macro? If so, who should
> > verify whether this is ok?
> 
> I thought there were architectural limitations too, but I believe I was
> thinking of vcpu_args_set(), where the number of params is limited by the
> function call ABI, e.g. the number of registers.
> 
> Unless there's something really, really subtle going on, all architectures
> pass the actual ucall struct purely through memory.  Actually, that code
> is ripe for deduplication, and amazingly it doesn't conflict with Colton's
> series.  Patches incoming...
>

RISC-V uses sbi_ecall() for their implementation of ucall(). CC'ing Anup
for confirmation, but if I understand the SBI spec correctly, then inputs
are limited to registers a0-a5.

Thanks,
drew
