Re: [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface

2018-11-21 Thread Dave Martin
On Wed, Nov 21, 2018 at 03:20:15PM +, Alex Bennée wrote:
> 
> Dave Martin  writes:
> 
> > This patch adds the following registers for access via the
> > KVM_{GET,SET}_ONE_REG interface:
> >
> >  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
> >  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
> >  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
> >
> > In order to adapt gracefully to future architectural extensions,
> > the registers are divided up into slices as noted above:  the i
> > parameter denotes the slice index.
> >
> > For simplicity, bits or slices that exceed the maximum vector
> > length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> > read as zero for KVM_GET_ONE_REG.
> >
> > For the current architecture, only slice i = 0 is significant.  The
> > interface design allows i to increase to up to 31 in the future if
> > required by future architectural amendments.
> >
> > The registers are only visible for vcpus that have SVE enabled.
> > They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> > have SVE.  In all cases, surplus slices are not enumerated by
> > KVM_GET_REG_LIST.
> >
> > Accesses to the FPSIMD registers via KVM_REG_ARM_CORE is not
> > allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> > KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> > register state.  This avoids some complex and pointless emluation
> > in the kernel.
> >
> > Signed-off-by: Dave Martin 
> > ---
> >
> > Changes since RFCv1:
> >
> >  * Refactored to remove emulation of FPSIMD registers with the SVE
> >register view and vice-versa.  This simplifies the code a fair bit.
> >
> >  * Fixed a couple of range errors.
> >
> >  * Inlined various trivial helpers that now have only one call site.
> >
> >  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
> >sizes.
> > ---
> >  arch/arm64/include/uapi/asm/kvm.h |  10 +++
> >  arch/arm64/kvm/guest.c| 147 
> > ++
> >  2 files changed, 145 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> > b/arch/arm64/include/uapi/asm/kvm.h
> > index 97c3478..1ff68fa 100644
> > --- a/arch/arm64/include/uapi/asm/kvm.h
> > +++ b/arch/arm64/include/uapi/asm/kvm.h
> > @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
> >  KVM_REG_ARM_FW | ((r) & 0x))
> >  #define KVM_REG_ARM_PSCI_VERSION   KVM_REG_ARM_FW_REG(0)
> >
> > +/* SVE registers */
> > +#define KVM_REG_ARM64_SVE  (0x15 << KVM_REG_ARM_COPROC_SHIFT)
> > +#define KVM_REG_ARM64_SVE_ZREG(n, i)   (KVM_REG_ARM64 | 
> > KVM_REG_ARM64_SVE | \
> > +KVM_REG_SIZE_U2048 |   \
> > +((n) << 5) | (i))
> > +#define KVM_REG_ARM64_SVE_PREG(n, i)   (KVM_REG_ARM64 | 
> > KVM_REG_ARM64_SVE | \
> > +KVM_REG_SIZE_U256 |\
> > +((n) << 5) | (i) | 0x400)
> 
> What's the 0x400 for? Aren't PREG's already unique by being 256 bit vs
> the Z regs 2048 bit size?

I was treating the reg size field as metadata rather than being part
of the ID, so the IDs remain unique even with the size field masked
out.

For the core regs, we explicitly allow access to the same underlying
regs using a mix of access sizes (whether or not that is a good idea is
another question).

For the rest of the regs perhaps we can rely on the size field to
disambiguate different regs in practice, but I didn't reel comfortable
making that assumption...

> 
> > +#define KVM_REG_ARM64_SVE_FFR(i)   KVM_REG_ARM64_SVE_PREG(16, i)
> > +
> >  /* Device Control API: ARM VGIC */
> >  #define KVM_DEV_ARM_VGIC_GRP_ADDR  0
> >  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 953a5c9..320db0f 100644
> 
> >
> > @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const 
> > struct kvm_one_reg *reg)
> > return err;
> >  }
> >
> > +struct kreg_region {
> > +   char *kptr;
> > +   size_t size;
> > +   size_t zeropad;
> > +};
> > +
> > +#define SVE_REG_SLICE_SHIFT0
> > +#define SVE_REG_SLICE_BITS 5
> > +#define SVE_REG_ID_SHIFT   (SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> > +#define SVE_REG_ID_BITS5
> > +
> > +#define SVE_REG_SLICE_MASK \
> > +   (GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> > +#define SVE_REG_ID_MASK\
> > +   (GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> > +
> 
> I guess this all comes out in the wash once the constants are folded but
> GENMASK does seem to be designed for arbitrary bit positions:
> 
>   #define SVE_REG_SLICE_MASK \
>  GEN_MASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, 
> SVE_REG_SLICE_SHIFT)
> 
> Hmm I guess that might be even harder to follow...

Swings and roundabouts...

I'm not sure I prefer your version, but I agree it'

Re: [RFC PATCH v2 15/23] KVM: arm64/sve: Add SVE support to register access ioctl interface

2018-11-21 Thread Alex Bennée

Dave Martin  writes:

> This patch adds the following registers for access via the
> KVM_{GET,SET}_ONE_REG interface:
>
>  * KVM_REG_ARM64_SVE_ZREG(n, i) (n = 0..31) (in 2048-bit slices)
>  * KVM_REG_ARM64_SVE_PREG(n, i) (n = 0..15) (in 256-bit slices)
>  * KVM_REG_ARM64_SVE_FFR(i) (in 256-bit slices)
>
> In order to adapt gracefully to future architectural extensions,
> the registers are divided up into slices as noted above:  the i
> parameter denotes the slice index.
>
> For simplicity, bits or slices that exceed the maximum vector
> length supported for the vcpu are ignored for KVM_SET_ONE_REG, and
> read as zero for KVM_GET_ONE_REG.
>
> For the current architecture, only slice i = 0 is significant.  The
> interface design allows i to increase to up to 31 in the future if
> required by future architectural amendments.
>
> The registers are only visible for vcpus that have SVE enabled.
> They are not enumerated by KVM_GET_REG_LIST on vcpus that do not
> have SVE.  In all cases, surplus slices are not enumerated by
> KVM_GET_REG_LIST.
>
> Accesses to the FPSIMD registers via KVM_REG_ARM_CORE is not
> allowed for SVE-enabled vcpus: SVE-aware userspace can use the
> KVM_REG_ARM64_SVE_ZREG() interface instead to access the same
> register state.  This avoids some complex and pointless emluation
> in the kernel.
>
> Signed-off-by: Dave Martin 
> ---
>
> Changes since RFCv1:
>
>  * Refactored to remove emulation of FPSIMD registers with the SVE
>register view and vice-versa.  This simplifies the code a fair bit.
>
>  * Fixed a couple of range errors.
>
>  * Inlined various trivial helpers that now have only one call site.
>
>  * Use KVM_REG_SIZE() as a symbolic way of getting SVE register slice
>sizes.
> ---
>  arch/arm64/include/uapi/asm/kvm.h |  10 +++
>  arch/arm64/kvm/guest.c| 147 
> ++
>  2 files changed, 145 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> b/arch/arm64/include/uapi/asm/kvm.h
> index 97c3478..1ff68fa 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -226,6 +226,16 @@ struct kvm_vcpu_events {
>KVM_REG_ARM_FW | ((r) & 0x))
>  #define KVM_REG_ARM_PSCI_VERSION KVM_REG_ARM_FW_REG(0)
>
> +/* SVE registers */
> +#define KVM_REG_ARM64_SVE(0x15 << KVM_REG_ARM_COPROC_SHIFT)
> +#define KVM_REG_ARM64_SVE_ZREG(n, i) (KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +  KVM_REG_SIZE_U2048 |   \
> +  ((n) << 5) | (i))
> +#define KVM_REG_ARM64_SVE_PREG(n, i) (KVM_REG_ARM64 | KVM_REG_ARM64_SVE | \
> +  KVM_REG_SIZE_U256 |\
> +  ((n) << 5) | (i) | 0x400)

What's the 0x400 for? Aren't PREG's already unique by being 256 bit vs
the Z regs 2048 bit size?

> +#define KVM_REG_ARM64_SVE_FFR(i) KVM_REG_ARM64_SVE_PREG(16, i)
> +
>  /* Device Control API: ARM VGIC */
>  #define KVM_DEV_ARM_VGIC_GRP_ADDR0
>  #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS   1
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index 953a5c9..320db0f 100644

>
> @@ -130,6 +154,107 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const 
> struct kvm_one_reg *reg)
>   return err;
>  }
>
> +struct kreg_region {
> + char *kptr;
> + size_t size;
> + size_t zeropad;
> +};
> +
> +#define SVE_REG_SLICE_SHIFT  0
> +#define SVE_REG_SLICE_BITS   5
> +#define SVE_REG_ID_SHIFT (SVE_REG_SLICE_SHIFT + SVE_REG_SLICE_BITS)
> +#define SVE_REG_ID_BITS  5
> +
> +#define SVE_REG_SLICE_MASK \
> + (GENMASK(SVE_REG_SLICE_BITS - 1, 0) << SVE_REG_SLICE_SHIFT)
> +#define SVE_REG_ID_MASK  \
> + (GENMASK(SVE_REG_ID_BITS - 1, 0) << SVE_REG_ID_SHIFT)
> +

I guess this all comes out in the wash once the constants are folded but
GENMASK does seem to be designed for arbitrary bit positions:

  #define SVE_REG_SLICE_MASK \
 GEN_MASK(SVE_REG_SLICE_BITS + SVE_REG_SLICE_SHIFT - 1, SVE_REG_SLICE_SHIFT)

Hmm I guess that might be even harder to follow...

> +#define SVE_NUM_SLICES (1 << SVE_REG_SLICE_BITS)
> +
> +static int sve_reg_region(struct kreg_region *b,
> +   const struct kvm_vcpu *vcpu,
> +   const struct kvm_one_reg *reg)
> +{
> + const unsigned int vl = vcpu->arch.sve_max_vl;
> + const unsigned int vq = sve_vq_from_vl(vl);
> +
> + const unsigned int reg_num =
> + (reg->id & SVE_REG_ID_MASK) >> SVE_REG_ID_SHIFT;
> + const unsigned int slice_num =
> + (reg->id & SVE_REG_SLICE_MASK) >> SVE_REG_SLICE_SHIFT;
> +
> + unsigned int slice_size, offset, limit;
> +
> + if (reg->id >= KVM_REG_ARM64_SVE_ZREG(0, 0) &&
> + reg->id <= KVM_REG_ARM64_SVE_ZREG(SVE_NUM_ZREGS - 1,
> +   SVE_NUM_SLICES - 1)) {
> + slice_size = K