On Thu, Apr 04, 2019 at 10:18:54PM +0200, Andrew Jones wrote:
> On Fri, Mar 29, 2019 at 01:00:48PM +0000, Dave Martin wrote:
> > This patch adds a new pseudo-register KVM_REG_ARM64_SVE_VLS to
> > allow userspace to set and query the set of vector lengths visible
> > to the guest.
> >
> > In the future, multiple register slices per SVE register may be
> > visible through the ioctl interface. Once the set of slices has
> > been determined we would not be able to allow the vector length set
> > to be changed any more, in order to avoid userspace seeing
> > inconsistent sets of registers. For this reason, this patch adds
> > support for explicit finalization of the SVE configuration via the
> > KVM_ARM_VCPU_FINALIZE ioctl.
> >
> > Finalization is the proper place to allocate the SVE register state
> > storage in vcpu->arch.sve_state, so this patch adds that as
> > appropriate. The data is freed via kvm_arch_vcpu_uninit(), which
> > was previously a no-op on arm64.
> >
> > To simplify the logic for determining what vector lengths can be
> > supported, some code is added to KVM init to work this out, in the
> > kvm_arm_init_arch_resources() hook.
> >
> > The KVM_REG_ARM64_SVE_VLS pseudo-register is not exposed yet.
> > Subsequent patches will allow SVE to be turned on for guest vcpus,
> > making it visible.
> >
> > Signed-off-by: Dave Martin <[email protected]>
> > Reviewed-by: Julien Thierry <[email protected]>
> > Tested-by: zhang.lei <[email protected]>
> >
> > ---
[...]
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index 2aa80a5..086ab05 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -206,6 +206,73 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const
> > struct kvm_one_reg *reg)
> > return err;
> > }
> >
> > +#define vq_word(vq) (((vq) - SVE_VQ_MIN) / 64)
> > +#define vq_mask(vq) ((u64)1 << ((vq) - SVE_VQ_MIN) % 64)
> > +
> > +static bool vq_present(
> > + const u64 (*const vqs)[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)],
> > + unsigned int vq)
> > +{
> > + return (*vqs)[vq_word(vq)] & vq_mask(vq);
> > +}
> > +
> > +static int get_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg
> > *reg)
> > +{
> > + unsigned int max_vq, vq;
> > + u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
> > +
> > + if (WARN_ON(!sve_vl_valid(vcpu->arch.sve_max_vl)))
> > + return -EINVAL;
> > +
> > + memset(vqs, 0, sizeof(vqs));
> > +
> > + max_vq = sve_vq_from_vl(vcpu->arch.sve_max_vl);
> > + for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> > + if (sve_vq_available(vq))
> > + vqs[vq_word(vq)] |= vq_mask(vq);
> > +
> > + if (copy_to_user((void __user *)reg->addr, vqs, sizeof(vqs)))
> > + return -EFAULT;
> > +
> > + return 0;
> > +}
> > +
> > +static int set_sve_vls(struct kvm_vcpu *vcpu, const struct kvm_one_reg
> > *reg)
> > +{
> > + unsigned int max_vq, vq;
> > + u64 vqs[DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)];
>
> There are three of these long 'DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)'.
> A macro is probably warranted.
Fair point. These are a bit cumbersome. How about:
#define SVE_VLS_WORDS DIV_ROUND_UP(SVE_VQ_MAX - SVE_VQ_MIN + 1, 64)
Annoyingly, this is virtually identical to a Linux bitmap: the base type
is u64 instead of unsigned long; otherwise there's no intentional
difference.
(Aside: do you know why the KVM API insists on everything being u64?
This makes sense for non-native types (like guest registers), but not
for native things like host userspace addresses etc.)
> > +
> > + if (kvm_arm_vcpu_sve_finalized(vcpu))
> > + return -EPERM; /* too late! */
> > +
> > + if (WARN_ON(vcpu->arch.sve_state))
> > + return -EINVAL;
> > +
> > + if (copy_from_user(vqs, (const void __user *)reg->addr, sizeof(vqs)))
> > + return -EFAULT;
> > +
> > + max_vq = 0;
> > + for (vq = SVE_VQ_MIN; vq <= SVE_VQ_MAX; ++vq)
> > + if (vq_present(&vqs, vq))
> > + max_vq = vq;
> > +
> > + if (max_vq > sve_vq_from_vl(kvm_sve_max_vl))
> > + return -EINVAL;
> > +
> > + for (vq = SVE_VQ_MIN; vq <= max_vq; ++vq)
> > + if (vq_present(&vqs, vq) != sve_vq_available(vq))
> > + return -EINVAL;
>
> What about supporting the optional non-power-of-2 vector lengths? For
> example, if the maximum VL is 512, then in addition to 512, 128, and
> 256 there could be a 384. If we plan to start a guest on a machine
> that has all four and then migrate it to a machine that only has
> 128,256,512 we would want to set the VL set to 128,256,512, even on
> the machine that supports 384. If I understand the above test correctly,
> then that wouldn't work.
This is exactly what the code is trying to forbid, though it may not be
obvious here why.
The trouble is, we can't correctly emulate a vcpu supporting
{128,256,512} on hardware that also supports 384-bit vectors: the
architecture says that the vector length you get is determined by
rounding ZCR_EL1.LEN down to the next vector length actually
supported.
So, the guest would expect writing ZCR_EL1.LEN = 2 to give a vector
length of 256 bits, whereas on this hardware the actual resulting
vector length would be 384 bits.
sve_probe_vqs() relies on this property to detect the set of available
vector lengths. Simple, bare-metal guests might also just set
ZCR_ELx.LEN = 0x1ff to just get the max available.
The architecture says categorically that the set of vector lengths
supported is a fixed property of the (emulated) hardware -- we can't
having this changing under the guest's feet.
Fixing this would require an archietctural way to filter out individual
vector lengths from the supported set, not just a way to clamp the
maximum (which is all ZCR_EL2.LEN gives us).
The general expectation is that sanely designed cluster will be
"homogeneous enough" and won't trigger this check -- it's here
just in case.
I attempt to capture this in api.txt, leaving the door open in case the
architecture gives a way to support this in future:
| KVM_REG_ARM64_SVE_VLS
|
| [...]
|
| Apart from simply removing all vector lengths from the host set
| that exceed some value, support for arbitrarily chosen sets of
| vector lengths is hardware-dependent and may not be available.
| Attempting to configure an invalid set of vector lengths via
| KVM_SET_ONE_REG will fail with EINVAL.
Is this enough, or do you think more explanation is needed somewhere?
[...]
Chees
---DavE
_______________________________________________
kvmarm mailing list
[email protected]
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm