On Tue, 2024-03-12 at 16:12 +0000, Daniel P. Berrangé wrote:
> On Tue, Mar 12, 2024 at 03:45:20PM +0000, Roy Hopkins wrote:
> > On Fri, 2024-03-01 at 17:01 +0000, Daniel P. Berrangé wrote:
> > > On Tue, Feb 27, 2024 at 02:50:13PM +0000, Roy Hopkins wrote:
> > > > +            /*
> > > > +             * Ideally we would provide the VMSA directly to kvm which would
> > > > +             * ensure that the resulting initial VMSA measurement which is
> > > > +             * calculated during KVM_SEV_LAUNCH_UPDATE_VMSA is calculated from
> > > > +             * exactly what we provide here. Currently this is not possible so
> > > > +             * we need to copy the parts of the VMSA structure that we currently
> > > > +             * support into the CPU state.
> > > > +             */
> > > 
> > > > This sounds like it is saying that the code is not honouring
> > > > everything in the VMSA defined by the IGVM file?
> > > 
> > > If so, that is pretty awkward. The VMSA is effectively an external
> > > ABI between QEMU and the guest owner (or whatever is validating
> > > guest attestation reports for them), and thus predictability and
> > > stability of this over time is critical.
> > > 
> > > We don't want the attestation process to be dependent/variable on
> > > the particular QEMU/KVM version, because any upgrade to QEMU/KVM
> > > could then alter the effective VMSA that the guest owner sees.
> > > 
> > > We've already suffered pain in this respect not long ago when the
> > > kernel arbitrarily changed a default setting which altered the
> > > VMSA it exposed, breaking existing apps that validate attestation.
> > > 
> > > > What will it take to provide the full VMSA to KVM, so that we can
> > > > guarantee to the guest owner that the VMSA for the guest is going
> > > > to perfectly match what their IGVM defined?
> > > 
> > 
> > Yes, the fact that we have to copy the individual fields from the VMSA to
> > "CPUX86State" is less than ideal - a problem made worse by the fact that the
> > kernel does not allow direct control over some of the fields from userspace,
> > "sev_features" being a good example here where "SVM_SEV_FEAT_DEBUG_SWAP" is
> > unconditionally added by the kernel.
> 
> Ah yes, the SVM_SEV_FEAT_DEBUG_SWAP feature is the one I couldn't remember
> the name of in my quoted text above, that broke our apps when the kernel
> suddenly set it by default (thankfully now reverted in Linux with
> 5abf6dceb066f2b02b225fd561440c98a8062681).
> 
> > The kernel VMSA is at least predictable. So, although we cannot yet allow
> > full flexibility in providing a complete VMSA from QEMU and guarantee it
> > will be honoured, we could check to see if any settings conflict with those
> > imposed by the kernel and exit with an error if this is the case. I chose
> > not to implement this for this first series but could easily add a patch to
> > support it. The problem here is that it ties the version of QEMU to VMSA
> > handling functionality in the kernel. Any change to the VMSA handling in the
> > kernel would potentially invalidate the checks in QEMU. The one upside here
> > is that this will easily be detectable by the attestation measurement not
> > matching the expected measurement of the IGVM file. But it will be difficult
> > for the user to determine what the discrepancy is.
> 
> Yes, the difficulty in diagnosis is the big thing I'm worried about from
> a distro supportability POV. The DEBUG_SWAP issue caused us a bunch of
> pain and that's before CVMs are even widely used.
> 
> I agree that hardcoding checks in QEMU is pretty unpleasant, and probably
> not something that I'd want us to do. I'd want QEMU to be able to live
> query the kernel's default initial VMSA, if it were to be reporting
> differences vs the IGVM provided VMSA. I don't think there's a way to
> do that nicely though - I only know of ftrace probes to dump it informally.
> 
> I guess if we know & document what subset of the VMSA QEMU /can/ directly
> control, that at least narrows down where to look if something does change
> or go wrong.
> 
Yes, it makes sense to document the subset that can be reliably set by QEMU,
along with any modifications made by the kernel. Perhaps I should go one step
further and check that the VMSA does not contain any entries beyond what is
copied in "sev_apply_cpu_context()"? If any field other than those explicitly
copied by this function contains a non-zero value then an error would be
generated. As you suggest, this will limit the scope of any measurement
differences to the documented subset.
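
To make the proposed check concrete, here is a rough sketch rather than the
actual patch. It treats the VMSA as a raw page and rejects any non-zero byte
that falls outside a table of supported ranges. The names "struct vmsa_range"
and "vmsa_only_supported_fields_set()" are hypothetical, and the real table
would have to be derived from the fields that "sev_apply_cpu_context()" copies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one byte range of the VMSA that is copied into the CPU state. */
struct vmsa_range {
    size_t offset;
    size_t len;
};

/*
 * Return true if every non-zero byte of the VMSA lies inside one of the
 * supported ranges. On failure, store the offset of the first offending
 * byte in *bad_offset (if non-NULL) so the error message can name it.
 */
bool vmsa_only_supported_fields_set(const uint8_t *vmsa, size_t vmsa_len,
                                    const struct vmsa_range *supported,
                                    size_t n_supported, size_t *bad_offset)
{
    for (size_t i = 0; i < vmsa_len; i++) {
        bool covered = false;

        if (vmsa[i] == 0) {
            continue; /* zero bytes are always acceptable */
        }
        for (size_t r = 0; r < n_supported; r++) {
            if (i >= supported[r].offset &&
                i - supported[r].offset < supported[r].len) {
                covered = true;
                break;
            }
        }
        if (!covered) {
            if (bad_offset) {
                *bad_offset = i;
            }
            return false;
        }
    }
    return true;
}
```

Reporting the first offending offset would at least tell the user which part
of the IGVM-provided VMSA cannot be honoured, instead of leaving them with only
a measurement mismatch.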

> > The ideal solution is to add or modify a KVM ioctl to allow the VMSA to be
> > set directly, overriding the state in "CPUX86State". The current
> > KVM_SEV_LAUNCH_UPDATE_VMSA ioctl triggers the synchronisation of the VMSA
> > but does not allow it to be specified directly. This could be modified for
> > what we need. The SEV-SNP kernel patches add KVM_SEV_SNP_LAUNCH_UPDATE which
> > allows a page type of VMSA to be updated, although the current patch series
> > does not support using this to set the initial state of the VMSA:
> > https://lore.kernel.org/lkml/20231230172351.574091-19-michael.r...@amd.com/
> > I have experimented with this myself and have successfully modified the
> > SEV-SNP kernel patches to support directly setting the VMSA from QEMU.
> > 
> > On the other hand, I have also verified that I can indeed measure an IGVM
> > file loaded using the VMSA synchronisation method currently employed and get
> > a matching measurement from the SEV attestation report.
> > 
> > What would you suggest is the best way forward for this?
> 
> I'll delegate to Paolo for an opinion on the possibility of new (or
> updated) ioctls to provide the full VMSA data.
> 
> If we can't directly set the full VMSA, then next best option is a
> more formal way to query the VMSA. That way libvirt could report on
> what the default initial kernel VMSA state is, which could be useful
> debug info for any bug reports.
Setting the full VMSA definitely seems like the right option here. Querying the
VMSA that was actually measured would obviously give us the ability to diagnose
problems with the measurement but does not allow full compatibility with the
IGVM specification. This will potentially restrict the types of guests that can
be packaged using IGVM.

Another thing to bear in mind is that with the incoming host kernel support for
SEV-SNP, there are more constraints on how the VMSA is measured and populated.
In particular, the current patches for SEV-SNP automatically sync and measure
the VMSA as the final stage of guest measurement, requiring the IGVM file to
provide the VMSA as the final directive for the measurement to match. Also, the
kernel hardcodes the VMSA GPA, again requiring the IGVM file to match. If we
have the ability to provide the VMSA directly (including the GPA of the VMSA)
then these restrictions are removed.

I'd suggest that for SEV and SEV-ES, the current method of syncing certain
fields, together with updating the QEMU documentation to describe this, is
sufficient for now. Perhaps this is ok for SEV-SNP too, but we should pursue
the ability to provide the full VMSA at least in the SEV-SNP case.

> 
> With regards,
> Daniel

Kind regards,
Roy
