On Wed, Dec 10, 2025 at 11:52:04AM +0000, Daniel P. Berrangé wrote:
> On Wed, Dec 10, 2025 at 12:27:30PM +0100, Kevin Wolf wrote:
> > Am 09.12.2025 um 17:28 hat Peter Xu geschrieben:
> > > [This is an RFC series, as being marked out.  It is trying to collect
> > >  opinions.  It's not for merging yet]
> > > 
> > > Background
> > > ==========
> > > 
> > > It all starts with machine compat properties..
> > > 
> > > Machine compat properties are the major weapon we use currently in QEMU to
> > > define a proper guest ABI, so that whenever we migrate a VM instance from
> > > whatever QEMU version1 to another QEMU version2, as long as the machine
> > > type is the same, logically the ABI is guaranteed, and migration should
> > > succeed.  If it didn't, it's a bug.
> > > 
> > > These compat properties are only attached to qdev for now.  It almost
> > > worked.
> > > 
> > > That said, it's also not quite true - we already have non-qdev users of
> > > such, which explicitly code it up to apply the compat fields.  Please
> > > refer to the first patch commit message for details (meanwhile later
> > > patches will convert them into a generic model).
> > > 
> > > Obviously, we have demands to leverage machine compat properties even
> > > outside of qdev.  It can be a network backend, it can be an object (for
> > > example, memory backends), it can be a migration object, and more.
> > 
> > This doesn't feel obvious to me at all. A machine type defines what
> > hardware the guest sees. Guest hardware is essentially qdev.
> > 
> > I don't see any reasons why a backend should be interested in what guest
> > hardware looks like, that would seem like a bad layering violation. Many
> > backends can even exist without a guest at all, and are also used in
> > tools like qemu-storage-daemon. Having a machine type in a tool that
> > doesn't run a guest doesn't make any sense.
> 
> The sev-guest compat property for 'legacy-vm-type' is an interesting
> example.
> 
> This property ultimately controls which of two different kernel ioctls
> for KVM are used for initializing the SEV guest. It can cause guest
> VM measurement changes, but nonetheless this is not really a guest
> ABI knob, it is a host kernel compatibility knob.  You need a newer
> host kernel version if this is set to 'off'.
> 
> So by associating this legacy-vm-type type with the machine type,
> we don't affect the guest hardware, but we *do* impact the ability
> to use that machine type depending on what kernel version you have.

Yes.

Maybe I put too much emphasis on "guest ABI" in the cover letter, which can
be confusing without reading the details of the patchset (patch 1 lists
all the existing users, and patches 2-5 convert them).

Besides SEV, I can quickly go over the rest, in case it wasn't clear that
we're already using this feature, just in an open-coded way.  Maybe that'll
make it slightly easier for reviewers to grasp.

The second example is hostmem's current use of compat properties, see:

    commit fa0cb34d2210cc749b9a70db99bb41c56ad20831
    Author: Marc-André Lureau <[email protected]>
    Date:   Wed Sep 12 16:18:00 2018 +0400

    hostmem: use object id for memory region name with >= 4.0

The goal there was to keep the names of MRs stable so that migration is not
broken.  It's not strictly "guest ABI" but more like "guest ABI for
migration": even if the guest OS cannot see the names of MRs, migration can
(via ramblocks).  So it can cover more than the guest hardware.

The third example is in accel:

    commit fe174132478b4e7b0086f2305a511fd94c9aca8b
    Author: Paolo Bonzini <[email protected]>
    Date:   Wed Nov 13 15:16:44 2019 +0100

    tcg: add "-accel tcg,tb-size" and deprecate "-tb-size"

That was trying to keep some old behavior for accel cmdlines.  It's not
even a migration ABI, but a cmdline ABI.

Hence OBJECT_COMPAT might be useful whenever we want to persist some ABI.
It can be machine compat properties, or it can be something else that has
nothing to do with machine types.  The accel example uses a separate entry
in object_compat_props[] (index 0), one out of three, for the same purpose:

/*
 * Global property defaults
 * Slot 0: accelerator's global property defaults
 * Slot 1: machine's global property defaults
 * Slot 2: global properties from legacy command line option
 * Each is a GPtrArray of GlobalProperty.
 * Applied in order, later entries override earlier ones.
 */
static GPtrArray *object_compat_props[3];
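
To illustrate the "applied in order, later entries override earlier ones"
rule from the comment above, here is a minimal standalone sketch.  It is not
QEMU code: CompatProp, CompatSlot, the COMPAT_SLOT_* names and
compat_lookup() are all made up for illustration, and plain arrays stand in
for GPtrArray / GlobalProperty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a GlobalProperty (driver/property/value). */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Hypothetical stand-in for one GPtrArray slot of compat properties. */
typedef struct {
    const CompatProp *props;
    size_t count;
} CompatSlot;

/* Slot order mirrors the comment above: accel, machine, legacy cmdline. */
enum {
    COMPAT_SLOT_ACCEL,
    COMPAT_SLOT_MACHINE,
    COMPAT_SLOT_LEGACY_CMDLINE,
    COMPAT_SLOT_MAX
};

/*
 * Walk the slots in order and keep the last matching value, so that a
 * later slot overrides an earlier one.  Returns 'fallback' when no slot
 * has an entry for the given driver/property pair.
 */
const char *compat_lookup(const CompatSlot slots[COMPAT_SLOT_MAX],
                          const char *driver, const char *property,
                          const char *fallback)
{
    const char *value = fallback;

    assert(slots && driver && property);
    for (size_t i = 0; i < COMPAT_SLOT_MAX; i++) {
        for (size_t j = 0; j < slots[i].count; j++) {
            const CompatProp *p = &slots[i].props[j];

            if (strcmp(p->driver, driver) == 0 &&
                strcmp(p->property, property) == 0) {
                value = p->value;
            }
        }
    }
    return value;
}
```

With the same (driver, property) pair present in both the accel slot and
the machine slot, the lookup returns the machine slot's value, because that
slot is visited later.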

> 
> IOW, QEMU machine types version 9.1 or later are no longer runnable
> on many old kernels.
> 
> Over the years we've had a good number of occasions where we want
> defaults changes, or worse, where we auto-negotiate features, which
> depend on host kernel version.
> 
> I've suggested in the past that IMHO we're missing a concept of a
> "versioned platform", to complement the "versioned machine" concept.
> 
> That would let mgmt apps decide what platform compatibility level
> they required, independently of choosing machine types, and so avoid
> creating runnability constraints on machine types.

Yes, I agree some kind of "versioned platform" would be nice.  In practice,
I wonder if it needs to be versioned at all, or whether something like a
query-platform QMP API that returns the capabilities of the host in QEMU's
view would be enough.  Maybe versioning isn't needed here.

Taking the USO* features for virtio-net as an example, it should report
which USO* features are supported on this host.

IMHO it doesn't need to be "yet another" weapon to define guest ABI /
QEMU ABI.

Mgmt should leverage that interface to query platform information across
the whole cluster, then find the minimum common set.  It may also factor in
whatever the user specifies: for example, a user may want to explicitly
disable the USO feature on the whole cluster, and mgmt should take that
into account too.

After the minimum subset of platform features is selected, mgmt will need
to map them into device properties and apply them when booting the VM.
Then the guest ABI is guaranteed.
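
The "minimum set" step can be sketched as a bitmask intersection.  This is
only an illustration under assumptions: query-platform does not exist
today, so the PLAT_FEAT_* bits and cluster_common_features() are
hypothetical, and the mapping from the result to device properties is not
shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, as a query-platform reply might encode them. */
#define PLAT_FEAT_USO4    (UINT64_C(1) << 0)
#define PLAT_FEAT_USO6    (UINT64_C(1) << 1)
#define PLAT_FEAT_USO_TX  (UINT64_C(1) << 2)

/*
 * Return the features supported by every host in the cluster, minus
 * anything the user asked to disable cluster-wide.
 */
uint64_t cluster_common_features(const uint64_t *host_features,
                                 size_t nr_hosts, uint64_t user_disabled)
{
    uint64_t common = UINT64_MAX;

    assert(host_features && nr_hosts > 0);
    for (size_t i = 0; i < nr_hosts; i++) {
        /* A feature survives only if this host reports it too. */
        common &= host_features[i];
    }
    return common & ~user_disabled;
}
```

If one host in the cluster lacks PLAT_FEAT_USO6, that bit drops out of the
result for every VM, and a user-level "disable USO TX" request masks out
PLAT_FEAT_USO_TX on top of that.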

So I think it might still be good to stick with a single way to define the
ABI, and I hope that can be machine types.

There are also other things that may not be tied to platform capabilities,
like the current discussion in:

https://lore.kernel.org/all/[email protected]/

That is about a new capability of a network backend (in this case, TAP)
that we should always enable for new QEMUs, but disable for old QEMUs.
Such a thing won't appear in query-platform at all.  The solution currently
proposed in that series is to add per-device special QMP commands to query
and set these features.  However IMHO it's essentially the same problem
that object_compat_props() is solving.  It's just that we need it to work
outside of QDEV.

With OBJECT_COMPAT, we could QOMify TAP into an object and inherit from
OBJECT_COMPAT.

Thanks,

> 
> > So if we do introduce some mechanism to provide different defaults for
> > compatibility with older versions, it has to be separate from machine
> > types.
> > 
> > Maybe it would make most sense to address this on the QAPI level then
> > and finally fully QAPIfy the command line. Adding defaults to the QAPI
> > schema is something that has come up again and again, so maybe we could
> > introduce that and do it in a versioned way from the start.
> 
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 

-- 
Peter Xu

