On 10.12.25 19:17, Peter Xu wrote:
On Wed, Dec 10, 2025 at 11:52:04AM +0000, Daniel P. Berrangé wrote:
On Wed, Dec 10, 2025 at 12:27:30PM +0100, Kevin Wolf wrote:
On 09.12.2025 at 17:28, Peter Xu wrote:
[This is an RFC series, as being marked out.  It is trying to collect
  opinions.  It's not for merging yet]

Background
==========

It all starts with machine compat properties..

Machine compat properties are the major weapon we use currently in QEMU to
define a proper guest ABI, so that whenever we migrate a VM instance from
whatever QEMU version1 to another QEMU version2, as long as the machine
type is the same, logically the ABI is guaranteed, and migration should
succeed.  If it didn't, it's a bug.

These compat properties are only attached to qdev for now.  It almost
worked.

That said, it's also not true - we already have non-qdev users of such, by
explicitly coding it up to apply the compat fields.  Please refer to the
first patch commit message for details (meanwhile latter patches will
convert them into a generic model).

Obviously, we have demands to leverage machine compat properties even
outside of qdev.  It can be a network backend, it can be an object (for
example, memory backends), it can be a migration object, and more.

This doesn't feel obvious to me at all. A machine type defines what
hardware the guest sees. Guest hardware is essentially qdev.

I don't see any reasons why a backend should be interested in what guest
hardware looks like, that would seem like a bad layering violation. Many
backends can even exist without a guest at all, and are also used in
tools like qemu-storage-daemon. Having a machine type in a tool that
doesn't run a guest doesn't make any sense.

The sev-guest compat property for 'legacy-vm-type' is an interesting
example.

This property ultimately controls which of two different kernel ioctls
for KVM are used for initializing the SEV guest. It can cause guest
VM measurement changes, but nonetheless this is not really a guest
ABI knob, it is a host kernel compatibility knob.  You need a newer
host kernel version if this is set to 'off'.

So by associating this legacy-vm-type type with the machine type,
we don't affect the guest hardware, but we *do* impact the ability
to use that machine type depending on what kernel version you have.

Yes.

Maybe I emphasized too much on "guest ABI" in the cover letter, so it can
be confusing when not reading into the details of the patchset (I did
mention all the existing users in patch 1, then converted all existing
users in patch 2-5).

Besides SEV, I can also quickly go over the remaining ones, in case it wasn't
clear that we're already using this feature, in an open-coded way.  Maybe
that'll make it slightly easier to grasp for reviewers.

The current use case for hostmem (2nd example) on compat properties, see:

     commit fa0cb34d2210cc749b9a70db99bb41c56ad20831
     Author: Marc-André Lureau <[email protected]>
     Date:   Wed Sep 12 16:18:00 2018 +0400

     hostmem: use object id for memory region name with >= 4.0

So the goal was to persist the names of MRs so that migration is not
broken.  It's not strictly "guest ABI" but more like "guest ABI for
migration": even if the guest OS cannot see the names of MRs, migration
can see them (via ramblocks).  So it can cover more than the guest HW.

3rd example in accel:

     commit fe174132478b4e7b0086f2305a511fd94c9aca8b
     Author: Paolo Bonzini <[email protected]>
     Date:   Wed Nov 13 15:16:44 2019 +0100

     tcg: add "-accel tcg,tb-size" and deprecate "-tb-size"

That was trying to keep some old behavior for accel cmdlines.  It's not
even a migration ABI, but cmdline ABI.

Hence OBJECT_COMPAT might be useful whenever we want to persist some ABI.
It can be machine compat properties, it can be something else that has
nothing to do with machine types.  The accel example used a separate entry
in object_compat_props[] (index 0) for the same purpose, out of three:

/*
  * Global property defaults
  * Slot 0: accelerator's global property defaults
  * Slot 1: machine's global property defaults
  * Slot 2: global properties from legacy command line option
  * Each is a GPtrArray of GlobalProperty.
  * Applied in order, later entries override earlier ones.
  */
static GPtrArray *object_compat_props[3];
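
To make the "applied in order, later entries override earlier ones" rule
concrete, here is a minimal self-contained sketch.  It is not QEMU code:
CompatProp, CompatSlot and resolve_prop() are hypothetical stand-ins for
GlobalProperty, the GPtrArray slots and the real property application, and
the property values are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's GlobalProperty (not the real struct). */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Hypothetical stand-in for one GPtrArray slot of compat properties. */
typedef struct {
    const CompatProp *props;
    size_t n;
} CompatSlot;

/* Same ordering as the comment above: accel, machine, legacy cmdline. */
enum { SLOT_ACCEL, SLOT_MACHINE, SLOT_CMDLINE, SLOT_COUNT };

/*
 * Walk the slots in order and return the last matching value, so that a
 * later slot overrides an earlier one.  Falls back to builtin_default
 * when no slot mentions the property.
 */
static const char *resolve_prop(const CompatSlot slots[SLOT_COUNT],
                                const char *driver, const char *property,
                                const char *builtin_default)
{
    const char *value = builtin_default;

    for (int s = 0; s < SLOT_COUNT; s++) {
        for (size_t i = 0; i < slots[s].n; i++) {
            const CompatProp *p = &slots[s].props[i];

            if (strcmp(p->driver, driver) == 0 &&
                strcmp(p->property, property) == 0) {
                value = p->value;
            }
        }
    }
    return value;
}
```

With a machine-slot entry and a cmdline-slot entry for the same property,
the cmdline value wins; with only the machine-slot entry, the machine value
replaces the built-in default.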


IOW, QEMU machine types version 9.1 or later are no longer runnable
on many old kernels.

Over the years we've had a good number of occasions where we want
to change defaults, or worse, where we auto-negotiate features, which
depend on host kernel version.

I've suggested in the past that IMHO we're missing a concept of a
"versioned platform", to complement the "versioned machine" concept.

That would let mgmt apps decide what platform compatibility level
they required, independently of choosing machine types, and so avoid
creating runnability constraints on machine types.

Yes, I agree some kind of "versioned platform" would be nice.  In practice,
I wonder if it needs to be versioned at all, or whether something like a
query-platform QMP API that returns the capabilities of the host in
QEMU's view would be enough.  Maybe versioning isn't needed here.

Taking the USO* features for virtio-net as an example, it should report
which USO* features are supported on this host.

IMHO it doesn't need to be the "yet another" weapon to define guest ABI /
QEMU ABI.

Mgmt should leverage that interface to query and get platform information
across the whole cluster, then find a minimum set. Maybe also plus
something the user would specify, for example, a user may want to
explicitly disable the USO feature on the whole cluster, then mgmt should
also take it into account.

After the min subset of platform features is selected, mgmt will need to map
them into device properties and apply them when booting the VM.  Then guest
ABI is guaranteed.
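
The selection step described above can be sketched as a bitmask
intersection.  This is only an illustration under assumptions: query-platform
does not exist today, the FEAT_* bits are hypothetical names, and
cluster_common_features() is a made-up helper on the mgmt side, not a QEMU
or libvirt API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits a query-platform-like API might report. */
enum {
    FEAT_USO4     = 1u << 0,
    FEAT_USO6     = 1u << 1,
    FEAT_HOST_USO = 1u << 2,
};

/*
 * Intersect the capabilities reported by every host in the cluster, then
 * drop whatever the user explicitly disabled.  The result is the set of
 * features mgmt may map to device properties when booting the VM.
 */
static uint32_t cluster_common_features(const uint32_t *host_caps,
                                        size_t n_hosts,
                                        uint32_t user_disabled)
{
    if (n_hosts == 0) {
        return 0; /* no hosts, nothing can be guaranteed */
    }

    uint32_t common = host_caps[0];

    for (size_t i = 1; i < n_hosts; i++) {
        common &= host_caps[i];
    }
    return common & ~user_disabled;
}
```

Any feature missing on a single host drops out of the result, which is what
keeps the VM migratable to every host in the cluster.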

So I think it would still be good to stick with a single way to define the
ABI, and I hope we can stick with machine types.

There're also other things that may not be attached to platform
capabilities, like the current discussion in:

https://lore.kernel.org/all/[email protected]/

That is about some new capability of network backend (in this case, TAP)
that we should always enable for new QEMUs, but disable for old QEMUs.
Such a thing won't appear at all in query-platform.  The current solution
proposed in that series was adding per-device special QMP commands to query
and set these features.  However IMHO essentially it's the same problem that
object_compat_props() is solving.  It's just that we need it to work
outside of QDEV.

With OBJECT_COMPAT, we could QOMify TAP into an object and inherit from
OBJECT_COMPAT.

Thanks,


My two cents.

1. QAPI vs QOM

First, QEMU has too many object-like interfaces: QOM, QAPI structures,
vmstates... Of course, it would be better to have one generic interface.
Should it be QAPI or QOBJECT?

If I understand correctly, this series expands QOBJECT's zone of influence,
and Kevin argues that it could be QAPI instead. Right?

QOBJECT is better in that it represents existing in-QEMU state. QAPI
structures are normally used only temporarily, during QMP calls.

If everything were a QOBJECT, we probably would not have to invent all these
-replace- and -change- commands in QMP for block layer objects, but only
implement set/get for the corresponding QOBJECT properties. And we would not
have to unite the namespaces of qom paths, block export names and block node
names: they would all be qom paths.

So we could live in a world where we only need to implement "action" QMP
commands, like "migrate", "quit", "cont". But to change or query the state,
we would always use qom-set and qom-get, and not invent a new command for
each piece of state.
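
As a toy illustration of that idea (not QEMU code), the sketch below keeps
all state in one table addressed by path and property name, so a single
generic set/get pair covers every piece of state.  StateProp, state_set()
and state_get() are hypothetical names, and the paths are made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one piece of state addressed by path + name. */
typedef struct {
    const char *path;
    const char *name;
    long value;
} StateProp;

/* Find the property for (path, name), or NULL if it does not exist. */
static StateProp *state_lookup(StateProp *props, size_t n,
                               const char *path, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].path, path) == 0 &&
            strcmp(props[i].name, name) == 0) {
            return &props[i];
        }
    }
    return NULL;
}

/* Generic setter in the spirit of qom-set: returns 0, or -1 if unknown. */
static int state_set(StateProp *props, size_t n,
                     const char *path, const char *name, long value)
{
    StateProp *p = state_lookup(props, n, path, name);

    if (!p) {
        return -1;
    }
    p->value = value;
    return 0;
}

/* Generic getter in the spirit of qom-get: returns 0, or -1 if unknown. */
static int state_get(StateProp *props, size_t n,
                     const char *path, const char *name, long *out)
{
    StateProp *p = state_lookup(props, n, path, name);

    if (!p) {
        return -1;
    }
    *out = p->value;
    return 0;
}
```

Adding a new piece of state then means adding a table entry, not a new
command; only real actions would still need their own command.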

On the other hand, QAPI is a lot better in that it has an explicit JSON
schema. A QAPI definition is a protocol specification at the same time; we
don't have such a thing for QOM. Probably the best world would be a
QAPI-like interface to internal QEMU objects, which are defined by QAPI
structures. A kind of combination of the best options of the QOM and QAPI
worlds. But that's only a dream.


2. Machine types

I'd not care about them too much. A machine type is syntactic sugar, it's
simply a "set of defaults".

So, I think it's OK to share the concept wider than guest-ABI. What's wrong if
we just rename "Machine Type" into "Set Of QOM Defaults", and follow Peter's
suggestion? This way it will not conflict with tools that don't start a
guest, or don't have frontends.


--
Best regards,
Vladimir
