Daniel P. Berrangé <berra...@redhat.com> writes:

> On Thu, May 08, 2025 at 02:09:37PM -0700, Pierrick Bouvier wrote:
>> On 5/8/25 6:58 AM, Daniel P. Berrangé wrote:
>> > Pierrick has proposed a series that introduces a concept of runtime
>> > conditionals to the QAPI schema, in order to adapt the TARGET_*
>> > conditionals currently used at build time:
>> > 
>> >    https://lists.nongnu.org/archive/html/qemu-devel/2025-05/msg01699.html
>> > 
>> > For the sake of comparison & evaluation, this series illustrates the
>> > alternative approach that we've discussed of entirely removing any
>> > concept of TARGET_* conditionals.
>> > 
>> > With this the QAPI schema varies solely based on CONFIG_* conditionals,
>> > and is thus invariant across different target emulators.
>> > 
>> > In this PoC I've taken the minimal effort approach to the problem.
>> > 
>> > The TARGET_* conditionals have been removed from the QAPI schema, and
>> > in order to make all the emulators compile, the stubs/ directory is
>> > populated with a bunch of files providing dummy impls of the target
>> > specific QMP commands.
>> > 
>> > This is sufficient to make the current QEMU binaries build successfully.
>> > 
>> > To make the "single binary" concept work, however, would require
>> > additional followup work to eliminate the stubs.
>> > 
>> > Instead of having stubs we would need to de-couple the QMP command
>> > impl from the machine internals. This would likely require greater
>> > use of interfaces and/or virtual method dispatchers on the machine
>> > class. This would enable the 'qmp_XXXXX' command impls to exist
>> > once. Then they call out to virtual methods on the machine to provide
>> > the real impl, and/or generate an error if the virtual method is not
>> > implemented for the machine.
>> > 
>> 
>> Thanks for posting it Daniel.
>> 
>> I think your approach is pretty neat, and yes, it's much simpler than
>> having any compile-time or runtime conditionals to deal with.
>> 
>> When we talked about that on the previous thread, I thought the idea was
>> to expose *all* the commands to *all* the targets, which I didn't really
>> understand, considering we have target specific commands by design.
>> I understand better where you wanted to go, now that the commands
>> concerned are extracted into dedicated files.
>> 
>> The only downside I can see is that some commands have to be there, but
>> return an "error, not implemented" at runtime. Fine for me, but some people
>> may argue against it.
>> 
>> A concern I might have as well is how we'll deal with wanting to hide
>> some commands in the future, based on various criteria
>> (is_heterogeneous()?). The mantra "define all, and let the build system
>> hide things" means you can only have a single definition existing in the
>> binary, by design. But maybe it's not even a real concern, and I
>> definitely prefer to see problems before fixing them, so it's definitely
>> not blocking this approach.
>
> I think we have to distinguish between what made sense in the context
> of our original design, from what makes sense for our future design(s)
> and plans.

No argument.

The original design idea is simple: #ifdef CONFIG_FOO for QAPI schema,
to not generate dead code, and to not have silly stubs.  Even if you
don't care about wasting disk and address space, you probably care about
wasting developer time on writing silly stubs and waiting for dead code
to compile.
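
For instance, a configuration conditional in the schema looks roughly
like this (quoting from memory, details may be off):

    { 'command': 'query-vnc', 'returns': 'VncInfo',
      'if': 'CONFIG_VNC' }

In a build without CONFIG_VNC, the generated marshaller and the
declaration of qmp_query_vnc() simply don't exist, so there's no dead
code and no stub to write.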

Initially, target-specific macros did not work in conditions.  That was
easy enough to fix, so we did.

However, target-specific conditions turned out less useful than
anticipated.  Adding a target-specific conditional anywhere in a
definition makes the entire definition and its users target-specific.
This can make handwritten code artificially target-specific, which is
undesirable.
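
A made-up illustration (the names are hypothetical):

    { 'command': 'query-frobnicate', 'returns': 'FrobnicateInfo',
      'if': 'TARGET_ARM' }

The generated code for query-frobnicate is wrapped in #if
defined(TARGET_ARM), so the handwritten qmp_query_frobnicate() and
anything calling it must be compiled per target, even when nothing in
that code is actually target-specific.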

Target-specific conditions now also complicate the single binary work.

> I can understand why we took the direction we did historically, but I
> think we have unwittingly created problems for ourselves that are
> looking increasingly worse than the problems we considered solved.
>
>
> In the other thread I pointed out my dislike for QAPI schema not being
> fully self-describing when we have conditionals present, even today,
> but there are other aspects that trouble me more wrt conditionals.

I'm not sure I have all this present in my mind...  Can you summarize
what troubles you?  Or point me to the message(s)?

> Since the schema is currently target specific, a mgmt app has to probe
> for QEMU capabilities once for every target binary. QEMU has ~30 system
> binaries, and if querying capabilities takes 250 milliseconds, then
> querying all binaries is ~ 7 seconds of work. Libvirt mitigates this
> overhead by caching the results of capabilities probes. We periodically
> suffer delays when we must invalidate our cache though, typically on
> software upgrades, and this is unpleasant for users who think we've
> got stuck or just have bad slow code.

How does cache invalidation work?  Timestamp of binary?

> Even if we had a QAPI schema that didn't vary per target, this
> repeated probing is tricky to avoid when we have completely independent
> binaries. We would need QEMU to have some internal "build id", so that
> we could detect that all binaries came from the same build, to let us
> avoid re-probing each binary.

Back when I created QAPI/QMP introspection, I floated the idea to put
something into the QMP greeting to enable safe caching.  Libvirt
developers told me they didn't need that.  I don't remember the details,
but I guess the cache invalidation they already had was deemed good
enough.

> With your work to have a single binary that can impersonate any of our
> target specific binaries, *if* we had a QAPI schema that didn't vary
> per target, we would stand a real chance of being able to eliminate
> the need to probe capabilities once for every target.

This is actually independent of "single binary".

query-qmp-schema's value will be the same for all qemu-system-FOO built
from same QAPI schema source and configuration.

If there are no target-dependent conditionals, schema configuration is
the same regardless of target.

Of course, capability probing can go beyond query-qmp-schema, but you
get the idea.

> eg if we assume qemu-system-x86_64 is a symlink to "/usr/bin/qemu-system"
> (or whatever you call the common binary), then we only need to probe one
> binary - the target of the symlink - we can easily identify that all
> the legacy binary symlinks resolve to the same real binary with no special
> effort.

Have a message digest of query-qmp-schema's value in the QMP greeting.
Management application uses it as key to cache probe results.
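
Something like this, where the field name and digest are made up:

    { "QMP": { "version": { "qemu": { "major": 10, "minor": 0,
                                      "micro": 0 },
                            "package": "" },
               "capabilities": [ "oob" ],
               "schema-digest": "sha256:..." } }

A management application that sees a digest it has probed before can
reuse its cached results without running query-qmp-schema at all.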

> Conversely if we want the QAPI schema to be varying at runtime per target,
> it means we have to select a target before we can probe the schema in the
> binary. If we assume an install with only the new binary and none of the
> legacy target specific symlinks, we have a chicken & egg problem though.
>
> How do we know what targets are supported ? We should be able to use QAPI
> to query the available targets, but before we can use QAPI we need to
> select a target ?
>
> We could make QAPI only expose the pieces of schema that are *not*
> specific to a target initially, and then refine the schema once a
> target is selected. That would solve the immediate problem, but I
> think that kind of cleverness is just postponing the pain.
>
> Consider that we eventually want a QEMU binary that can run CPUs from
> multiple targets concurrently.  That implies that a dynamic QAPI schema
> needs to turn on the target specific pieces of multiple targets at once.
> At that point a mgmt app fundamentally has to understand ahead of time
> that some features are target specific - the existence/absence of
> features in the QAPI schema can no longer express what is target
> specific, since we must expose the union of features of all targets
> that are in use.
>
> So as we look to the future of QEMU, IMHO, the concept of using the QAPI
> schema definitions to dynamically express what is available per-target
> is unhelpful and likely even to be harmful.
>
> Assuming a schema that is fixed at build time ought to simplify QEMU
> maintainers' life, and while mgmt apps lose a little info, they will
> gain in other ways such as not having to query QEMU capabilities
> 30 times over.

Schema fixed at build time is a fundamental design choice we made for
QAPI.

It limits the questions schema introspection can answer.  Some of the
questions it cannot answer are perfectly valid, but that doesn't mean we
have to answer them via schema introspection.  We've been tempted to
answer them via schema introspection because schema introspection has
been such a success.

We may need to think about replacements better than "add yet another ad
hoc query command" and "QOM tree spelunking".  Not a blocker.

>> Overall, if I had to choose, I think I would pick your approach instead of
>> the "runtime conditional" route; it's just simpler and saner overall.
>> I would be happy to see you take ownership of that, as long as we can
>> deliver something in the next few weeks. I can help with anything as well.
>
> I'm fine with you driving this forward, not least because I can't commit
> to having any predictable amount of time to write code myself for this.
>
>> @Markus, @Michael, which approach would you prefer to be followed for QAPI?
>
> With regards,
> Daniel

