Apologies for the delay. I finally got back to looking into this again
the past few days, and have some observations.
On 2026-01-20 9:22 PM, Peter Krempa wrote:
'virDomainSetVcpusFlags' is not the correct API here because it can
Right, I was just referring to that as an example, since that function
tends to be used more frequently to moderate the number of cpus online
than an individual vcpu unplug in our use case.
So while 'virDomainSetVcpu' does have the semi-synchronous behaviour we
had with the old-style APIs before we could add a new flag for the API
that just skips the call to qemuDomainWaitForDeviceRemoval and
qemuDomainRemoveVcpu and just returns success to avoid the possible
timeout if that's a problem for your application.
I've made a few modifications to libvirt to enable this behaviour - a
new pure-async implementation just sends the device_deleted message and
returns immediately after the ack from qemu.
I did a quick test with `virsh event` (using libvirt 9.x and qemu 7.2.0), and
cpu
hotunplug events do not seem to make it to virsh. I can confirm it's definitely
registered for the event because I can see the same event for other alias-based
detachment operations that I try (e.g. disk/iface) in the same invocation. The
libvirtd
log however includes the successful event that it received. Not *very* relevant
to the
discussion at the moment, just thought I'd mention it to see if I was doing
something wrong.
Ah, no you are doing things correctly here. Unfortunately we indeed
don't send out the device-deleted event on cpu unplug.
Was that intentional or merely an oversight?
Regardless, I've since fixed this; an event is now emitted (to a valid
application that is registered as a listener with libvirt) whenever a
vcpu gets hot-unplugged. Verifiable through `virsh event` invocations on
the CLI.
Specifically 'processDeviceDeletedEvent' calls
'qemuDomainRemoveVcpuAlias' which doesn't actually fire off an event.
I guess we could either define an schema for aliases for vCPUs to use
I've tweaked the alias lookup so that it also looks at newly-refreshed
vcpu definitions for the qemu-defined aliases for domain vcpus. Which
means an async hotplug through `virsh device-detach-alias <vcpu-alias>`
is now possible. This has the handy benefit of having
application-configurable timeouts for the unplug event to arrive, with
the timeouts now being the caller applications' concern.
Because the qemu-defined aliases are used instead of reworking the xml
schema to allow for individual vcpu aliases (which personally is too
cumbersome to maintain for large setups), there are two inherent advantages-
First, it avoids potential for confusion by having two aliases for the
same vcpu under the hood; i.e. the one that qemu already maintains and
another that a user would assign via the xml.
Second, it keeps xml validation simple - unless aliases are made
mandatory, it would get messy if libvirt was required to autogenerate
aliases wherever they are missing; especially if there are "holes" in
the list where the aliases are assigned, which libvirt would be expected
to fill in.
And finally, a bonus advantage - the number of files that are touched by
a patch for this change would be significantly less than if the xml
schema was tweaked to allow vcpu alias definition on the libvirt side as
well.
I'm currently working on adding tests for these changes; can share the
changeset if you'd like to see the still-unpolished changes in its
current state.