On Thu, Oct 16, 2025 at 12:23:35PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 16.10.25 11:32, Daniel P. Berrangé wrote:
> > On Thu, Oct 16, 2025 at 12:02:45AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > On 15.10.25 23:07, Peter Xu wrote:
> > > > On Wed, Oct 15, 2025 at 10:02:14PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > On 15.10.25 21:19, Peter Xu wrote:
> > > > > > On Wed, Oct 15, 2025 at 04:21:32PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > > This parameter enables backend-transfer feature: all devices
> > > > > > > which support it will migrate their backends (for example a TAP
> > > > > > > device, by passing open file descriptor to migration channel).
> > > > > > >
> > > > > > > Currently no such devices, so the new parameter is a noop.
> > > > > > >
> > > > > > > Next commit will add support for virtio-net, to migrate its
> > > > > > > TAP backend.
> > > > > > >
> > > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <[email protected]>
> > > > > > > ---
> > > > > [..]
> > > > >
> > > > > > > --- a/qapi/migration.json
> > > > > > > +++ b/qapi/migration.json
> > > > > > > @@ -951,9 +951,16 @@
> > > > > > > # is @cpr-exec. The first list element is the program's filename,
> > > > > > > # the remainder its arguments. (Since 10.2)
> > > > > > > #
> > > > > > > +# @backend-transfer: Enable backend-transfer feature for devices that
> > > > > > > +# supports it. In general that means that backend state and its
> > > > > > > +# file descriptors are passed to the destination in the migraton
> > > > > > > +# channel (which must be a UNIX socket). Individual devices
> > > > > > > +# declare the support for backend-transfer by per-device
> > > > > > > +# backend-transfer option. (Since 10.2)
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > I still prefer the name "fd-passing" or anything more explicit than
> > > > > > "backend-transfer". Maybe the current name is fine for TAP, only because
> > > > > > TAP doesn't have its own VMSD to transfer?
> > > > > >
> > > > > > Consider a device that would be a backend that supports VMSDs already to be
> > > > > > migrated, then if it starts to allow fd-passing, this name will stop being
> > > > > > suitable there, because it used to "transfer backend" already, now it's
> > > > > > just started to "fd-passing".
> > > > > >
> > > > > > Meanwhile, consider another example - what if a device is not a backend at
> > > > > > all (e.g. vfio?), has its own VMSD, then want to do fd-passing?
> > > > >
> > > > > Reasonable.
> > > > >
> > > > > But consider also the discussion with Fabiano in v5, where he argues
> > > > > against fds (reasonable too):
> > > > >
> > > > > https://lore.kernel.org/qemu-devel/[email protected]/
> > > > >
> > > > > (still, they were against my "fds" name for the parameter, which is
> > > > > really too generic, fd-passing is not)
> > > > >
> > > > > and the arguments for backend-transfer (to read similar with
> > > > > cpr-transfer)
> > > > >
> > > > > https://lore.kernel.org/qemu-devel/[email protected]/
> > > > >
> > > > > >
> > > > > > In general, I think "fd" is really a core concept of this whole thing.
> > > > >
> > > > > I think, we can call "backend" any external object, linked by the fd.
> > > > >
> > > > > Still, backend/frontend terminology is so misleading, when applied to
> > > > > complex systems (for me, at least), that I don't really like "-backend"
> > > > > word here.
> > > > >
> > > > > fd-passing is OK for me, I can resend with it, if arguments by Fabiano
> > > > > not change your mind.
> > > >
> > > > Ah, I didn't notice the name has been discussed.
> > > >
> > > > I think it means you can vote for your own preference now because we have
> > > > one vote for each. :) Let's also see whether Fabiano will come up with
> > > > something better than both.
> > > >
> > > > You mentioned explicitly the file descriptors in the qapi doc, that's what
> > > > I would strongly request for. The other thing is the unix socket check, it
> > > > looks all good below now with it, thanks. No strong feelings on the names.
> > > >
> > >
> > > After a bit more thinking, I leaning towards keeping backend-transfer. I think
> > > it's more meaningful for the user:
> > >
> > > If we call it "fd-passing", user may ask:
> > >
> > > Ok, what is it? Allow QEMU to pass some fds through migration stream, if it
> > > supports fds? Which fds? Why to pass them? Finally, why QEMU can't just check
> > > is it unix socket or not, and pass any fds it wants if it is?
> > >
> > > Logical question is, why not just drop the global capability, and check only
> > > is it unix socket or not? (OK, relying only on socket type is wrong anyway,
> > > as it may be some complex tunneling, which includes unix sockets, but still
> > > can't pass fds, but I think now about feature naming)
> > >
> > > But we really want an explicit switch for the feature. As qemu-update is
> > > not the only case of local migration. The another case is changing the
> > > backend. So for the user's choice is:
> > >
> > > 1. Remote migration: we can't reuse backends (files, sockets, host devices), as
> > > we are moving to another host. So, we don't enable "backend-transfer". We don't
> > > transfer the backend, we have to initialize new backend on another host.
> > >
> > > 2. Local migration to update QEMU, with minimal freeze-time and minimal
> > > extra actions: use "backend-transfer", exactly to keep the backends
> > > (vhost-user-server, TAP device in kernel, in-kernel vfio device state, etc)
> > > as is.
> > >
> > > 3. Local migration, but we want to reconfigure some backend, or switch
> > > to another backend. We disable "backend-transfer" for one device.
> >
> > This implies that you're changing 'backend-transfer' against the
> > device at time of each migration.
> >
> > This takes us back to the situation we've had historically where the
> > behaviour of migration depends on global properties the mgmt app has
> > set prior to the 'migrate' command being run. We've just tried to get
> > away from that model by passing everything as parameters to the
> > migrate command, so I'm loathe to see us invent a new way to have
> > global state properties changing migration behaviour.
> >
> > This 'backend-transfer' device property is not really a device property,
> > it is an indirect parameter to the 'migrate' command.

I was not seeing it like that.

I was treating the per-device parameter as a flag showing whether the
device is capable of passing over FDs, which is more like a device
attribute. Those things (once set by the machine type) should never
change; the only thing that changes is the global "backend-transfer"
boolean, which can be set in the "migrate" QMP command and should be
decided by the admin when initiating the migration.

> >
> > Ergo, if we need the ability to selectively migrate the backend state
> > of individual devices, then instead of a property on the device, we
> > should pass a list of device IDs as a parameter to the migrate
> > command in QMP.

I doubt we would really need that in practice. Likely the admin should
only need to worry about whether to set the global "backend-transfer";
the admin may not even need to know which devices, or how many, will
benefit from the feature being enabled. It just says, "we're doing
local migration via unix sockets, so any device that can should try to
reuse its backend if possible".

Thanks,

-- 
Peter Xu
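
For illustration only, the model argued for above (a fixed per-device
capability plus one global switch chosen at migration time) could be
sketched roughly as follows. This is not QEMU code: the Device class,
the supports_backend_transfer field and the devices_reusing_backend()
helper are all hypothetical names invented for this sketch, and the
unix-socket check is reduced to a plain boolean.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a device instance (not a QEMU type)."""
    name: str
    # Assumed to be a fixed attribute of the device, never changed per migration.
    supports_backend_transfer: bool


def devices_reusing_backend(devices: List[Device],
                            backend_transfer: bool,
                            channel_is_unix_socket: bool) -> List[str]:
    """Return the names of devices that would hand over their backend.

    backend_transfer is the single global switch the admin picks when
    starting the migration; channel_is_unix_socket models the requirement
    that FDs can only travel over a UNIX socket.
    """
    if not (backend_transfer and channel_is_unix_socket):
        # Remote migration, or switch left off: every backend is re-created.
        return []
    return [d.name for d in devices if d.supports_backend_transfer]


devices = [Device("virtio-net0", True), Device("ide0", False)]
print(devices_reusing_backend(devices, True, True))
print(devices_reusing_backend(devices, False, True))
```

In this sketch the admin never names individual devices: the only
per-migration input is the global boolean, and the per-device flag just
filters which devices can take part.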
