On Fri, Aug 16, 2024 at 11:06:10AM -0400, Peter Xu wrote:
> On Thu, Aug 15, 2024 at 04:55:20PM -0400, Steven Sistare wrote:
> > On 8/13/2024 3:46 PM, Peter Xu wrote:
> > > On Tue, Aug 06, 2024 at 04:56:18PM -0400, Steven Sistare wrote:
> > > > > The flipside, however, is that localhost migration via 2 separate QEMU
> > > > > processes has issues where both QEMUs want to be opening the very same
> > > > > file, and only 1 of them can ever have them open.
> > >
> > > I thought we used to have similar issue with block devices, but I assume
> > > it's solved for years (and whoever owns it will take proper file lock,
> > > IIRC, and QEMU migration should properly serialize the time window on who's
> > > going to take the file lock).
> > >
> > > Maybe this is about something else?
> >
> > I don't have an example where this fails.
> >
> > I can cause "Failed to get "write" lock" errors if two qemu instances open
> > the same block device, but the error is suppressed if you add the -incoming
> > argument, due to this code:
> >
> >   blk_attach_dev()
> >     if (runstate_check(RUN_STATE_INMIGRATE))
> >       blk->disable_perm = true;
>
> Yep, this one is pretty much expected.
> >
> > > > Indeed, and "files" includes unix domain sockets.
> >
> > More on this -- the second qemu to bind a unix domain socket for listening
> > wins, and the first qemu loses it (because second qemu unlinks and recreates
> > the socket path before binding on the assumption that it is stale).
> >
> > One must use a different name for the socket for second qemu, and clients
> > that wish to connect must be aware of the new port.
> >
> > > > Network ports also conflict.
> > > > cpr-exec avoids such problems, and is one of the advantages of the
> > > > method that I forgot to promote.
> > >
> > > I was thinking that's fine, as the host ports should be the backend of the
> > > VM ports only anyway so they don't need to be identical on both sides?
> > >
> > > IOW, my understanding is it's the guest IP/ports/... which should still be
> > > stable across migrations, where the host ports can be different as long as
> > > the host ports can forward guest port messages correctly?
> >
> > Yes, one must use a different host port number for the second qemu, and
> > clients that wish to connect must be aware of the new port.
> >
> > That is my point -- cpr-transfer requires fiddling with such things.
> > cpr-exec does not.
>
> Right, and my understanding is all these facilities are already there, so
> no new code should be needed on reconnect issues if to support cpr-transfer
> in Libvirt or similar management layers that supports migrations.
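
To make the unix domain socket clash quoted above concrete, here is a
minimal sketch in Python. It is not QEMU code: the helper names and the
"monitor.sock" path are made up for illustration, and it only assumes the
"unlink the path, then bind" behaviour that Steven describes. Two listeners
take the same path in turn, then a client connects, and the sketch reports
which listener actually received the connection.

```python
import os
import socket
import tempfile


def listen_unix(path):
    """Listen on a unix socket at path, unlinking any existing entry first."""
    try:
        os.unlink(path)  # treat whatever is already there as stale
    except FileNotFoundError:
        pass
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(1)
    return sock


def has_pending_client(listener):
    """Return True if a connection is queued on listener, without blocking."""
    listener.setblocking(False)
    try:
        conn, _ = listener.accept()
    except BlockingIOError:
        return False
    conn.close()
    return True


def demo():
    """Return (first_got_client, second_got_client) for one shared path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "monitor.sock")  # hypothetical socket name
        first = listen_unix(path)
        second = listen_unix(path)  # replaces the path the first one bound
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            return has_pending_client(first), has_pending_client(second)
        finally:
            client.close()
            second.close()
            first.close()


if __name__ == "__main__":
    print(demo())
```

The first listener keeps a valid file descriptor but no longer has a name in
the filesystem, so new clients that connect by path should only ever reach
the second listener. That is the situation a management layer would have to
work around by giving the second process a different socket name.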

Note Libvirt explicitly blocks localhost migration today because
solving all these clashing resource problems is a huge can of worms
and it can't be made invisible to the user of libvirt in any practical
way.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|