On Fri, Sep 19, 2025 at 13:00:34 +0100, Daniel P. Berrangé wrote:
> On Fri, Sep 19, 2025 at 01:59:11PM +0200, Jiří Denemark wrote:
> > On Thu, Sep 18, 2025 at 11:10:49 -0400, Peter Xu wrote:
> > > On Thu, Sep 18, 2025 at 03:45:21PM +0100, Daniel P. Berrangé wrote:
> > > > There needs to be a way to initiate post-copy recovery regardless
> > > > of whether we've hit a keepalive timeout. Especially if we can
> > > > see one QEMU in postcopy-paused, but not the other side, it
> > > > doesn't appear to make sense to block the recovery process.
> > > >
> > > > The virDomainJobCancel command can do a migrate-cancel on the
> > > > src, but it didn't look like we could do the same on the dst.
> > > > Unless I've overlooked something, Libvirt needs to gain a way
> > > > to explicitly force both sides into the postcopy-paused state,
> > > > and thus be able to immediately initiate recovery.
> > >
> > > Right, if libvirt can do that then problem should have been solved too.
> >
> > I think we should be able to use the yank command to tell QEMU to close
> > migration connections. I haven't tried it on the destination, but I
> > guess it should work similarly to the source where it causes the
> > migration to switch to postcopy-paused. It seems to be an equivalent of
> > migrate-pause. So can we safely use yank in such situations?
>
> Can't we use migrate-pause on the target too ? IIUC that was what Peter
> was suggesting earlier in the thread, unless I mis-interpreted ?

Ah ok, I missed that. Somehow I interpreted "Libvirt needs to gain a way
to explicitly force both sides into the postcopy-paused" as "QEMU needs
to allow us to do that" :-)

Jirka