Hello Rogerio, did you find any way to run your platform code at-or-close to switchover time before starting the new QEMU process?
Anything you could share?

Thanks,

Claudio

On 10/20/25 10:52, Rogério Vinhal Nunes via Devel wrote:
>
>> On 20 Oct 2025, at 09:36, Peter Krempa <[email protected]> wrote:
>>
>>
>
>> On Fri, Oct 17, 2025 at 13:47:54 +0000, Rogério Vinhal Nunes wrote:
>>>> On 17 Oct 2025, at 12:59, Peter Krempa <[email protected]> wrote:
>>>>>> On Thu, Oct 16, 2025 at 18:47:36 +0000, Rogério Vinhal Nunes wrote:
>>>
>>> [...]
>>>
>>>>> IIRC 'migrate' and 'prepare' happen when setting up the migration.
>>>> 'start' happens right before starting the qemu process. At this
>>>> point the migration will progress. 'started' happens after the
>>>> migration was complete.
>>>>
>>>> So this provides means to setup and tear down resources.
>>>>
>>>> The open question still is whether you need that to happen precisely
>>>> at switchover time. Thus the request for what you want to actually
>>>> do.
>>>>
>>>> A hook at switchover time is obviously possible but until now we
>>>> didn't yet get a good reason to have one. A reason against is that
>>>> it introduces latency for the switchover.
>>>
>>> I need to synchronise our internal storage solution at the switchover
>>> time because I need to unmap the resource from source and map on the
>>> destination. It would be a behaviour similar to what happens with NBD,
>>> but it's backed by our internal backend.
>>
>> So with NBD/NFS and others it works in a way where both the source and
>> destination open the storage itself. QEMU internally ensures that it
>> hands over the state cleanly and doesn't write from the source after
>> handover.
>>
>> Can't your storage do that? That way you could do the setup before
>> migration on the destination and tear-down after migration on source,
>> thus eliminating the extra unbounded latency at switchover?
> The problem is that the way it's currently designed it relies on cached
> writes that can be propagated after the domain starts on the destination, so
> we need the hook to, at least, flush the source before the destination
> becomes rw.
>>
>>>
>>>>
>>>>
>>>>> An alternative could be to have an option to wait for a resume
>>>>> operation to progress as a client-defined migration flag exposing
>>>>> the pre-switchover state. This way maybe we could work it as a
>>>>> client feature rather than a hook?
>>>>
>>>> Once again specifying what you actually want to do would be helpful.
>>>>
>>>> E.g. I can suggest that you can migrate the VM as paused, which
>>>> ensures that once the migration completes it will not continue
>>>> execution on the destination, which could give you the chance for
>>>> additional synchronisation.
>>>
>>> For us it's important to have the least amount of interruption as
>>> possible, so we're very keen on a live migration here.
>>
>> That's the reason I think a synchronous hook, which will block the
>> migration from switching over while the hook is executing, is not a
>> great idea.
> The hook is supposed to take order of ms whilst the migration of the memory
> is supposed to take many seconds. I believe that pausing the domain will be
> worse in manners of interruption. WRT to migrations that don't rely on it, we
> could add a migration flag that enables this.
>>
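
For reference, the existing hook points Peter lists in the quoted text can be
sketched as a small dispatcher for /etc/libvirt/hooks/qemu. This is only a
sketch under assumptions: libvirt is assumed to call the script with the guest
name, operation, sub-operation and an extra argument, and the action labels
(filter-incoming-xml, map-storage, and so on) are hypothetical placeholders,
not real platform code. None of these points fires at switchover time, which
is the gap this thread is about.

```python
#!/usr/bin/env python3
"""Sketch of a libvirt qemu hook dispatcher (assumed path: /etc/libvirt/hooks/qemu)."""
import sys

# Hypothetical action labels, one per hook operation mentioned in the thread.
# The labels are placeholders; real setup/tear-down code would go behind them.
ACTIONS = {
    ("migrate", "begin"): "filter-incoming-xml",  # incoming migration set-up
    ("prepare", "begin"): "map-storage",          # before resources are labelled
    ("start", "begin"): "pre-start-check",        # right before QEMU is started
    ("started", "begin"): "post-start",           # after QEMU is up
}


def plan_action(operation, sub_operation):
    """Return the placeholder action for a hook call, or 'ignore' if unknown."""
    return ACTIONS.get((operation, sub_operation), "ignore")


def main(argv):
    # Expected call shape: <script> guest_name operation sub_operation extra
    if len(argv) < 4:
        return 0  # not a recognisable hook invocation; do nothing
    guest, operation, sub_operation = argv[1], argv[2], argv[3]
    action = plan_action(operation, sub_operation)
    # Log to stderr only: for 'migrate' the hook's stdout is read back as
    # domain XML, and empty stdout means "use the XML unchanged".
    print(f"{guest}: {operation}/{sub_operation} -> {action}", file=sys.stderr)
    return 0  # a non-zero exit would make libvirt fail the operation


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Keeping the mapping in a table makes it easy to see that set-up can go under
'prepare' and checks under 'start', but there is no entry that corresponds to
the switchover itself.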
