On Mon, Aug 29, 2016 at 11:06:48AM -0400, Stefan Hajnoczi wrote:
> At KVM Forum an interesting idea was proposed to avoid
> bdrv_drain_all() during live migration. Mike Cui and Felipe Franciosi
> mentioned running at queue depth 1. It needs more thought to make it
> workable but I want to capture it here for discussion and to archive
> it.
>
> bdrv_drain_all() is synchronous and can cause VM downtime if I/O
> requests hang. We should find a better way of quiescing I/O that is
> not synchronous. Up until now I thought we should simply add a
> timeout to bdrv_drain_all() so it can at least fail (and live
> migration would fail) if I/O is stuck instead of hanging the VM. But
> the following approach is also interesting...

How would you decide what an acceptable timeout is for the drain
operation? At what point does a stuck drain op cause the VM to stall?
The drain call happens from the migration thread, so it shouldn't
impact vcpu threads or the main event loop thread if it takes too long.

> During the iteration phase of live migration we could limit the queue
> depth so points with no I/O requests in-flight are identified. At
> these points the migration algorithm has the opportunity to move to
> the next phase without requiring bdrv_drain_all() since no requests
> are pending.
>
> Unprocessed requests are left in the virtio-blk/virtio-scsi virtqueues
> so that the destination QEMU can process them after migration
> completes.
>
> Unfortunately this approach makes convergence harder because the VM
> might also be dirtying memory pages during the iteration phase. Now
> we need to reach a spot where no I/O is in-flight *and* dirty memory
> is under the threshold.

It doesn't seem like this could easily fit in with post-copy. During
the switchover from pre-copy to post-copy, migration calls
vm_stop_force_state, which will trigger bdrv_drain_all().

The point at which you switch from pre-copy to post-copy mode is not
controlled by QEMU; instead it is an explicit admin action triggered
via a QMP command. The actual switchover is not synchronous with
completion of the QMP command, so there is a small scope for delaying
it to a convenient time, but not by a very significant amount, and
certainly not anywhere near 30 seconds. Perhaps 1 second at the most.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|