At KVM Forum an interesting idea came up for avoiding bdrv_drain_all() during live migration: Mike Cui and Felipe Franciosi suggested running at queue depth 1. The idea needs more thought before it is workable, but I want to capture it here for discussion and to archive it.
bdrv_drain_all() is synchronous and can cause VM downtime if I/O requests hang. We should find a better way of quiescing I/O that is not synchronous. Until now I thought we should simply add a timeout to bdrv_drain_all() so it can at least fail (and live migration would fail) when I/O is stuck, instead of hanging the VM.

But the following approach is also interesting. During the iteration phase of live migration we could limit the queue depth so that points with no I/O requests in flight can be identified. At such a point the migration algorithm has the opportunity to move to the next phase without calling bdrv_drain_all(), since no requests are pending. Unprocessed requests are left in the virtio-blk/virtio-scsi virtqueues so that the destination QEMU can process them after migration completes.

Unfortunately, this approach makes convergence harder, because the VM might also be dirtying memory pages during the iteration phase. Now we need to reach a point where no I/O is in flight *and* dirty memory is under the threshold.

Thoughts?

Stefan