John and I recently discussed asynchronous savevm and I wanted to post the ideas so they aren't forgotten. (We're not actively working on this feature.)
Asynchronous savevm has the same effect as the 'savevm' monitor command: it saves RAM, device state, and a snapshot of all disks at the point in time the command was issued.

The current 'savevm' monitor command is synchronous, so the guest and QEMU monitor are blocked while the operation runs (it can take a while!). Asynchronous savevm has the advantage of allowing the guest and QEMU monitor to continue while the operation is running.

This sounds similar to live migration to file, but remember that live migration's consistency point is when the guest is paused at the end of the iteration phase. The user has no control over *when* live migration captures the guest state. Therefore it's not a useful command for taking snapshots of guest state at a specific point in time - we need asynchronous savevm for that.

Async savevm must copy-on-write guest RAM so the guest can continue writing to memory while the snapshot is being saved. Rik van Riel suggested using userfaultfd(2) to do this on Linux.

Unlike post-copy live migration, we want to track memory writes (instead of missing page faults). The userfaultfd(2) flag UFFDIO_REGISTER_MODE_WP provides these semantics. Unfortunately I think UFFDIO_REGISTER_MODE_WP is not yet implemented?

Once UFFDIO_REGISTER_MODE_WP is available, QEMU can catch writes to guest RAM and copy the original pages to a buffer. If memory is dirtied too quickly then it's necessary to throttle the guest or fail the savevm operation.

Perhaps this approach can be prototyped with mprotect and a SIGSEGV handler if anyone wants to get async savevm going. I don't know if there are any disadvantages to mprotecting guest RAM that the kvm kernel module is using. Hopefully in-kernel devices and vhost will continue to work.

Stefan
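
P.S. To make the mprotect/SIGSEGV prototype idea a bit more concrete, here is a minimal standalone sketch of copy-on-write over a plain anonymous mapping. It is not QEMU code: snapshot_begin(), snapshot_finish(), wp_fault() and the globals are hypothetical names, the "guest RAM" is just a buffer the caller passes in, and everything runs in one thread, whereas a real implementation would presumably save pages from a background thread and need locking against the fault handler.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names here are hypothetical; none of this is QEMU code. */
static uint8_t *ram;      /* stand-in for guest RAM, page-aligned */
static uint8_t *snap;     /* receives the point-in-time copy of every page */
static uint8_t *saved;    /* saved[i] is 1 once page i has been copied */
static size_t npages, page_size;
static struct sigaction old_sa;

/* Write fault: copy the original page, then let the write proceed. */
static void wp_fault(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr, base = (uintptr_t)ram;
    (void)sig; (void)ctx;

    if (addr < base || addr >= base + npages * page_size) {
        signal(SIGSEGV, SIG_DFL);   /* a genuine crash: re-fault and die */
        return;
    }
    size_t i = (addr - base) / page_size;
    if (!saved[i]) {
        memcpy(snap + i * page_size, ram + i * page_size, page_size);
        saved[i] = 1;
    }
    /* mprotect is not on the POSIX async-signal-safe list; this relies on
     * Linux behavior, which is acceptable only for a prototype. */
    mprotect(ram + i * page_size, page_size, PROT_READ | PROT_WRITE);
}

/* Start a snapshot of [mem, mem + len): write-protect every page. */
int snapshot_begin(void *mem, size_t len)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)mem % page_size == 0 && len % page_size == 0);
    ram = mem;
    npages = len / page_size;
    saved = calloc(npages, 1);
    snap = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (!saved || snap == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = wp_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) < 0)
        return -1;
    return mprotect(ram, len, PROT_READ);
}

/* Copy the pages that were never written, then drop the protection.
 * Returns the complete snapshot image (npages * page_size bytes). */
const uint8_t *snapshot_finish(void)
{
    for (size_t i = 0; i < npages; i++) {
        if (!saved[i]) {
            memcpy(snap + i * page_size, ram + i * page_size, page_size);
            saved[i] = 1;
        }
    }
    mprotect(ram, npages * page_size, PROT_READ | PROT_WRITE);
    sigaction(SIGSEGV, &old_sa, NULL);
    free(saved);
    saved = NULL;
    return snap;
}
```

A caller would mmap a page-aligned region, call snapshot_begin() on it, keep writing to it, and later call snapshot_finish() to get a buffer holding the contents as they were at snapshot_begin() time. The sketch has no throttling or failure path for a fast-dirtying writer, and it says nothing about how the kvm kernel module or vhost behave when guest RAM is mprotected, which is exactly the open question above.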