On 10/12/2016 05:04 PM, Stefan Hajnoczi wrote:
> John and I recently discussed asynchronous savevm and I wanted to post
> the ideas so they aren't forgotten. (We're not actively working on this.)
>
> Asynchronous savevm has the same effect as the 'savevm' monitor command:
> it saves RAM, device state, and a snapshot of all disks at the point in
> time the command was issued.
>
> The current 'savevm' monitor command is synchronous so the guest and
> QEMU monitor are blocked while the operation runs (it can take a
> while!). Asynchronous savevm has the advantage of allowing the guest
> and QEMU monitor to continue while the operation is running.
>
> This sounds similar to live migration to file but remember that live
> migration's consistency point is when the guest is paused at the end of
> the iteration phase. The user has no control over *when* live migration
> captures the guest state. Therefore it's not a useful command for
> taking snapshots of guest state at a specific point in time - we need
> asynchronous savevm for that.
>
> Async savevm must copy-on-write guest RAM so the guest can continue
> writing to memory while the snapshot is being saved. Rik van Riel
> suggested using userfaultfd(2) to do this on Linux.
>
> Unlike post-copy live migration, we want to track memory writes (instead
> of missing page faults). The userfaultfd(2) flag
> UFFDIO_REGISTER_MODE_WP provides these semantics. Unfortunately I think
> UFFDIO_REGISTER_MODE_WP is not yet implemented?
>
> Once UFFDIO_REGISTER_MODE_WP is available QEMU can catch writes to guest
> RAM and copy the original pages to a buffer. If memory is dirtied too
> quickly then it's necessary to throttle the guest or fail the savevm.
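To make the catch-and-copy step concrete, here is a rough userspace sketch. It does not use userfaultfd: as a stand-in for write-protect faults it uses mprotect(2) plus a SIGSEGV handler, and a plain anonymous mapping stands in for guest RAM. Every name in it (cow_init, snapshot_begin, snapshot_page, guest_write, guest_read) is made up for illustration and none of it is QEMU code. It is single-threaded, ignores KVM and vhost entirely, and calls mprotect() from a signal handler, which POSIX does not list as async-signal-safe, so treat it as a toy only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state for the sketch; not QEMU data structures. */
static uint8_t *ram;    /* stand-in for guest RAM */
static uint8_t *snap;   /* buffer receiving the original page contents */
static uint8_t *saved;  /* one flag per page: original already copied? */
static size_t npages, pagesz;

/* Write fault on a protected page: save the original, then allow the write. */
static void wp_fault(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)ram;
    (void)sig;
    (void)ctx;

    if (addr < base || addr >= base + npages * pagesz) {
        /* Not our region: restore the default action so the retried
         * instruction crashes normally. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    size_t idx = (addr - base) / pagesz;
    uint8_t *page = ram + idx * pagesz;

    if (!saved[idx]) {
        /* The page is still readable, so the old contents can be copied. */
        memcpy(snap + idx * pagesz, page, pagesz);
        saved[idx] = 1;
    }
    /* Unprotect; returning from the handler retries the faulting write. */
    mprotect(page, pagesz, PROT_READ | PROT_WRITE);
}

/* Allocate the fake guest RAM and install the fault handler. */
int cow_init(size_t pages)
{
    struct sigaction sa;

    pagesz = (size_t)sysconf(_SC_PAGESIZE);
    npages = pages;
    ram = mmap(NULL, npages * pagesz, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    snap = mmap(NULL, npages * pagesz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    saved = calloc(npages, 1);
    if (ram == MAP_FAILED || snap == MAP_FAILED || !saved) {
        return -1;
    }

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = wp_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* The snapshot's point in time: write-protect all of RAM. */
int snapshot_begin(void)
{
    memset(saved, 0, npages);
    return mprotect(ram, npages * pagesz, PROT_READ);
}

/* Page contents as of snapshot_begin(): the saved copy if the page has
 * been written since, otherwise the (still unmodified) live page. */
const uint8_t *snapshot_page(size_t idx)
{
    assert(idx < npages);
    return saved[idx] ? snap + idx * pagesz : ram + idx * pagesz;
}

/* Stand-ins for guest memory accesses. */
void guest_write(size_t off, uint8_t v)
{
    ((volatile uint8_t *)ram)[off] = v;
}

uint8_t guest_read(size_t off)
{
    return ((volatile uint8_t *)ram)[off];
}
```

With this, a caller would run cow_init(), let the "guest" write, call snapshot_begin(), and then stream out snapshot_page(i) for each page while guest_write() keeps running. The sketch reserves a full-size copy buffer up front, so it sidesteps the hard part mentioned above: with a bounded buffer, something has to throttle the guest or fail the savevm once pages are dirtied faster than they can be written out.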
> Perhaps this approach can be prototyped with mprotect and a SIGSEGV
> handler if anyone wants to get async savevm going. I don't know if
> there are any disadvantages to mprotecting guest RAM that the kvm kernel
> module is using. Hopefully in-kernel devices and vhost will continue to