On 2016/10/12 22:21, Dr. David Alan Gilbert wrote:
* Stefan Hajnoczi (stefa...@gmail.com) wrote:
John and I recently discussed asynchronous savevm and I wanted to post
the ideas so they aren't forgotten. (We're not actively working on this
Asynchronous savevm has the same effect as the 'savevm' monitor command:
it saves RAM, device state, and a snapshot of all disks at the point in
time the command was issued.
The current 'savevm' monitor command is synchronous so the guest and
QEMU monitor are blocked while the operation runs (it can take a
while!). Asynchronous savevm has the advantage of allowing the guest
and QEMU monitor to continue while the operation is running.
This sounds similar to live migration to file but remember that live
migration's consistency point is when the guest is paused at the end of
the iteration phase. The user has no control over *when* live migration
captures the guest state. Therefore it's not a useful command for
taking snapshots of guest state at a specific point in time - we need
asynchronous savevm for that.
Async savevm must copy-on-write guest RAM so the guest can continue
writing to memory while the snapshot is being saved. Rik van Riel
suggested using userfaultfd(2) to do this on Linux.
Unlike post-copy live migration, we want to track memory writes (instead
of missing page faults). The userfaultfd(2) flag
UFFDIO_REGISTER_MODE_WP provides these semantics. Unfortunately I think
UFFDIO_REGISTER_MODE_WP is not yet implemented?
A prototype of this has already been written by Hailiang Zhang;
Yes, I have updated it to a 2nd version privately, but unfortunately
there are still some problems with the UFFDIO_REGISTER_MODE_WP API in the kernel.
It still can't support KVM (it only supports TCG mode).
I have given feedback to Andrea, but got no response ... :(
Once UFFDIO_REGISTER_MODE_WP is available QEMU can catch writes to guest
RAM and copy the original pages to a buffer. If memory is dirtied too
quickly then it's necessary to throttle the guest or fail the savevm
The only limit there is the size of the buffer; waiting for space will
do the throttling.
Yes, we can optimize it by extending the size of the buffer and using multiple fds to
handle the user faults.
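
To make the buffer-as-throttle idea concrete, here is a minimal single-threaded sketch of a bounded buffer for original page contents. It is not taken from any prototype mentioned in this thread: the names cow_buffer, cow_buffer_put, cow_buffer_take, COW_PAGE_SIZE and COW_SLOTS are all hypothetical, and the page size and slot count are assumptions chosen to keep the example small. A real implementation would block the faulting vCPU thread instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define COW_PAGE_SIZE 4096  /* assumed page size for this sketch */
#define COW_SLOTS     4     /* deliberately tiny so the "full" case is easy to hit */

/* One saved original page (hypothetical layout). */
struct cow_slot {
    size_t page_index;
    unsigned char data[COW_PAGE_SIZE];
};

/* Fixed-size ring of saved pages; its capacity is the only limit. */
struct cow_buffer {
    struct cow_slot slots[COW_SLOTS];
    size_t head;   /* next slot the saver will drain */
    size_t count;  /* slots currently in use */
};

/*
 * Called when a write to a not-yet-saved page is caught.
 * Returns false when the buffer is full; the caller would then have to
 * wait for space, which is what throttles the guest.
 */
static bool cow_buffer_put(struct cow_buffer *b, size_t page_index,
                           const void *orig)
{
    assert(b != NULL && orig != NULL);
    if (b->count == COW_SLOTS) {
        return false;
    }
    struct cow_slot *s = &b->slots[(b->head + b->count) % COW_SLOTS];
    s->page_index = page_index;
    memcpy(s->data, orig, COW_PAGE_SIZE);
    b->count++;
    return true;
}

/* Called by the saver: hands out the oldest buffered page and frees its slot. */
static bool cow_buffer_take(struct cow_buffer *b, struct cow_slot *out)
{
    assert(b != NULL && out != NULL);
    if (b->count == 0) {
        return false;
    }
    *out = b->slots[b->head];
    b->head = (b->head + 1) % COW_SLOTS;
    b->count--;
    return true;
}
```

With this shape, a guest that dirties memory faster than the saver drains the ring simply finds cow_buffer_put failing more often, so the choice between waiting and failing the savevm is left to the caller.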
Perhaps this approach can be prototyped with mprotect and a SIGSEGV
handler if anyone wants to get async savevm going. I don't know if
there are any disadvantages to mprotecting guest RAM that the kvm kernel
module is using. Hopefully in-kernel devices and vhost will continue to
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
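
For anyone who wants to try the mprotect and SIGSEGV handler idea from the message above, the following is a rough userspace sketch and not QEMU code. It assumes Linux, a single thread, and a page-aligned region obtained from mmap standing in for guest RAM; snapshot_begin, snapshot_page, wp_fault and the globals are hypothetical names. It says nothing about how KVM, in-kernel devices, or vhost would interact with the protection, which is the open question raised in the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals for the sketch. */
static uint8_t *ram;      /* stand-in for guest RAM */
static uint8_t *cow_buf;  /* original contents of pages written after the snapshot point */
static uint8_t *saved;    /* saved[i] != 0 once page i has been copied */
static size_t page_size, n_pages;

/*
 * SIGSEGV handler: on the first write to a protected page, copy the
 * original page aside and then make the page writable again.
 * mprotect() is not on the POSIX async-signal-safe list; calling it
 * here is an assumption that holds in practice on Linux.
 */
static void wp_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)ram;

    if (addr < base || addr >= base + n_pages * page_size) {
        /* Not our region: restore default handling so the retried access crashes normally. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    size_t idx = (addr - base) / page_size;
    uint8_t *page = ram + idx * page_size;
    memcpy(cow_buf + idx * page_size, page, page_size); /* page is still readable */
    saved[idx] = 1;
    mprotect(page, page_size, PROT_READ | PROT_WRITE);
}

/* Mark the snapshot point: write-protect the whole region. Returns 0 on success. */
static int snapshot_begin(uint8_t *mem, size_t len)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    assert((uintptr_t)mem % page_size == 0 && len % page_size == 0);
    ram = mem;
    n_pages = len / page_size;
    cow_buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    saved = calloc(n_pages, 1);
    if (cow_buf == MAP_FAILED || saved == NULL) {
        return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = wp_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        return -1;
    }
    return mprotect(ram, len, PROT_READ);
}

/* Contents of page idx as they were at the snapshot point. */
static const uint8_t *snapshot_page(size_t idx)
{
    return saved[idx] ? cow_buf + idx * page_size : ram + idx * page_size;
}
```

A saver loop would walk the pages, write out snapshot_page(i) for each one, and could drop the protection on a page once it has been saved. This sketch keeps a full-size shadow copy for simplicity, so it does not show any throttling.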