Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---
This is a documentation draft for the feature. It says nothing about the bitmap
migration protocol; that is described in comments in
migration/block-dirty-bitmap.c.

The capability name differs from the other patches: here it is
postcopy-bitmaps, while the patches use dirty-bitmaps. In the following series
this will be fixed to postcopy-bitmaps or x-postcopy-bitmaps (more like RAM).

 docs/migration.txt | 90 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 72 insertions(+), 18 deletions(-)

diff --git a/docs/migration.txt b/docs/migration.txt
index fda8d61..1d94b32 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -298,7 +298,7 @@ In most migration scenarios there is only a single data path that runs
 from the source VM to the destination, typically along a single fd (although
 possibly with another fd or similar for some fast way of throwing pages across).

-However, some uses need two way communication; in particular the Postcopy
+However, some uses need two way communication; in particular the RAM postcopy
 destination needs to be able to request pages on demand from the source.

 For these scenarios there is a 'return path' from the destination to the source;
@@ -321,32 +321,62 @@ the amount of migration traffic and time it takes, the down side is that during
 the postcopy phase, a failure of *either* side or the network connection causes
 the guest to be lost.

-In postcopy the destination CPUs are started before all the memory has been
+== Sorts of state data ==
+State data which should be migrated may be divided into three groups:
+ 1. Precopy only - data which _must_ be transferred before destination CPUs
+    are started.
+ 2. Compatible - data which may be transferred both in precopy and postcopy
+    phases (RAM).
+ 3. Postcopy only - data which _must_ be transferred after destination CPUs
+    are started (dirty bitmaps).
+
+Note: any type of data may also be transferred in the stopped state, when both
+source and destination are stopped.
+
+The postcopy phase starts after the destination CPUs are started (and after the
+stopped phase, of course), if the following conditions are met:
+ 1. Some postcopy migration capabilities are turned on.
+ 2. Current state data to be transferred is too large to be transferred in the
+    stopped state.
+ 3. Current precopy-only data is small enough to be transferred in the
+    stopped state.
+ 4. One of the following (or both):
+    4a. Postcopy is forced by migrate_start_postcopy.
+    4b. State data which _may_ be transferred as precopy
+        (= precopy-only + compatible) is small enough to be transferred
+        in the stopped state.
+
+== migrate_start_postcopy ==
+
+Issuing the 'migrate_start_postcopy' command during precopy migration causes the
+transition from precopy to postcopy. It can be issued immediately after
+migration is started or any time later on. Issuing it after the end of a
+migration is harmless. This command is not guaranteed to start the destination
+and switch to postcopy immediately (see above).
+
+Note: During the postcopy phase, the bandwidth limits set using
+migrate_set_speed are ignored.
+
+Most postcopy-related details are explained in the 'RAM Postcopy' section, as
+RAM postcopy was the first postcopy mechanism in QEMU and it dictated the
+overall architecture.
+
+== RAM Postcopy ==
+In RAM postcopy the destination CPUs are started before all the memory has been
 transferred, and accesses to pages that are yet to be transferred cause
 a fault that's translated by QEMU into a request to the source QEMU.

-Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
-doesn't finish in a given time the switch is made to postcopy.
+RAM postcopy can be combined with precopy (i.e. normal migration) so that if
+precopy doesn't finish in a given time the switch is made to postcopy.

-=== Enabling postcopy ===
+=== Enabling RAM postcopy ===

-To enable postcopy, issue this command on the monitor prior to the
+To enable RAM postcopy, issue this command on the monitor prior to the
 start of migration:

 migrate_set_capability x-postcopy-ram on

-The normal commands are then used to start a migration, which is still
-started in precopy mode. Issuing:
-
-migrate_start_postcopy
-
-will now cause the transition from precopy to postcopy.
-It can be issued immediately after migration is started or any
-time later on. Issuing it after the end of a migration is harmless.
-
-Note: During the postcopy phase, the bandwidth limits set using
-migrate_set_speed is ignored (to avoid delaying requested pages that
-the destination is waiting for).
+Then, to switch to postcopy, the 'migrate_start_postcopy' command may be used.

 === Postcopy device transfer ===

@@ -482,3 +512,27 @@
 request for a page that has already been sent is ignored. Duplicate requests
 such as this can happen as a page is sent at about the same time the
 destination accesses it.
+
+== Block dirty bitmaps postcopy ==
+
+Postcopy is a good place to migrate dirty bitmaps as they are not critical data,
+and if postcopy fails, we just drop the bitmaps and do a full backup instead of
+the next incremental one; nothing worse happens.
+
+The good thing here is that bitmap postcopy does not imply RAM postcopy, so if
+only the postcopy-bitmaps migration capability is on, RAM is migrated as usual
+in precopy. Also, unlike RAM postcopy, block dirty bitmap migration does not
+use the return path.
+
+Dirty bitmap migration state data is postcopy-only (see above). So it is
+migrated only in the stopped state or in the postcopy phase.
+
+Only named dirty bitmaps associated with root nodes and non-root named nodes
+are migrated. If the destination QEMU already contains a dirty bitmap with the
+same name as a migrated bitmap (for the same node), then the migration
+proceeds if their granularities are the same; otherwise an error is
+generated. If the destination QEMU does not contain such a bitmap, it is
+created.
+
+The migration protocol is specified (and implemented) in
+migration/block-dirty-bitmap.c.
--
1.8.3.1
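
For reference, the four start conditions listed under '== Sorts of state data ==' can be read as one predicate. The following is only a minimal Python sketch of that reading, not code from QEMU. The names PendingState, should_enter_postcopy and max_stopped_bytes are hypothetical, and "small enough to be transferred in the stopped state" is assumed to mean "no larger than a byte budget derived from the allowed downtime".

```python
from dataclasses import dataclass


@dataclass
class PendingState:
    """Bytes still to be transferred, split by the three groups (hypothetical)."""
    precopy_only: int    # must go before destination CPUs start
    compatible: int      # may go in either phase, e.g. RAM
    postcopy_only: int   # must go after destination CPUs start, e.g. dirty bitmaps


def should_enter_postcopy(pending, max_stopped_bytes,
                          postcopy_capability_on, start_postcopy_requested):
    """Return True if all four documented conditions hold (assumed reading)."""
    total = pending.precopy_only + pending.compatible + pending.postcopy_only
    precopy_capable = pending.precopy_only + pending.compatible
    return (
        postcopy_capability_on                           # condition 1
        and total > max_stopped_bytes                    # condition 2
        and pending.precopy_only <= max_stopped_bytes    # condition 3
        and (start_postcopy_requested                    # condition 4a
             or precopy_capable <= max_stopped_bytes)    # condition 4b
    )
```

Under these assumptions, a migration with only bitmap data left beyond the budget enters postcopy without migrate_start_postcopy, while one with RAM still pending beyond the budget does so only after the command has been issued.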