This series introduces a new POSTCOPY_DEVICE state that is active on both the source and destination sides while the destination loads the device state. Before this series, if the destination machine failed during the device load, the source side would stay stuck in POSTCOPY_ACTIVE with no way to recover. With this series, if the migration fails while in the POSTCOPY_DEVICE state, the source side can safely resume, as the destination has not started yet.
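
As a rough illustration of the intended behaviour (not the code from the patches), the decision the source side makes on a destination failure can be sketched as below. All names prefixed with sketch_/SKETCH_ are hypothetical and do not exist in QEMU; the real states live in the MigrationStatus enum and the real transitions are spread across migration.c and postcopy-ram.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified subset of the migration states discussed
 * above. These are NOT QEMU's MigrationStatus values.
 */
typedef enum {
    SKETCH_POSTCOPY_DEVICE,  /* destination is still loading device state */
    SKETCH_POSTCOPY_ACTIVE,  /* destination has loaded devices and may run */
    SKETCH_COMPLETED,
    SKETCH_FAILED,
} SketchState;

/* Is the (hypothetical) migration still in one of the postcopy phases? */
static bool sketch_in_postcopy(SketchState s)
{
    return s == SKETCH_POSTCOPY_DEVICE || s == SKETCH_POSTCOPY_ACTIVE;
}

/*
 * Assumption taken from the description above: while the state is
 * POSTCOPY_DEVICE the destination has not started yet, so the source
 * can safely resume if the destination fails. Once POSTCOPY_ACTIVE is
 * reached, this sketch treats the source as not safely resumable.
 */
static bool sketch_source_can_resume(SketchState s)
{
    assert(sketch_in_postcopy(s));
    return s == SKETCH_POSTCOPY_DEVICE;
}

/* Transition taken once the destination has finished loading devices. */
static SketchState sketch_device_load_done(SketchState s)
{
    return s == SKETCH_POSTCOPY_DEVICE ? SKETCH_POSTCOPY_ACTIVE : s;
}
```

The point of the extra state is only that the "can the source resume?" question gets a definite answer for the window in which the destination is loading device state.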

RFC: https://lore.kernel.org/all/[email protected]/
V1: https://lore.kernel.org/all/[email protected]/

V2 changes:
- removed old patch 2, which changed migration_has_failed()
Patch 2:
- moved postcopy_ram_listen_thread() to postcopy-ram.c as per TODO,
  suggested by Fabiano
Patch 3:
- introduced separate postcopy-exit-on-error setting instead of
  reusing existing exit-on-error setting, suggested by Fabiano and Jirka
- merged migration_incoming_finish() and
  migration_incoming_state_destroy() into migration_incoming_cleanup()
  and added migration_incoming_cleanup_bh(), suggested by Fabiano
Patch 4:
- introduced POSTCOPY_DEVICE state also to destination, suggested by
  Jirka
- moved POSTCOPY_DEVICE->POSTCOPY_ACTIVE transition from return path
  thread to main migration thread, suggested by Peter

Juraj Marcin (3):
  migration: Move postcopy_ram_listen_thread() to postcopy-ram.c
  migration: Refactor all incoming cleanup into migration_incoming_cleanup()
  migration: Introduce POSTCOPY_DEVICE state

Peter Xu (1):
  migration: Do not try to start VM if disk activation fails

 migration/migration-hmp-cmds.c        |   2 +-
 migration/migration.c                 | 148 +++++++++++++++++---------
 migration/migration.h                 |   7 +-
 migration/postcopy-ram.c              | 137 ++++++++++++++++++++++++
 migration/postcopy-ram.h              |   3 +
 migration/savevm.c                    | 137 ++----------------------
 migration/savevm.h                    |   2 +
 migration/trace-events                |   1 +
 qapi/migration.json                   |  17 ++-
 system/vl.c                           |   3 +-
 tests/qtest/migration/precopy-tests.c |   3 +-
 11 files changed, 274 insertions(+), 186 deletions(-)

-- 
2.51.0
