Hi,

While working on downstream issues with postcopy, I ended up writing a
set of tests for issuing qmp_migrate_cancel() at various points during
the migration. That exposed some bugs, which this series attempts to
fix.

There is also a fix for the issue Daniel found:
https://gitlab.com/qemu-project/qemu/-/issues/2633

I'm also sending the test code. It creates one test per
MIGRATION_STATUS_ state. Each test starts a migration, waits for that
specific state to be reached, issues qmp_migrate_cancel() and checks
that the migration state changes to cancelled (for now only cancelling
migration from the source side).

I was initially worried that this would be too racy, but so far each
test has survived 1000 iterations. I'm thinking it's worth merging,
especially because even after working on this I haven't been able to
clear the questions we have in our todo list [1], so we'll probably
need more work around this area in the future.

1- https://wiki.qemu.org/ToDo/LiveMigration#Migration_cancel_concurrency

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1569870481

Thanks

Fabiano Rosas (6):
  tests/qtest/migration: Introduce migration_test_add_suffix
  migration: Kick postcopy threads on cancel
  migration: Fix postcopy listen thread exit
  migration: Make sure postcopy recovery doesn't hang when cancelling
  migration: Fix hang after error in destination setup phase
  tests/qtest/migration: Add a cancel test

 migration/channel.c             |  11 +-
 migration/migration.c           |  58 +++++---
 migration/migration.h           |   2 +-
 migration/postcopy-ram.c        |  14 +-
 migration/savevm.c              |  60 ++++----
 tests/qtest/migration-helpers.c |  24 ++++
 tests/qtest/migration-helpers.h |   2 +
 tests/qtest/migration-test.c    | 243 ++++++++++++++++++++++++++++++++
 8 files changed, 365 insertions(+), 49 deletions(-)

-- 
2.35.3