When migrate_cancel is issued during a multifd migration, the following
sequence can occur:

        [source]                        [destination]

multifd_send_sync_main[finish]
                                    multifd_recv_thread wait &p->sem_sync
shutdown to_dst_file
                                    detect error from_src_file
send RAM_SAVE_FLAG_EOS[fail]        [no chance to run multifd_recv_sync_main]
                                    multifd_load_cleanup
                                    join multifd receive thread forever

This leaves the destination QEMU hung with the following stack:

pthread_join
qemu_thread_join
multifd_load_cleanup
process_incoming_migration_co
coroutine_trampoline

Signed-off-by: Ivan Ren <ivan...@tencent.com>
---
 migration/ram.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index e4eb9c441f..504c8ccb03 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1291,6 +1291,11 @@ int multifd_load_cleanup(Error **errp)
         MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
         if (p->running) {
+            /*
+             * multifd_recv_thread may hang in the MULTIFD_FLAG_SYNC handling
+             * code; waking it up here is harmless in the cleanup phase.
+             */
+            qemu_sem_post(&p->sem_sync);
             qemu_thread_join(&p->thread);
         }
         object_unref(OBJECT(p->c));
-- 
2.17.2 (Apple Git-113)
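
The post-before-join pattern used by the fix can be illustrated outside
QEMU with plain POSIX threads and semaphores. This is only a minimal sketch,
not the QEMU code: RecvParams, recv_thread, recv_start and recv_cleanup are
hypothetical names invented for the illustration, and sem_wait/sem_post
stand in for qemu_sem_wait/qemu_sem_post.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical stand-in for MultiFDRecvParams; not the QEMU structure. */
typedef struct {
    pthread_t thread;
    sem_t sem_sync;
    bool running;
    bool woke_up;
} RecvParams;

/* Models a receive thread that blocks while handling a sync request. */
static void *recv_thread(void *opaque)
{
    RecvParams *p = opaque;

    sem_wait(&p->sem_sync);   /* nobody posts this if the peer went away */
    p->woke_up = true;
    return NULL;
}

/* Starts the worker; returns 0 on success, -1 on failure. */
static int recv_start(RecvParams *p)
{
    p->woke_up = false;
    p->running = false;
    if (sem_init(&p->sem_sync, 0, 0) != 0) {
        return -1;
    }
    if (pthread_create(&p->thread, NULL, recv_thread, p) != 0) {
        sem_destroy(&p->sem_sync);
        return -1;
    }
    p->running = true;
    return 0;
}

/* Cleanup path: wake the worker first, then join it. */
static void recv_cleanup(RecvParams *p)
{
    if (p->running) {
        /*
         * An extra post is harmless during cleanup, and it guarantees
         * that the join below cannot wait on a blocked worker forever.
         */
        sem_post(&p->sem_sync);
        pthread_join(p->thread, NULL);
        assert(p->woke_up);   /* join returned, so the wait was released */
        p->running = false;
    }
    sem_destroy(&p->sem_sync);
}
```

In this sketch, removing the sem_post() call makes pthread_join() block
indefinitely, which corresponds to the pthread_join frame at the top of the
stack reported above.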