On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote: > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote: > > Store the task tag for migration types: tcp/unix/fd/exec in current > > MigrationIncomingState struct. > > > > For defered migration, no need to store task tag since there is no task > > running in the main loop at all. For RDMA, let's mark it as todo. > > > > Signed-off-by: Peter Xu <pet...@redhat.com> > > --- > > migration/migration.c | 22 ++++++++++++++++++---- > > migration/migration.h | 2 ++ > > 2 files changed, 20 insertions(+), 4 deletions(-) > > > > diff --git a/migration/migration.c b/migration/migration.c > > index c9b7085..daf356b 100644 > > --- a/migration/migration.c > > +++ b/migration/migration.c > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void) > > mis->from_src_file = NULL; > > } > > > > + mis->listen_task_tag = 0; > > qemu_event_destroy(&mis->main_thread_load_event); > > } > > > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState > > *mis, const char *rbname, > > void qemu_start_incoming_migration(const char *uri, Error **errp) > > { > > const char *p; > > + guint task_tag = 0; > > + MigrationIncomingState *mis = migration_incoming_get_current(); > > > > qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort); > > if (!strcmp(uri, "defer")) { > > deferred_incoming_migration(errp); > > } else if (strstart(uri, "tcp:", &p)) { > > - tcp_start_incoming_migration(p, errp); > > + task_tag = tcp_start_incoming_migration(p, errp); > > #ifdef CONFIG_RDMA > > } else if (strstart(uri, "rdma:", &p)) { > > + /* TODO: store task tag for RDMA migrations */ > > rdma_start_incoming_migration(p, errp); > > #endif > > } else if (strstart(uri, "exec:", &p)) { > > - exec_start_incoming_migration(p, errp); > > + task_tag = exec_start_incoming_migration(p, errp); > > } else if (strstart(uri, "unix:", &p)) { > > - unix_start_incoming_migration(p, errp); > > + task_tag = unix_start_incoming_migration(p, errp); > > } else if (strstart(uri, "fd:", &p)) { > > - fd_start_incoming_migration(p, errp); > > + task_tag = fd_start_incoming_migration(p, errp); > > } else { > > error_setg(errp, "unknown migration protocol: %s", uri); > > + return; > > } > > + > > + mis->listen_task_tag = task_tag; > > This is unsafe as currently written. The main loop watches that are > associated with these tags are all unregistered when incoming migration > starts. So if you save them, and then unregister later when postcopy > is "stuck", then you're likely unregistering a tag associated with > some other random part of QEMU, because the saved value is no longer > valid.
IIUC for incoming migration, the watch is not detached until migration_fd_process_incoming() completes. So: - for precopy, the watch should be detached when migration completes - for postcopy, the watch should be detached when postcopy starts, then main loop thread returns, while the ram_load_thread starts to continue the migration. So basically the watch is detaching itself after migration_fd_process_incoming() completes. To handle that, this patch has this change: @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f) co = qemu_coroutine_create(process_incoming_migration_co, f); qemu_coroutine_enter(co); } + + /* + * When reach here, we should not need the listening port any + * more. We'll detach the listening task soon, let's reset the + * listen task tag. + */ + mis->listen_task_tag = 0; Would this make sure the listen_task_tag is always valid if nonzero? Thanks, -- Peter Xu