On Tue, Mar 13, 2018 at 12:35:56PM +0000, Dr. David Alan Gilbert wrote: [...]
> > Yes I think if without OOB we should be fine since even the cleanup is > > running with the BQL. > > > > Now I don't have good idea to solve this problem except introducing a > > lock. How about I add a patch to introduce the mgmt_lock, which > > currently only protect the QEMUFile? Like: > > > > ---------------------------------- > > diff --git a/migration/migration.c b/migration/migration.c > > index f31fcbb0d5..00c630326d 100644 > > --- a/migration/migration.c > > +++ b/migration/migration.c > > @@ -1195,8 +1195,10 @@ static void migrate_fd_cleanup(void *opaque) > > if (multifd_save_cleanup(&local_err) != 0) { > > error_report_err(local_err); > > } > > + qemu_mutex_lock(&s->mgmt_lock); > > qemu_fclose(s->to_dst_file); > > s->to_dst_file = NULL; > > + qemu_mutex_unlock(&s->mgmt_lock); > > } > > > > assert((s->state != MIGRATION_STATUS_ACTIVE) && > > @@ -2493,8 +2495,10 @@ static MigThrError postcopy_pause(MigrationState *s) > > > > /* Current channel is possibly broken. Release it. */ > > assert(s->to_dst_file); > > + qemu_mutex_lock(&s->mgmt_lock); > > qemu_file_shutdown(s->to_dst_file); > > qemu_fclose(s->to_dst_file); > > s->to_dst_file = NULL; > > + qemu_mutex_unlock(&s->mgmt_lock); > > That's only safe if we know qemu_fclose() can never block; otherwise > we're not allowed to take the same lock in the OOB command. > > I think perhaps it's safer to always do something like: > tmp = atomic_xchg(s->to_dst_file, NULL); > qemu_file_shutdown(tmp); > qemu_fclose(tmp); > > then the OOB code can do the same? > Would that work - avoiding the lock? According to what we discussed offlist: I'll still introduce that lock, but instead I'll move the close() out of the lock section since that can block. I'll see whether I need a repost tomorrow, or after 2.12 release if the series will never have a chance for 2.12. Thanks, -- Peter Xu