On Wed, Jul 21, 2021 at 04:15:27PM -0500, Eric Blake wrote: > On Wed, Jul 21, 2021 at 03:34:06PM -0400, Peter Xu wrote: > > Accessing from_dst_file is potentially racy in current code base like below: > > > > if (s->from_dst_file) > > do_something(s->from_dst_file); > > > > Because from_dst_file can be reset right after the check in another > > thread (rp_thread). One example is migrate_fd_cancel(). > > > > Use the same qemu_file_lock to protect it too, just like to_dst_file. > > > > When it's safe to access without lock, comment it. > > > > There's one special reference in migration_thread() that can be replaced by > > the newly introduced rp_thread_created flag. > > > > Reported-by: Dr. David Alan Gilbert <dgilb...@redhat.com> > > Signed-off-by: Peter Xu <pet...@redhat.com> > > --- > > migration/migration.c | 32 +++++++++++++++++++++++++------- > > migration/migration.h | 8 +++++--- > > migration/ram.c | 1 + > > 3 files changed, 31 insertions(+), 10 deletions(-) > > > > diff --git a/migration/migration.c b/migration/migration.c > > index 21b94f75a3..fa70400f98 100644 > > --- a/migration/migration.c > > +++ b/migration/migration.c > > @@ -1879,10 +1879,12 @@ static void migrate_fd_cancel(MigrationState *s) > > QEMUFile *f = migrate_get_current()->to_dst_file; > > trace_migrate_fd_cancel(); > > > > + qemu_mutex_lock(&s->qemu_file_lock); > > if (s->rp_state.from_dst_file) { > > /* shutdown the rp socket, so causing the rp thread to shutdown */ > > qemu_file_shutdown(s->rp_state.from_dst_file); > > } > > + qemu_mutex_unlock(&s->qemu_file_lock); > > Worth using WITH_QEMU_LOCK_GUARD?
Sure. > > > @@ -2827,11 +2845,13 @@ out: > > * Maybe there is something we can do: it looks like a > > * network down issue, and we pause for a recovery. > > */ > > - qemu_fclose(rp); > > - ms->rp_state.from_dst_file = NULL; > > + migration_release_from_dst_file(ms); > > rp = NULL; > > if (postcopy_pause_return_path_thread(ms)) { > > - /* Reload rp, reset the rest */ > > + /* > > + * Reload rp, reset the rest. Referencing it is save since > > s/save/safe/ Will fix. I'll wait for some more comments before I repost. Thanks, -- Peter Xu