On Thu, Dec 05, 2024 at 05:16:05PM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
> 
> > Teach multifd_send_sync_main() to sync with threads only.
> >
> > We already have such a request: when mapped-ram is enabled with
> > multifd, no SYNC messages will be pushed to the stream when multifd
> > syncs the sender threads, because there are no destination threads
> > waiting for them.  The whole point of the sync there is to make sure
> > all threads have flushed their jobs.
> >
> > So fundamentally there are two different ways to request a sync:
> >
> >   - Either sync the threads only,
> >   - Or sync the threads and also notify the destination side
> >
> > Mapped-ram did this already via the use_packet check in the sync
> > handler of the sender thread.  It works.
> >
> > However, that may stop working when e.g. VFIO starts to reuse multifd
> > channels to push device states.  In that case VFIO has a similar need
> > for a "thread-only sync", but we can't rely on a flag check, because
> > such sync requests can still come from RAM, which needs the on-wire
> > notifications.
> >
> > Pave the way for that by allowing multifd_send_sync_main() to specify
> > what kind of sync the caller needs.  We can use it for mapped-ram
> > already.
> >
> > No functional change intended.
> >
> > Signed-off-by: Peter Xu <pet...@redhat.com>
> > ---
> >  migration/multifd.h        | 16 +++++++++++++---
> >  migration/multifd-nocomp.c |  8 +++++++-
> >  migration/multifd.c        | 14 ++++++++------
> >  3 files changed, 28 insertions(+), 10 deletions(-)
> >
> > diff --git a/migration/multifd.h b/migration/multifd.h
> > index 50d58c0c9c..6b2f60a917 100644
> > --- a/migration/multifd.h
> > +++ b/migration/multifd.h
> > @@ -19,6 +19,15 @@
> >  typedef struct MultiFDRecvData MultiFDRecvData;
> >  typedef struct MultiFDSendData MultiFDSendData;
> >  
> > +typedef enum {
> > +    /* No sync request */
> > +    MULTIFD_SYNC_NONE = 0,
> > +    /* Sync on the sender threads without pushing messages */
> > +    MULTIFD_SYNC_THREADS,
> > +    /* Sync on the sender threads, meanwhile push "SYNC" message to the wire */
> 
> s/meanwhile//
> 
> > +    MULTIFD_SYNC_THREADS_AND_NOTIFY,
> > +} MultiFDSyncReq;
> 
> I think I'd prefer the local vs. remote terminology I introduced in my
> proposal [1] for cleaning up the multifd_flush_after_each_section() code:

I'm ok with your naming, as long as the comments explain it.
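E.g. something like below would be fine to me (a rough sketch only,
reusing your proposed LOCAL/REMOTE names; the comments are the part I
care about):

    typedef enum {
        /* No sync request pending */
        MULTIFD_SYNC_NONE = 0,
        /*
         * Sync the local sender threads between themselves: wait until
         * all the jobs enqueued so far are flushed, without pushing
         * anything onto the wire.
         */
        MULTIFD_SYNC_LOCAL,
        /*
         * Same as above, but also push a SYNC message onto the stream,
         * so the remote end will sync its threads too.
         */
        MULTIFD_SYNC_REMOTE,
    } MultiFDSyncReq;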
> 
> LOCAL - sync the local threads between themselves
> REMOTE - put a message on the stream for the remote end to perform a
> sync on their threads.
> 
> Down below you're passing MULTIFD_SYNC_THREADS_AND_NOTIFY into the send
> thread, but the "sync threads" part of this is really done outside the
> thread, so that part doesn't have a meaning inside the thread.
> 
> 1- https://lore.kernel.org/r/875xo8n4ue....@suse.de
> 
> Also, please provide your input there^; it would be nice to unify the
> terminology and reasoning about both changes.

Yes, I'm mostly flushing my inbox in time order unless prioritized, so
I'm getting there today or tomorrow.

> 
> > +
> >  bool multifd_send_setup(void);
> >  void multifd_send_shutdown(void);
> >  void multifd_send_channel_created(void);
> > @@ -28,7 +37,7 @@ void multifd_recv_shutdown(void);
> >  bool multifd_recv_all_channels_created(void);
> >  void multifd_recv_new_channel(QIOChannel *ioc, Error **errp);
> >  void multifd_recv_sync_main(void);
> > -int multifd_send_sync_main(void);
> > +int multifd_send_sync_main(MultiFDSyncReq req);
> >  bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
> >  bool multifd_recv(void);
> >  MultiFDRecvData *multifd_get_recv_data(void);
> > @@ -143,7 +152,7 @@ typedef struct {
> >      /* multifd flags for each packet */
> >      uint32_t flags;
> >      /*
> > -     * The sender thread has work to do if either of below boolean is set.
> > +     * The sender thread has work to do if either of below field is set.
> >       *
> >       * @pending_job:  a job is pending
> >       * @pending_sync: a sync request is pending
> > @@ -152,7 +161,8 @@
> >       * cleared by the multifd sender threads.
> >       */
> >      bool pending_job;
> > -    bool pending_sync;
> > +    MultiFDSyncReq pending_sync;
> > +
> >      MultiFDSendData *data;
> >  
> >      /* thread local variables. No locking required */
> > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> > index 55191152f9..f64c4c9abd 100644
> > --- a/migration/multifd-nocomp.c
> > +++ b/migration/multifd-nocomp.c
> > @@ -345,6 +345,8 @@ retry:
> >  
> >  int multifd_ram_flush_and_sync(void)
> >  {
> > +    MultiFDSyncReq req;
> > +
> >      if (!migrate_multifd()) {
> >          return 0;
> >      }
> > @@ -356,7 +358,11 @@ int multifd_ram_flush_and_sync(void)
> >          }
> >      }
> >  
> > -    return multifd_send_sync_main();
> > +    /* File migrations only need to sync with threads */
> > +    req = migrate_mapped_ram() ?
> > +        MULTIFD_SYNC_THREADS : MULTIFD_SYNC_THREADS_AND_NOTIFY;
> > +
> > +    return multifd_send_sync_main(req);
> >  }
> >  
> >  bool multifd_send_prepare_common(MultiFDSendParams *p)
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 498e71fd10..77645e87a0 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -523,7 +523,7 @@ static int multifd_zero_copy_flush(QIOChannel *c)
> >      return ret;
> >  }
> >  
> > -int multifd_send_sync_main(void)
> > +int multifd_send_sync_main(MultiFDSyncReq req)
> >  {
> >      int i;
> >      bool flush_zero_copy;
> > @@ -543,8 +543,8 @@ int multifd_send_sync_main(void)
> >           * We should be the only user so far, so not possible to be set by
> >           * others concurrently.
> >           */
> > -        assert(qatomic_read(&p->pending_sync) == false);
> > -        qatomic_set(&p->pending_sync, true);
> > +        assert(qatomic_read(&p->pending_sync) == MULTIFD_SYNC_NONE);
> > +        qatomic_set(&p->pending_sync, req);
> 
> Hmm, isn't it easier to skip the whole loop if req ==
> MULTIFD_SYNC_THREADS?  I don't remember why we kept this loop here for
> mapped-ram.

The "thread-only" version of the request (or, in your preferred naming,
a "local" sync request) says: "please flush all the work enqueued in the
sender threads".  The sync is still needed even for mapped-ram, to make
sure all the pwrite()s have landed.  It's also needed for VFIO.
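To spell out what the loop buys us even when nothing goes on the wire,
here's the skeleton of the flow (a stripped-down sketch of
multifd_send_sync_main(), not the exact code):

    /* Kick every sender thread with a (possibly local-only) sync request */
    for (i = 0; i < migrate_multifd_channels(); i++) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        qatomic_set(&p->pending_sync, req);
        qemu_sem_post(&p->sem);
    }

    /* Wait until each thread has drained its queued jobs and acked */
    for (i = 0; i < migrate_multifd_channels(); i++) {
        MultiFDSendParams *p = &multifd_send_state->params[i];

        /*
         * The thread only posts sem_sync after consuming all queued
         * work; for mapped-ram that is what guarantees the pwrite()s
         * of all enqueued pages have been issued.
         */
        qemu_sem_wait(&p->sem_sync);
    }

If we skipped the first loop for MULTIFD_SYNC_THREADS, the threads would
never see a request to ack, so nothing would guarantee the queued jobs
are flushed before the migration thread moves on.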
> 
> >          qemu_sem_post(&p->sem);
> >      }
> >      for (i = 0; i < migrate_multifd_channels(); i++) {
> > @@ -635,14 +635,16 @@ static void *multifd_send_thread(void *opaque)
> >                   */
> >                  qatomic_store_release(&p->pending_job, false);
> >              } else {
> > +                MultiFDSyncReq req = qatomic_read(&p->pending_sync);
> > +
> >                  /*
> >                   * If not a normal job, must be a sync request.  Note that
> >                   * pending_sync is a standalone flag (unlike pending_job), so
> >                   * it doesn't require explicit memory barriers.
> >                   */
> > -                assert(qatomic_read(&p->pending_sync));
> > +                assert(req != MULTIFD_SYNC_NONE);
> >  
> > -                if (use_packets) {
> > +                if (req == MULTIFD_SYNC_THREADS_AND_NOTIFY) {
> 
> Good, more explicit.
> 
> >                      p->flags = MULTIFD_FLAG_SYNC;
> >                      multifd_send_fill_packet(p);
> >                      ret = qio_channel_write_all(p->c, (void *)p->packet,
> > @@ -654,7 +656,7 @@ static void *multifd_send_thread(void *opaque)
> >                      stat64_add(&mig_stats.multifd_bytes, p->packet_len);
> >                  }
> >  
> > -                qatomic_set(&p->pending_sync, false);
> > +                qatomic_set(&p->pending_sync, MULTIFD_SYNC_NONE);
> 
> It's a bit weird that MULTIFD_SYNC_THREADS will never have a use inside
> the thread.

It has a use; it guarantees that any existing queued pending_job is
completed.

> Makes me think it should never exist in the first place.  But
> then we're back into pending_sync + use_packets... looks like it would
> be less convoluted to skip the loop up there and assert(!use_packets) in
> here.
> 
> Unless I'm missing something...
> 
> >                  qemu_sem_post(&p->sem_sync);
> >              }
> >          }
> 

-- 
Peter Xu