On Fri, Feb 02, 2024 at 05:47:05PM -0300, Fabiano Rosas wrote:
> pet...@redhat.com writes:
> 
> > From: Peter Xu <pet...@redhat.com>
> >
> > The current multifd_queue_page() is not easy to read and follow.  It is not
> > good for a few reasons:
> >
> >   - No helper at all to show what exactly a condition means; in short,
> >   readability is low.
> >
> >   - Relies on pages->block being cleared to detect an empty queue.  It's
> >   slightly an overload of the ramblock pointer, per Fabiano [1], with which
> >   I also agree.
> >
> >   - Contains a self recursion, even though it is not necessary.
> >
> > Rewrite this function.  We add some comments to make it even clearer
> > what it does.
> >
> > [1] https://lore.kernel.org/r/87wmrpjzew....@suse.de
> >
> > Signed-off-by: Peter Xu <pet...@redhat.com>
> 
> Reviewed-by: Fabiano Rosas <faro...@suse.de>
> 
> Patch looks good, but I have a question below.
> 
> > ---
> >  migration/multifd.c | 56 ++++++++++++++++++++++++++++++---------------
> >  1 file changed, 37 insertions(+), 19 deletions(-)
> >
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 35d4e8ad1f..4ab8e6eff2 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -506,35 +506,53 @@ static bool multifd_send_pages(void)
> >      return true;
> >  }
> >  
> > +static inline bool multifd_queue_empty(MultiFDPages_t *pages)
> > +{
> > +    return pages->num == 0;
> > +}
> > +
> > +static inline bool multifd_queue_full(MultiFDPages_t *pages)
> > +{
> > +    return pages->num == pages->allocated;
> > +}
> > +
> > +static inline void multifd_enqueue(MultiFDPages_t *pages, ram_addr_t offset)
> > +{
> > +    pages->offset[pages->num++] = offset;
> > +}
> > +
> >  /* Returns true if enqueue successful, false otherwise */
> >  bool multifd_queue_page(RAMBlock *block, ram_addr_t offset)
> >  {
> > -    MultiFDPages_t *pages = multifd_send_state->pages;
> > -    bool changed = false;
> > +    MultiFDPages_t *pages;
> > +
> > +retry:
> > +    pages = multifd_send_state->pages;
> >  
> > -    if (!pages->block) {
> > +    /* If the queue is empty, we can already enqueue now */
> > +    if (multifd_queue_empty(pages)) {
> >          pages->block = block;
> > +        multifd_enqueue(pages, offset);
> > +        return true;
> >      }
> >  
> > -    if (pages->block == block) {
> > -        pages->offset[pages->num] = offset;
> > -        pages->num++;
> > -
> > -        if (pages->num < pages->allocated) {
> > -            return true;
> > +    /*
> > +     * Not empty, but we need a flush.  That can be because of either:
> > +     *
> > +     * (1) The page is not on the same ramblock as the previous ones, or,
> > +     * (2) The queue is full.
> > +     *
> > +     * After flush, always retry.
> > +     */
> > +    if (pages->block != block || multifd_queue_full(pages)) {
> > +        if (!multifd_send_pages()) {
> > +            return false;
> >          }
> > -    } else {
> > -        changed = true;
> > -    }
> > -
> > -    if (!multifd_send_pages()) {
> > -        return false;
> > -    }
> > -
> > -    if (changed) {
> > -        return multifd_queue_page(block, offset);
> > +        goto retry;
> >      }
> >  
> > +    /* Not empty, and we still have space, do it! */
> > +    multifd_enqueue(pages, offset);
> 
> Hm, here you're missing the flush of the last group of pages of the last
> ramblock. Just like the current code...
> 
> ...which means we're relying on the multifd_send_pages() at
> multifd_send_sync_main() to send the last few pages. So how can that
> work when multifd_flush_after_each_section==false? Because it skips the
> sync flag, but would also skip the last send. I'm confused.

IIUC it won't skip the final flush of the last pages.  See
find_dirty_block():

            if (migrate_multifd() &&
                !migrate_multifd_flush_after_each_section()) {
                QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                int ret = multifd_send_sync_main();
                if (ret < 0) {
                    return ret;
                }
                qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
                qemu_fflush(f);
            }

IMHO this is the final flush of the pages, done when we loop one more
round.

Maybe what you're talking about is this one (in ram_save_complete())?

    if (migrate_multifd() && !migrate_multifd_flush_after_each_section()) {
        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
    }

I remember we talked about this somewhere in your "file" series, but
AFAIU this last RAM_SAVE_FLAG_MULTIFD_FLUSH might be redundant; it just
needs some justification, and a double check that I didn't miss something.

Now multifd_queue_page() is kind of lazy on flushing.  I think that may
make some sense (we don't assign a job until required, so maybe there's a
higher chance that one thread is free?), but I'm not sure whether that's
a huge deal if the NIC bandwidth is the bottleneck, because in that case
we'll wait for the sender threads anyway, and they should all be busy at
any time.
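
Just to illustrate the alternative being discussed, below is a rough
sketch of what an eager-flush variant could look like.  It is purely
hypothetical, not part of this patch; it reuses the new helpers, and the
name multifd_queue_page_eager() is made up:

/*
 * Hypothetical eager-flush variant, for comparison only: kick a sender
 * as soon as the batch becomes full, instead of waiting for the next
 * enqueue to notice the full queue.
 */
bool multifd_queue_page_eager(RAMBlock *block, ram_addr_t offset)
{
    MultiFDPages_t *pages = multifd_send_state->pages;

    /* Flush first if the new page cannot join the current batch */
    if (!multifd_queue_empty(pages) && pages->block != block) {
        if (!multifd_send_pages()) {
            return false;
        }
        /* multifd_send_pages() swapped in a fresh, empty queue */
        pages = multifd_send_state->pages;
    }

    if (multifd_queue_empty(pages)) {
        pages->block = block;
    }
    multifd_enqueue(pages, offset);

    /* Eager part: flush immediately once the batch is full */
    if (multifd_queue_full(pages)) {
        return multifd_send_pages();
    }

    return true;
}

The difference is only in when multifd_send_pages() gets kicked, not in
what ends up on the wire.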

However, even if we flush immediately as soon as the queue is full, we'd
still better check that the queue is empty before migration completes, to
make sure nothing is left behind.
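
Something like the below is all I mean by that check (a minimal sketch
under the same assumptions as above; the function name and where it would
be called from are hypothetical):

/*
 * Hypothetical final drain, e.g. called from the completion path before
 * the last sync: make sure nothing is still queued but unsent.
 */
static int multifd_flush_pending_pages(void)
{
    MultiFDPages_t *pages = multifd_send_state->pages;

    if (!multifd_queue_empty(pages) && !multifd_send_pages()) {
        return -1;
    }

    return 0;
}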

-- 
Peter Xu

