> -----Original Message-----
> From: Peter Xu <pet...@redhat.com>
> Sent: Thursday, March 28, 2024 3:26 AM
> To: Liu, Yuan1 <yuan1....@intel.com>
> Cc: Daniel P. Berrangé <berra...@redhat.com>; faro...@suse.de; qemu-
> de...@nongnu.org; hao.xi...@bytedance.com; bryan.zh...@bytedance.com; Zou,
> Nanhai <nanhai....@intel.com>
> Subject: Re: [PATCH v5 5/7] migration/multifd: implement initialization of
> qpl compression
> 
> On Fri, Mar 22, 2024 at 12:40:32PM -0400, Peter Xu wrote:
> > > > void multifd_recv_zero_page_process(MultiFDRecvParams *p)
> > > > {
> > > >     for (int i = 0; i < p->zero_num; i++) {
> > > >         void *page = p->host + p->zero[i];
> > > >         if (!buffer_is_zero(page, p->page_size)) {
> > > >             memset(page, 0, p->page_size);
> > > >         }
> > > >     }
> > > > }
> >
> > It may not matter much (where I also see your below comments), but just
> > to mention another solution to avoid this read is that we can maintain
> > RAMBlock->receivedmap for precopy (especially, multifd, afaiu multifd
> > doesn't yet update this bitmap.. even if normal precopy does), then here
> > instead of scanning every time, maybe we can do:
> >
> >   /*
> >    * If it's the 1st time receiving it, no need to clear it as it must
> >    * be all zeros now.
> >    */
> >   if (bitmap_test(rb->receivedmap, page_offset)) {
> >       memset(page, 0, ...);
> >   } else {
> >       bitmap_set(rb->receivedmap, page_offset);
> >   }
> >
> > And we also always set the bit when !zero too.
> >
> > My rationale is that it's unlikely a zero page if it's sent once or more,
> > while OTOH for the 1st time we receive it, it must be a zero page, so no
> > need to scan for the 1st round.
> 
> Thinking about this, I'm wondering whether we should have this regardless.
> IIUC now multifd will always require two page faults on destination for
> anonymous guest memories (I suppose shmem/hugetlb is fine as no zero page
> in those worlds).  Even though it should be faster than DMA faults, it
> still is unwanted.
> 
> I'll take a note myself as todo to do some measurements in the future
> first.  However if anyone thinks that makes sense and want to have a look,
> please say so.  It'll be more than welcomed.

Yes, I think this is a good improvement to avoid the two page faults. I can
test the performance impact of this change on SVM-capable devices and share
some data later. As we saw before, the IOTLB flush occurs via COW; with this
change, the impact of the COW should be gone.
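For reference, the scheme above could look roughly like the sketch below. This is a self-contained illustration, not QEMU code: the bitmap helpers are simplified stand-ins for QEMU's real bitmap API, and recv_zero_page() is a hypothetical name for the receive-side zero-page handler. The point it shows is that a first-time page is skipped (the kernel already backs it with zeros, so no fault is taken) and only a previously-received page gets the memset.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for QEMU's bitmap helpers (illustrative names). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static bool bitmap_test(const unsigned long *map, unsigned long bit)
{
    return map[bit / BITS_PER_LONG] & (1UL << (bit % BITS_PER_LONG));
}

static void bitmap_set_bit(unsigned long *map, unsigned long bit)
{
    map[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
}

/*
 * Proposed zero-page handling: only memset when the page was received
 * before (its receivedmap bit is already set).  On first receipt the
 * anonymous page is still all zeros, so we skip the write entirely and
 * just mark it received, avoiding the extra page fault.
 */
static void recv_zero_page(unsigned long *receivedmap,
                           unsigned long page_index,
                           void *page, size_t page_size)
{
    if (bitmap_test(receivedmap, page_index)) {
        memset(page, 0, page_size);
    } else {
        bitmap_set_bit(receivedmap, page_index);
    }
}
```

The non-zero receive path would also set the bit unconditionally, so a later zero-page for the same offset takes the memset branch.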

If you need more testing and analysis on this, please let me know.
