> -----Original Message----- > From: John Johnson <john.g.john...@oracle.com> > Sent: 14 February 2022 18:50 > To: Thanos Makatos <thanos.maka...@nutanix.com> > Cc: qemu-devel@nongnu.org > Subject: Re: [RFC v4 20/21] vfio-user: migration support > > > > > On Feb 11, 2022, at 5:31 AM, Thanos Makatos > <thanos.maka...@nutanix.com> wrote: > > > > > >> -----Original Message----- > >> From: Qemu-devel <qemu-devel- > >> bounces+thanos.makatos=nutanix....@nongnu.org> On Behalf Of John > >> Johnson > >> Sent: 12 January 2022 00:44 > >> To: qemu-devel@nongnu.org > >> Subject: [RFC v4 20/21] vfio-user: migration support > >> > >> > >> +static int vfio_user_dirty_bitmap(VFIOProxy *proxy, > >> + struct vfio_iommu_type1_dirty_bitmap > >> *cmd, > >> + struct vfio_iommu_type1_dirty_bitmap_get > >> + *dbitmap) > >> +{ > >> + g_autofree struct { > >> + VFIOUserDirtyPages msg; > >> + VFIOUserBitmapRange range; > >> + } *msgp = NULL; > >> + int msize, rsize; > >> + > >> + /* > >> + * If just the command is sent, the returned bitmap isn't needed. > >> + * The bitmap structs are different from the ioctl() versions, > >> + * ioctl() returns the bitmap in a local VA > >> + */ > >> + if (dbitmap != NULL) { > >> + msize = sizeof(*msgp); > >> + rsize = msize + dbitmap->bitmap.size; > >> + msgp = g_malloc0(rsize); > >> + msgp->range.iova = dbitmap->iova; > >> + msgp->range.size = dbitmap->size; > >> + msgp->range.bitmap.pgsize = dbitmap->bitmap.pgsize; > >> + msgp->range.bitmap.size = dbitmap->bitmap.size; > >> + } else { > >> + msize = rsize = sizeof(VFIOUserDirtyPages); > >> + msgp = g_malloc0(rsize); > >> + } > >> + > >> + vfio_user_request_msg(&msgp->msg.hdr, VFIO_USER_DIRTY_PAGES, > msize, > >> 0); > >> + msgp->msg.argsz = rsize - sizeof(VFIOUserHdr); > >> + msgp->msg.flags = cmd->flags; > >> + > >> + vfio_user_send_wait(proxy, &msgp->msg.hdr, NULL, rsize, false); > >> + if (msgp->msg.hdr.flags & VFIO_USER_ERROR) { > >> + return -msgp->msg.hdr.error_reply; > >> + } > > > > We need to check argsz in the response, in which case the client needs to > > retry > with a larger argsz. > > > > The client needs to retry if the server reply can be larger than the > client > expects, such as GET_REGION_INFO, where the client doesn’t know how many > capabilities > will be returned a priori. > > In this case, argsz is derived from dbitmap->bitmap.size, which was set > in > vfio_get_dirty_bitmap() to be large enough to cover the entire range: > > pages = REAL_HOST_PAGE_ALIGN(range->size) / qemu_real_host_page_size; > range->bitmap.size = ROUND_UP(pages, sizeof(__u64) * BITS_PER_BYTE) / > BITS_PER_BYTE; > > so I’m not sure we’d ever see a case where the reply is larger than expected.
The client is doing the right thing, and most likely at this stage a different argsz would mean a server bug. I did actually run into this case because of a server bug, should we at least check and report and error if it's not what we expect?