On Mon, Jul 28, 2025 at 6:24 PM Jonah Palmer <jonah.pal...@oracle.com> wrote:
>
>
>
> On 7/28/25 11:30 AM, Eugenio Perez Martin wrote:
> > On Tue, Jul 22, 2025 at 2:41 PM Jonah Palmer <jonah.pal...@oracle.com> 
> > wrote:
> >>
> >> Iterative live migration for virtio-net sends an initial
> >> VMStateDescription while the source is still active. Because data
> >> continues to flow for virtio-net, the guest's avail index continues to
> >> increment after last_avail_idx had already been sent. This causes the
> >> destination to often see something like this from virtio_error():
> >>
> >> VQ 0 size 0x100 Guest index 0x0 inconsistent with Host index 0xc: delta 
> >> 0xfff4
> >>
> >> This patch suppresses this consistency check if we're loading the
> >> initial VMStateDescriptions via iterative migration and unsuppresses
> >> it for the stop-and-copy phase when the final VMStateDescriptions
> >> (carrying the correct indices) are loaded.
> >>
> >> A temporary VirtIODevMigration migration data structure is introduced here 
> >> to
> >> represent the iterative migration process for a VirtIODevice. For now it
> >> just holds a flag to indicate whether or not the initial
> >> VMStateDescription was sent during the iterative live migration process.
> >>
> >> Signed-off-by: Jonah Palmer <jonah.pal...@oracle.com>
> >> ---
> >>   hw/net/virtio-net.c        | 13 +++++++++++++
> >>   hw/virtio/virtio.c         | 32 ++++++++++++++++++++++++--------
> >>   include/hw/virtio/virtio.h |  6 ++++++
> >>   3 files changed, 43 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >> index 86a6fe5b91..b7ac5e8278 100644
> >> --- a/hw/net/virtio-net.c
> >> +++ b/hw/net/virtio-net.c
> >> @@ -3843,12 +3843,19 @@ static void virtio_net_save_cleanup(void *opaque)
> >>
> >>   static int virtio_net_load_setup(QEMUFile *f, void *opaque, Error **errp)
> >>   {
> >> +    VirtIONet *n = opaque;
> >> +    VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >> +    vdev->migration = g_new0(VirtIODevMigration, 1);
> >> +    vdev->migration->iterative_vmstate_loaded = false;
> >> +
> >>       return 0;
> >>   }
> >>
> >>   static int virtio_net_load_state(QEMUFile *f, void *opaque, int 
> >> version_id)
> >>   {
> >>       VirtIONet *n = opaque;
> >> +    VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >> +    VirtIODevMigration *mig = vdev->migration;
> >>       uint64_t flag;
> >>
> >>       flag = qemu_get_be64(f);
> >> @@ -3861,6 +3868,7 @@ static int virtio_net_load_state(QEMUFile *f, void 
> >> *opaque, int version_id)
> >>           case VNET_MIG_F_INIT_STATE:
> >>           {
> >>               vmstate_load_state(f, &vmstate_virtio_net, n, 
> >> VIRTIO_NET_VM_VERSION);
> >> +            mig->iterative_vmstate_loaded = true;
> >
> > This code will need to change if we send the status iteratively more
> > than once. For example, if the guest changes the mac address, the
> > number of vqs, etc.
> >
>
> Hopefully we can reach a solution where we'd only need to call the full
> vmstate_load_state(f, &vmstate_virtio_net, ...) for a virtio-net device
> once and then handle any changes afterwards individually.
>
> Perhaps, maybe for simplicity, we could just send the
> sub-states/subsections (instead of the whole state again) iteratively if
> there were any changes in the fields that those sub-states/subsections
> govern.
>
> Definitely something I'll keep in mind as this series develops.
>
> > In my opinion, we should set a flag named "in_iterative_migration" (or
> > equivalent) in virtio_net_load_setup and clear it in
> > virtio_net_load_cleanup. That's enough to tell in virtio_load if we
> > should perform actions like checking for inconsistent indices.
> >
>
> I did actually try something like this but I realized that the
> .load_cleanup and .save_cleanup hooks actually fire at the very end of
> live migration (e.g. during the stop-and-copy phase). I thought they
> fired at the end of the iterative portion of live migration, but this
> didn't appear to be the case.
>

Ok that makes a lot of sense. What about .switchover_start ? We need
the switchover capability though, not sure if it is a good idea to
mandate it as a requirement. So yes, maybe this patch is the most
reliable way to do so.


> >>               break;
> >>           }
> >>           default:
> >> @@ -3875,6 +3883,11 @@ static int virtio_net_load_state(QEMUFile *f, void 
> >> *opaque, int version_id)
> >>
> >>   static int virtio_net_load_cleanup(void *opaque)
> >>   {
> >> +    VirtIONet *n = opaque;
> >> +    VirtIODevice *vdev = VIRTIO_DEVICE(n);
> >> +    g_free(vdev->migration);
> >> +    vdev->migration = NULL;
> >> +
> >>       return 0;
> >>   }
> >>
> >> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> >> index 5534251e01..68957ee7d1 100644
> >> --- a/hw/virtio/virtio.c
> >> +++ b/hw/virtio/virtio.c
> >> @@ -3222,6 +3222,7 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
> >> version_id)
> >>       int32_t config_len;
> >>       uint32_t num;
> >>       uint32_t features;
> >> +    bool inconsistent_indices;
> >>       BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
> >>       VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
> >>       VirtioDeviceClass *vdc = VIRTIO_DEVICE_GET_CLASS(vdev);
> >> @@ -3365,6 +3366,16 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
> >> version_id)
> >>           if (vdev->vq[i].vring.desc) {
> >>               uint16_t nheads;
> >>
> >> +           /*
> >> +            * Ring indices will be inconsistent during iterative 
> >> migration. The actual
> >> +            * indices will be sent later during the stop-and-copy phase.
> >> +            */
> >> +            if (vdev->migration) {
> >> +                inconsistent_indices = 
> >> !vdev->migration->iterative_vmstate_loaded;
> >> +            } else {
> >> +                inconsistent_indices = false;
> >> +            }
> >
> > Nit, "inconsistent_indices = vdev->migration &&
> > !vdev->migration->iterative_vmstate_loaded" ? I'm happy with the
> > current "if else" too, but I think the one line is clearer. Your call
> > :).
> >
>
> Ah, nice catch! I like the one-liner more :) Will change this for next
> series.
>
> >> +
> >>               /*
> >>                * VIRTIO-1 devices migrate desc, used, and avail ring 
> >> addresses so
> >>                * only the region cache needs to be set up.  Legacy devices 
> >> need
> >> @@ -3384,14 +3395,19 @@ virtio_load(VirtIODevice *vdev, QEMUFile *f, int 
> >> version_id)
> >>                   continue;
> >>               }
> >>
> >> -            nheads = vring_avail_idx(&vdev->vq[i]) - 
> >> vdev->vq[i].last_avail_idx;
> >> -            /* Check it isn't doing strange things with descriptor 
> >> numbers. */
> >> -            if (nheads > vdev->vq[i].vring.num) {
> >> -                virtio_error(vdev, "VQ %d size 0x%x Guest index 0x%x "
> >> -                             "inconsistent with Host index 0x%x: delta 
> >> 0x%x",
> >> -                             i, vdev->vq[i].vring.num,
> >> -                             vring_avail_idx(&vdev->vq[i]),
> >> -                             vdev->vq[i].last_avail_idx, nheads);
> >> +            if (!inconsistent_indices) {
> >> +                nheads = vring_avail_idx(&vdev->vq[i]) - 
> >> vdev->vq[i].last_avail_idx;
> >> +                /* Check it isn't doing strange things with descriptor 
> >> numbers. */
> >> +                if (nheads > vdev->vq[i].vring.num) {
> >> +                    virtio_error(vdev, "VQ %d size 0x%x Guest index 0x%x "
> >> +                                 "inconsistent with Host index 0x%x: 
> >> delta 0x%x",
> >> +                                 i, vdev->vq[i].vring.num,
> >> +                                 vring_avail_idx(&vdev->vq[i]),
> >> +                                 vdev->vq[i].last_avail_idx, nheads);
> >> +                    inconsistent_indices = true;
> >> +                }
> >> +            }
> >> +            if (inconsistent_indices) {
> >>                   vdev->vq[i].used_idx = 0;
> >>                   vdev->vq[i].shadow_avail_idx = 0;
> >>                   vdev->vq[i].inuse = 0;
> >> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> >> index 214d4a77e9..06b6e6ba65 100644
> >> --- a/include/hw/virtio/virtio.h
> >> +++ b/include/hw/virtio/virtio.h
> >> @@ -98,6 +98,11 @@ enum virtio_device_endian {
> >>       VIRTIO_DEVICE_ENDIAN_BIG,
> >>   };
> >>
> >> +/* VirtIODevice iterative live migration data structure */
> >> +typedef struct VirtIODevMigration {
> >> +    bool iterative_vmstate_loaded;
> >> +} VirtIODevMigration;
> >> +
> >>   /**
> >>    * struct VirtIODevice - common VirtIO structure
> >>    * @name: name of the device
> >> @@ -151,6 +156,7 @@ struct VirtIODevice
> >>       bool disable_legacy_check;
> >>       bool vhost_started;
> >>       VMChangeStateEntry *vmstate;
> >> +    VirtIODevMigration *migration;
> >>       char *bus_name;
> >>       uint8_t device_endian;
> >>       /**
> >> --
> >> 2.47.1
> >>
> >
>


Reply via email to