On Tue, Aug 05, 2025 at 03:53:09PM -0400, Steven Sistare wrote:
> On 8/5/2025 9:54 AM, Fabiano Rosas wrote:
> > Steve Sistare <steven.sist...@oracle.com> writes:
> > 
> > > Tap and vhost devices can be preserved during cpr-transfer using
> > > traditional live migration methods, wherein the management layer
> > > creates new interfaces for the target and fiddles with 'ip link'
> > > to deactivate the old interface and activate the new.
> > > 
> > > However, CPR can simply send the file descriptors to new QEMU,
> > > with no special management actions required.  The user enables
> > > this behavior by specifying '-netdev tap,cpr=on'.  The default
> > > is cpr=off.
> > > 
> > > Steve Sistare (8):
> > >    migration: stop vm earlier for cpr
> > >    migration: cpr setup notifier
> > >    vhost: reset vhost devices for cpr
> > >    cpr: delete all fds
> > >    Revert "vhost-backend: remove vhost_kernel_reset_device()"
> > >    tap: common return label
> > >    tap: cpr support
> > >    tap: postload fix for cpr
> > > 
> > >   qapi/net.json             |   5 +-
> > >   include/hw/virtio/vhost.h |   1 +
> > >   include/migration/cpr.h   |   3 +-
> > >   include/net/tap.h         |   1 +
> > >   hw/net/virtio-net.c       |  20 +++++++
> > >   hw/vfio/device.c          |   2 +-
> > >   hw/virtio/vhost-backend.c |   6 ++
> > >   hw/virtio/vhost.c         |  32 +++++++++++
> > >   migration/cpr.c           |  24 ++++++--
> > >   migration/migration.c     |  38 ++++++++-----
> > >   net/tap-win32.c           |   5 ++
> > >   net/tap.c                 | 141 +++++++++++++++++++++++++++++++++++-----------
> > >   12 files changed, 223 insertions(+), 55 deletions(-)
> > 
> > Hi Steve,
> > 
> > Patches 1-2 seem to potentially interact with your arm pending
> > interrupts fix. Do we want them together?
> 
> Good observation, thanks!  I may need patches 1-2 to completely close
> the dropped interrupt race.  I will do more testing to verify that.

Sorry for the late response.  Could I request in-code comments (for each of
patches 1 & 2) explaining the order of events?

For example, patch 1 moves stop_vm even earlier for CPR.  It was already
early because it needs to avoid dirty tracking: something I hadn't realized
but remembered after re-reading the doc.  Now it also needs to avoid the
notifiers.  A comment above stop_vm for CPR explaining this order of events
would be really helpful (including any necessary doc update).

-- 
Peter Xu

