On Tue, Aug 05, 2025 at 03:53:09PM -0400, Steven Sistare wrote:
> On 8/5/2025 9:54 AM, Fabiano Rosas wrote:
> > Steve Sistare <steven.sist...@oracle.com> writes:
> >
> > > Tap and vhost devices can be preserved during cpr-transfer using
> > > traditional live migration methods, wherein the management layer
> > > creates new interfaces for the target and fiddles with 'ip link'
> > > to deactivate the old interface and activate the new.
> > >
> > > However, CPR can simply send the file descriptors to new QEMU,
> > > with no special management actions required. The user enables
> > > this behavior by specifying '-netdev tap,cpr=on'. The default
> > > is cpr=off.
> > >
> > > Steve Sistare (8):
> > >   migration: stop vm earlier for cpr
> > >   migration: cpr setup notifier
> > >   vhost: reset vhost devices for cpr
> > >   cpr: delete all fds
> > >   Revert "vhost-backend: remove vhost_kernel_reset_device()"
> > >   tap: common return label
> > >   tap: cpr support
> > >   tap: postload fix for cpr
> > >
> > >  qapi/net.json             |   5 +-
> > >  include/hw/virtio/vhost.h |   1 +
> > >  include/migration/cpr.h   |   3 +-
> > >  include/net/tap.h         |   1 +
> > >  hw/net/virtio-net.c       |  20 +++++++
> > >  hw/vfio/device.c          |   2 +-
> > >  hw/virtio/vhost-backend.c |   6 ++
> > >  hw/virtio/vhost.c         |  32 +++++++++++
> > >  migration/cpr.c           |  24 ++++++--
> > >  migration/migration.c     |  38 ++++++++-----
> > >  net/tap-win32.c           |   5 ++
> > >  net/tap.c                 | 141 +++++++++++++++++++++++++++++++++++----------
> > >  12 files changed, 223 insertions(+), 55 deletions(-)
> >
> > Hi Steve,
> >
> > Patches 1-2 seem to potentially interact with your arm pending
> > interrupts fix. Do we want them together?
>
> Good observation, thanks! I may need patches 1-2 to completely close
> the dropped interrupt race. I will do more testing to verify that.

Sorry for the late reply.

Could I request (for each of patches 1 & 2) in-code comments explaining
the order of events?

For example, patch 1 moves stop_vm even earlier for CPR. It was already
early because it wants to avoid dirty tracking; I had not realized this,
but remembered it after re-reading the doc. Now it additionally needs to
avoid the notifiers. A comment above stop_vm for cpr explaining this
order of events would be really helpful (including any necessary doc
update).

--
Peter Xu