On 8/28/2025 11:48 AM, Steven Sistare wrote:
On 8/23/2025 5:53 PM, Vladimir Sementsov-Ogievskiy wrote:
On 17.07.25 21:39, Steve Sistare wrote:
Tap and vhost devices can be preserved during cpr-transfer using
traditional live migration methods, wherein the management layer
creates new interfaces for the target and fiddles with 'ip link'
to deactivate the old interface and activate the new.

However, CPR can simply send the file descriptors to new QEMU,
with no special management actions required.  The user enables
this behavior by specifing '-netdev tap,cpr=on'.  The default
is cpr=off.

Hi Steve!

First, me trying to test the series:

Thank-you Vladimir for all the work you are doing in this area.  I have
reproduced the "virtio_net_set_queue_pairs: Assertion `!r' failed." bug.
Let me dig into that before I study the larger questions you pose
about preserving tap/vhost-user-blk in local migration versus cpr.

I have reproduced your journey!  I fixed the assertion, the vnet_hdr, and
the blocking fd problems which you allude to.  The attached patch fixes
them, and will be squashed into the series.

Ben, you also reported the !r assertion failure, so this fix should help
you also.

SOURCE:

sudo build/qemu-system-x86_64 -display none -vga none -device 
pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device 
pcie-root-port,id=s0,slot=0,bus=pcie.1 -device 
pcie-root-port,id=s1,slot=1,bus=pcie.1 -device 
pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda 
/home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :0 
-nodefaults -vga std -qmp stdio -msg timestamp -S -object 
memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -machine 
memory-backend=ram0 -machine aux-ram-share=on

{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4, 
"ifname": "tap0", "type": "tap", "id": "netdev.1"}}
{"return": {}}
{"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18, 
"mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}}
{"return": {}}
{"execute": "cont"}
{"timestamp": {"seconds": 1755977653, "microseconds": 248749}, "event": 
"RESUME"}
{"return": {}}
{"timestamp": {"seconds": 1755977657, "microseconds": 366274}, "event": "NIC_RX_FILTER_CHANGED", "data": 
{"name": "vnet.1", "path": "/machine/peripheral/vnet.1/virtio-backend"}}
{"execute": "migrate-set-parameters", "arguments": {"mode": "cpr-transfer"}}
{"return": {}}
{"execute": "migrate", "arguments": {"channels": [{"channel-type": "main", "addr": {"path": "/tmp/migr.sock", "transport": "socket", "type": "unix"}}, 
{"channel-type": "cpr", "addr": {"path": "/tmp/cpr.sock", "transport": "socket", "type": "unix"}}]}}
{"timestamp": {"seconds": 1755977767, "microseconds": 835571}, "event": "STOP"}
{"return": {}}

TARGET:

sudo build/qemu-system-x86_64 -display none -vga none -device 
pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device 
pcie-root-port,id=s0,slot=0,bus=pcie.1 -device 
pcie-root-port,id=s1,slot=1,bus=pcie.1 -device 
pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda 
/home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :1 
-nodefaults -vga std -qmp stdio -S -object 
memory-backend-file,id=ram0,size=4G,mem-p
ath=/dev/shm/ram0,share=on -machine memory-backend=ram0 -machine aux-ram-share=on -incoming defer -incoming '{"channel-type": "cpr","addr": 
{ "transport": "socket","type": "unix", "path": "/tmp/cpr.sock"}}'

<need to wait until "migrate" on source>

{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4, 
"ifname": "tap0", "type": "tap", "id": "netdev.1"}}
{"return": {}}
{"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18, 
"mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}}
could not disable queue
qemu-system-x86_64: ../hw/net/virtio-net.c:771: virtio_net_set_queue_pairs: 
Assertion `!r' failed.
fish: Job 1, 'sudo build/qemu-system-x86_64 -…' terminated by signal SIGABRT 
(Abort)

So, it crashes on device_add..

Second, I've come a long way, backporting you TAP v1 series together with 
needed parts of CPR and migration channels to QEMU 7.2, fixing different issues 
(like, avoid reinitialization of vnet_hdr length on target, avoid simultaneous 
use of tap on source an target, avoid making the fd blocking again on target), 
and it finally started to work.

But next, I went to support similar migration for vhost-user-blk, and that was 
a lot more complex. No reason to pass an fd in preliminary stage, when source 
is running (like in CPR), because:

1. we just can't use the fd on target at all, until we stop use it on source, 
otherwise we just break vhost-user-blk protocol on the wire (unlike TAP, where 
some ioctls called on target doesn't break source)
2. we have to pass enough additional variables, which are simpler to pass 
through normal migration channel (how to pass anything except fds through cpr 
channel?)

You can pass extra state through the cpr channel.  See for example 
vmstate_cpr_vfio_device,
and how vmstate_cpr_vfio_devices is defined as a sub-section of 
vmstate_cpr_state.

So, I decided to go another way, and just migrate everything backend-related including 
fds through main migration channel. Of course, this requires deep reworking of device 
initialization in case of incoming migration (but for vhost-user-blk we need it anyway). 
The feature is in my series "[PATCH 00/33] vhost-user-blk: live-backend local 
migration" (you are in CC).

You did a lot of work in those series!
I suspect much less rework of initialization is required if you pass variables 
in cpr state.

The success with vhost-user-blk (of-course) make me rethink TAP migration too: try to 
avoid using additional cpr channel and unusual waiting for QMP interface on target. And, 
I've just sent an RFC: "[RFC 0/7] virtio-net: live-TAP local migration"

Is there a use case for this outside of CPR?
CPR is intended to be the "local migration" solution that does it all :)
But if you do proceed with your local migration tap solution, I would want
to see that CPR could also use your code paths.

- Steve
From 191521210222638940c17d4daefc313d4ad61aa3 Mon Sep 17 00:00:00 2001
From: Steve Sistare <steven.sist...@oracle.com>
Date: Thu, 28 Aug 2025 13:47:15 -0700
Subject: [PATCH] tap: cpr fixes

Fix "virtio_net_set_queue_pairs: Assertion `!r' failed."
Fix "virtio-net: saved image requires vnet_hdr=on"
Do not change blocking mode of incoming cpr fd's.

Reported-by: Ben Chaney <bcha...@akamai.com>
Reported-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru>

Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
---
 hw/net/virtio-net.c | 6 ++++++
 io/channel-socket.c | 5 ++++-
 net/tap.c           | 2 ++
 stubs/cpr.c         | 5 +++++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 7c40cca..7dd8a80 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -37,6 +37,7 @@
 #include "qapi/qapi-types-migration.h"
 #include "qapi/qapi-events-migration.h"
 #include "hw/virtio/virtio-access.h"
+#include "migration/cpr.h"
 #include "migration/misc.h"
 #include "standard-headers/linux/ethtool.h"
 #include "system/system.h"
@@ -758,6 +759,11 @@ static void virtio_net_set_queue_pairs(VirtIONet *n)
     int i;
     int r;
 
+    if (cpr_is_incoming()) {
+        /* peers are already attached, do nothing */
+        return;
+    }
+
     if (n->nic->peer_deleted) {
         return;
     }
diff --git a/io/channel-socket.c b/io/channel-socket.c
index 3b7ca92..736d39d 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -24,6 +24,7 @@
 #include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "io/channel-watch.h"
+#include "migration/cpr.h"
 #include "trace.h"
 #include "qapi/clone-visitor.h"
 #ifdef CONFIG_LINUX
@@ -498,7 +499,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg,
             }
 
             /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */
-            qemu_socket_set_block(fd);
+            if (!cpr_is_incoming()) {
+                qemu_socket_set_block(fd);
+            }
 
 #ifndef MSG_CMSG_CLOEXEC
             qemu_set_cloexec(fd);
diff --git a/net/tap.c b/net/tap.c
index 25ff96f..f95ffe9 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -1042,6 +1042,8 @@ free_fail:
                 if (cpr && fd >= 0) {
                     cpr_save_fd(name, TAP_FD_INDEX(i), fd);
                 }
+            } else {
+                vnet_hdr = tap->has_vnet_hdr ? tap->vnet_hdr : 1;
             }
             if (fd == -1) {
                 ret = -1;
diff --git a/stubs/cpr.c b/stubs/cpr.c
index 6a5c320..86a507c 100644
--- a/stubs/cpr.c
+++ b/stubs/cpr.c
@@ -19,3 +19,8 @@ int cpr_find_fd(const char *name, int id)
 {
     return -1;
 }
+
+bool cpr_is_incoming(void)
+{
+    return false;
+}
-- 
1.8.3.1

Reply via email to