On 8/28/2025 11:48 AM, Steven Sistare wrote:
On 8/23/2025 5:53 PM, Vladimir Sementsov-Ogievskiy wrote:On 17.07.25 21:39, Steve Sistare wrote:Tap and vhost devices can be preserved during cpr-transfer using traditional live migration methods, wherein the management layer creates new interfaces for the target and fiddles with 'ip link' to deactivate the old interface and activate the new.However, CPR can simply send the file descriptors to new QEMU, with no special management actions required. The user enables this behavior by specifing '-netdev tap,cpr=on'. The default is cpr=off.Hi Steve! First, me trying to test the series:Thank-you Vladimir for all the work you are doing in this area. I have reproduced the "virtio_net_set_queue_pairs: Assertion `!r' failed." bug. Let me dig into that before I study the larger questions you pose about preserving tap/vhost-user-blk in local migration versus cpr.
I have reproduced your journey! I fixed the assertion, the vnet_hdr, and the blocking fd problems which you allude to. The attached patch fixes them, and will be squashed into the series. Ben, you also reported the !r assertion failure, so this fix should help you also.
SOURCE: sudo build/qemu-system-x86_64 -display none -vga none -device pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device pcie-root-port,id=s0,slot=0,bus=pcie.1 -device pcie-root-port,id=s1,slot=1,bus=pcie.1 -device pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda /home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :0 -nodefaults -vga std -qmp stdio -msg timestamp -S -object memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -machine memory-backend=ram0 -machine aux-ram-share=on {"execute": "qmp_capabilities"} {"return": {}} {"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4, "ifname": "tap0", "type": "tap", "id": "netdev.1"}} {"return": {}} {"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18, "mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}} {"return": {}} {"execute": "cont"} {"timestamp": {"seconds": 1755977653, "microseconds": 248749}, "event": "RESUME"} {"return": {}} {"timestamp": {"seconds": 1755977657, "microseconds": 366274}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "vnet.1", "path": "/machine/peripheral/vnet.1/virtio-backend"}} {"execute": "migrate-set-parameters", "arguments": {"mode": "cpr-transfer"}} {"return": {}} {"execute": "migrate", "arguments": {"channels": [{"channel-type": "main", "addr": {"path": "/tmp/migr.sock", "transport": "socket", "type": "unix"}}, {"channel-type": "cpr", "addr": {"path": "/tmp/cpr.sock", "transport": "socket", "type": "unix"}}]}} {"timestamp": {"seconds": 1755977767, "microseconds": 835571}, "event": "STOP"} {"return": {}} TARGET: sudo build/qemu-system-x86_64 -display none -vga none -device pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device pcie-root-port,id=s0,slot=0,bus=pcie.1 -device pcie-root-port,id=s1,slot=1,bus=pcie.1 -device pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda /home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :1 -nodefaults -vga std -qmp stdio -S -object memory-backend-file,id=ram0,size=4G,mem-p ath=/dev/shm/ram0,share=on -machine memory-backend=ram0 -machine aux-ram-share=on -incoming defer -incoming '{"channel-type": "cpr","addr": { "transport": "socket","type": "unix", "path": "/tmp/cpr.sock"}}' <need to wait until "migrate" on source> {"execute": "qmp_capabilities"} {"return": {}} {"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4, "ifname": "tap0", "type": "tap", "id": "netdev.1"}} {"return": {}} {"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18, "mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}} could not disable queue qemu-system-x86_64: ../hw/net/virtio-net.c:771: virtio_net_set_queue_pairs: Assertion `!r' failed. fish: Job 1, 'sudo build/qemu-system-x86_64 -…' terminated by signal SIGABRT (Abort) So, it crashes on device_add.. Second, I've come a long way, backporting you TAP v1 series together with needed parts of CPR and migration channels to QEMU 7.2, fixing different issues (like, avoid reinitialization of vnet_hdr length on target, avoid simultaneous use of tap on source an target, avoid making the fd blocking again on target), and it finally started to work. But next, I went to support similar migration for vhost-user-blk, and that was a lot more complex. No reason to pass an fd in preliminary stage, when source is running (like in CPR), because: 1. we just can't use the fd on target at all, until we stop use it on source, otherwise we just break vhost-user-blk protocol on the wire (unlike TAP, where some ioctls called on target doesn't break source) 2. we have to pass enough additional variables, which are simpler to pass through normal migration channel (how to pass anything except fds through cpr channel?)
You can pass extra state through the cpr channel. See for example vmstate_cpr_vfio_device, and how vmstate_cpr_vfio_devices is defined as a sub-section of vmstate_cpr_state.
So, I decided to go another way, and just migrate everything backend-related including fds through main migration channel. Of course, this requires deep reworking of device initialization in case of incoming migration (but for vhost-user-blk we need it anyway). The feature is in my series "[PATCH 00/33] vhost-user-blk: live-backend local migration" (you are in CC).
You did a lot of work in those series! I suspect much less rework of initialization is required if you pass variables in cpr state.
The success with vhost-user-blk (of-course) make me rethink TAP migration too: try to avoid using additional cpr channel and unusual waiting for QMP interface on target. And, I've just sent an RFC: "[RFC 0/7] virtio-net: live-TAP local migration"
Is there a use case for this outside of CPR? CPR is intended to be the "local migration" solution that does it all :) But if you do proceed with your local migration tap solution, I would want to see that CPR could also use your code paths. - Steve
From 191521210222638940c17d4daefc313d4ad61aa3 Mon Sep 17 00:00:00 2001 From: Steve Sistare <steven.sist...@oracle.com> Date: Thu, 28 Aug 2025 13:47:15 -0700 Subject: [PATCH] tap: cpr fixes Fix "virtio_net_set_queue_pairs: Assertion `!r' failed." Fix "virtio-net: saved image requires vnet_hdr=on" Do not change blocking mode of incoming cpr fd's. Reported-by: Ben Chaney <bcha...@akamai.com> Reported-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> Signed-off-by: Steve Sistare <steven.sist...@oracle.com> --- hw/net/virtio-net.c | 6 ++++++ io/channel-socket.c | 5 ++++- net/tap.c | 2 ++ stubs/cpr.c | 5 +++++ 4 files changed, 17 insertions(+), 1 deletion(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 7c40cca..7dd8a80 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -37,6 +37,7 @@ #include "qapi/qapi-types-migration.h" #include "qapi/qapi-events-migration.h" #include "hw/virtio/virtio-access.h" +#include "migration/cpr.h" #include "migration/misc.h" #include "standard-headers/linux/ethtool.h" #include "system/system.h" @@ -758,6 +759,11 @@ static void virtio_net_set_queue_pairs(VirtIONet *n) int i; int r; + if (cpr_is_incoming()) { + /* peers are already attached, do nothing */ + return; + } + if (n->nic->peer_deleted) { return; } diff --git a/io/channel-socket.c b/io/channel-socket.c index 3b7ca92..736d39d 100644 --- a/io/channel-socket.c +++ b/io/channel-socket.c @@ -24,6 +24,7 @@ #include "io/channel-socket.h" #include "io/channel-util.h" #include "io/channel-watch.h" +#include "migration/cpr.h" #include "trace.h" #include "qapi/clone-visitor.h" #ifdef CONFIG_LINUX @@ -498,7 +499,9 @@ static void qio_channel_socket_copy_fds(struct msghdr *msg, } /* O_NONBLOCK is preserved across SCM_RIGHTS so reset it */ - qemu_socket_set_block(fd); + if (!cpr_is_incoming()) { + qemu_socket_set_block(fd); + } #ifndef MSG_CMSG_CLOEXEC qemu_set_cloexec(fd); diff --git a/net/tap.c b/net/tap.c index 25ff96f..f95ffe9 100644 --- a/net/tap.c +++ b/net/tap.c @@ -1042,6 +1042,8 @@ free_fail: if (cpr && fd >= 0) { cpr_save_fd(name, TAP_FD_INDEX(i), fd); } + } else { + vnet_hdr = tap->has_vnet_hdr ? tap->vnet_hdr : 1; } if (fd == -1) { ret = -1; diff --git a/stubs/cpr.c b/stubs/cpr.c index 6a5c320..86a507c 100644 --- a/stubs/cpr.c +++ b/stubs/cpr.c @@ -19,3 +19,8 @@ int cpr_find_fd(const char *name, int id) { return -1; } + +bool cpr_is_incoming(void) +{ + return false; +} -- 1.8.3.1