On 17.07.25 21:39, Steve Sistare wrote:
Tap and vhost devices can be preserved during cpr-transfer using
traditional live migration methods, wherein the management layer
creates new interfaces for the target and fiddles with 'ip link'
to deactivate the old interface and activate the new.
However, CPR can simply send the file descriptors to new QEMU,
with no special management actions required. The user enables
this behavior by specifying '-netdev tap,cpr=on'. The default
is cpr=off.
Hi Steve!
First, I tried to test the series:
SOURCE:
sudo build/qemu-system-x86_64 -display none -vga none -device
pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device
pcie-root-port,id=s0,slot=0,bus=pcie.1 -device
pcie-root-port,id=s1,slot=1,bus=pcie.1 -device
pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda
/home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :0
-nodefaults -vga std -qmp stdio -msg timestamp -S -object
memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -machine
memory-backend=ram0 -machine aux-ram-share=on
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4,
"ifname": "tap0", "type": "tap", "id": "netdev.1"}}
{"return": {}}
{"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18,
"mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}}
{"return": {}}
{"execute": "cont"}
{"timestamp": {"seconds": 1755977653, "microseconds": 248749}, "event":
"RESUME"}
{"return": {}}
{"timestamp": {"seconds": 1755977657, "microseconds": 366274}, "event": "NIC_RX_FILTER_CHANGED", "data":
{"name": "vnet.1", "path": "/machine/peripheral/vnet.1/virtio-backend"}}
{"execute": "migrate-set-parameters", "arguments": {"mode": "cpr-transfer"}}
{"return": {}}
{"execute": "migrate", "arguments": {"channels": [{"channel-type": "main", "addr": {"path": "/tmp/migr.sock", "transport": "socket", "type": "unix"}},
{"channel-type": "cpr", "addr": {"path": "/tmp/cpr.sock", "transport": "socket", "type": "unix"}}]}}
{"timestamp": {"seconds": 1755977767, "microseconds": 835571}, "event": "STOP"}
{"return": {}}
TARGET:
sudo build/qemu-system-x86_64 -display none -vga none -device
pxb-pcie,bus_nr=128,bus=pcie.0,id=pcie.1 -device
pcie-root-port,id=s0,slot=0,bus=pcie.1 -device
pcie-root-port,id=s1,slot=1,bus=pcie.1 -device
pcie-root-port,id=s2,slot=2,bus=pcie.1 -hda
/home/vsementsov/work/vms/newfocal.raw -m 4G -enable-kvm -M q35 -vnc :1
-nodefaults -vga std -qmp stdio -S -object
memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on -machine
memory-backend=ram0 -machine aux-ram-share=on -incoming defer -incoming '{"channel-type": "cpr","addr":
{ "transport": "socket","type": "unix", "path": "/tmp/cpr.sock"}}'
<need to wait until "migrate" on source>
{"execute": "qmp_capabilities"}
{"return": {}}
{"execute": "netdev_add", "arguments": {"cpr": true, "script": "no", "downscript": "no", "vhostforce": false, "vhost": false, "queues": 4,
"ifname": "tap0", "type": "tap", "id": "netdev.1"}}
{"return": {}}
{"execute": "device_add", "arguments": {"disable-legacy": "off", "bus": "s1", "netdev": "netdev.1", "driver": "virtio-net-pci", "vectors": 18,
"mq": true, "romfile": "", "mac": "d6:0d:75:f8:0f:b7", "id": "vnet.1"}}
could not disable queue
qemu-system-x86_64: ../hw/net/virtio-net.c:771: virtio_net_set_queue_pairs:
Assertion `!r' failed.
fish: Job 1, 'sudo build/qemu-system-x86_64 -…' terminated by signal SIGABRT
(Abort)
So, it crashes on device_add.
Second, I've come a long way backporting your TAP v1 series, together with the
needed parts of CPR and migration channels, to QEMU 7.2, fixing various issues
(such as avoiding reinitialization of the vnet_hdr length on the target, avoiding
simultaneous use of the tap on source and target, and avoiding making the fd
blocking again on the target), and it finally started to work.
But next, I went on to support similar migration for vhost-user-blk, and that was
a lot more complex. There is no reason to pass an fd at the preliminary stage, while
the source is still running (as in CPR), because:
1. we can't use the fd on the target at all until we stop using it on the source,
otherwise we break the vhost-user-blk protocol on the wire (unlike TAP, where
some ioctls called on the target don't break the source)
2. we have to pass a number of additional variables, which are simpler to pass
through the normal migration channel (how would we pass anything except fds
through the cpr channel?)
So, I decided to go another way and migrate everything backend-related, including
fds, through the main migration channel. Of course, this requires deep reworking of
device initialization for the incoming-migration case (but for vhost-user-blk we need
it anyway). The feature is in my series "[PATCH 00/33] vhost-user-blk: live-backend local
migration" (you are in CC).
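
To illustrate the general idea only (a toy sketch, not QEMU code and not taken
from either series): over a unix socket, an fd and the ordinary state that
belongs to it can travel in the same message via SCM_RIGHTS, so the receiver
gets both together. The function names, the pipe standing in for a tap or
vhost-user fd, and the state values below are all assumptions made up for the
example.

```python
import json
import os
import socket


def send_backend_state(chan, fd, state):
    """Send one fd plus JSON-encoded state in a single message (SCM_RIGHTS)."""
    payload = json.dumps(state).encode()
    socket.send_fds(chan, [payload], [fd])


def recv_backend_state(chan, bufsize=4096):
    """Receive the state and the duplicated fd sent by send_backend_state()."""
    payload, fds, _flags, _addr = socket.recv_fds(chan, bufsize, 1)
    return json.loads(payload), fds[0]


def demo():
    # socketpair stands in for the migration channel between source and target.
    src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()  # hypothetical stand-in for a tap or vhost-user fd
    try:
        # Illustrative values only; real backend state would be richer.
        send_backend_state(src, w, {"vnet_hdr_len": 12, "queues": 4})
        state, new_w = recv_backend_state(dst)
        # The received fd refers to the same open file as the sender's fd.
        os.write(new_w, b"ping")
        data = os.read(r, 4)
        os.close(new_w)
        return state, data
    finally:
        os.close(r)
        os.close(w)
        src.close()
        dst.close()
```

This needs Python 3.9 or later for socket.send_fds() and socket.recv_fds(). The
point is just that the fd and the variables arrive at the same moment, which is
what makes a single channel convenient.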
The success with vhost-user-blk (of course) made me rethink TAP migration too: try to
avoid using the additional cpr channel and the unusual waiting for the QMP interface on
the target. And I've just sent an RFC: "[RFC 0/7] virtio-net: live-TAP local migration"
What do you think?
--
Best regards,
Vladimir