On Wed, Apr 30, 2025 at 11:23 AM Paul B. Henson via Users <
users@lists.libvirt.org> wrote:

> I'm using libvirt under Debian 12 (9.0.0-4+deb12u2 w/qemu
> 7.2+dfsg-7+deb12u12).
>
> I have a vm using sr-iov, and configured it with a failover macvtap
> interface so I could live migrate it. However, there is a significant
> delay at the end of the migration resulting in a lot of lost traffic. If
> I only have the macvtap interface, migration completes immediately at
> the end of the transfer of memory with no loss of traffic.
>
> I enabled debug logging, and found the following. On the source system,
> it logs that the system is paused for the cutover:
>
> 2025-04-30 01:08:12.526+0000: 1696180: debug :
> qemuMigrationAnyCompleted:1957 : Migration paused before switchover
>
> at that point, for almost a minute, the source system just keeps
> printing the same statistics:
>
> 2025-04-30 01:08:12.923+0000: 1696272: info :
> qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
> mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300, "status":
> "device", "setup-time": 297, "total-time": 26107, "ram": {"total":
> 137452265472, "postcopy-requests": 0, "dirty-sync-count": 3,
> "multifd-bytes": 2821784576, "pages-per-second": 297855,
> "downtime-bytes": 13208, "page-size": 4096, "remaining": 0,
> "postcopy-bytes": 0, "mbps": 9786.9158461538464, "transferred":
> 3117658825, "dirty-sync-missed-zero-copy": 0, "precopy-bytes":
> 295861041, "duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
> "normal-bytes": 2804301824, "normal": 684644}}, "id": "libvirt-577"}
> [...]
> 2025-04-30 01:09:06.290+0000: 1696272: info :
> qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
> mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300, "status":
> "device", "setup-time": 297, "total-time"
> : 79474, "ram": {"total": 137452265472, "postcopy-requests": 0,
> "dirty-sync-count": 3, "multifd-bytes": 2821784576, "pages-per-second":
> 297855, "downtime-bytes": 13208, "page-size": 4096, "remaining": 0,
> "postcopy-bytes"
> : 0, "mbps": 9786.9158461538464, "transferred": 3117658825,
> "dirty-sync-missed-zero-copy": 0, "precopy-bytes": 295861041,
> "duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
> "normal-bytes": 2804301824, "normal":
>   684644}}, "id": "libvirt-629"}
>
> until finally it completes:
>
> 2025-04-30 01:09:06.327+0000: 1696272: info :
> qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT:
> mon=0x7f8fdc0ad2f0 event={"timestamp": {"seconds": 1745975346,
> "microseconds": 327382}, "event": "MIGRATION", "dat
> a": {"status": "completed"}}
>
>
> On the destination side, it says something about negotiating failover
> for the network link:
>
> 2025-04-30 01:08:12.923+0000: 1384503: info :
> qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT: mon=
> 0x7fc7900ab2f0 event={"timestamp": {"seconds": 1745975292,
> "microseconds": 922783}, "event": "FAILOVER_NEGOTIA
> TED", "data": {"device-id": "ua-sr-iov-backup"}}
>
> Then nothing happens for about a minute until it says it is done:
>
> 2025-04-30 01:09:06.328+0000: 1384503: debug :
> qemuMonitorJSONIOProcessLine:189 : Line [{"timestamp": {"second
> s": 1745975346, "microseconds": 327991}, "event": "MIGRATION", "data":
> {"status": "completed"}}]
>
>
> Any thoughts on what is going on here to cause this delay? It's clearly
> somehow related to the sr-iov component of the migration.
>
Which model of SR-IOV card are you using, and what is the XML configuration
of the sr-iov interface?
The additional downtime could be caused by VFIO migration. VFIO migration
downtime has been reduced as of libvirt 10.5.0
(https://gitlab.com/libvirt/libvirt/-/commit/1cc7737f69)
and QEMU 8.1
(https://github.com/qemu/qemu/blob/master/qapi/migration.json#L462).
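For reference, the usual virtio failover pairing in the domain XML looks
something like the sketch below — the MAC address, PCI address, network
name, and alias here are placeholders, not taken from your setup:

```xml
<!-- virtio "persistent" member: stays attached across migration -->
<interface type='network'>
  <mac address='00:11:22:33:44:55'/>   <!-- must match the VF's MAC -->
  <source network='default'/>
  <model type='virtio'/>
  <alias name='ua-backup0'/>           <!-- user alias, "ua-" prefix required -->
  <teaming type='persistent'/>
</interface>
<!-- SR-IOV VF "transient" member: detached before migration starts -->
<interface type='hostdev'>
  <mac address='00:11:22:33:44:55'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
  </source>
  <teaming type='transient' persistent='ua-backup0'/>
</interface>
```

If your config differs from this pattern, posting it would help narrow down
where the extra downtime comes from.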

You could try updating to those versions or newer to see if the downtime
improves.

>
> Thanks much…
>
>
