Sorry for reviving this old thread; I missed the right time to follow
up on this while I was on vacation. I have been working on this and
found some discrepancies, please see below.
On 4/5/23 04:37, Eugenio Perez Martin wrote:
Hi!
As mentioned in the last upstream virtio-networking meeting, one of
the factors that adds more downtime to migration is the handling of
the guest memory (pin, map, etc). At this moment this handling is
bound to the virtio life cycle (DRIVER_OK, RESET). In that sense, the
destination device waits until all the guest memory / state is
migrated to start pinning all the memory.
The proposal is to bind it to the char device life cycle (open vs
close),
Hmmm, really? If it is bound to the char device life cycle, it won't
work for the next guest / qemu launch on the same vhost-vdpa device
node.
so all the guest memory can be pinned for all the guest / qemu
lifecycle.
I think tying pinning to the guest / qemu process life cycle makes more
sense. Essentially, the pinning part needs to be decoupled from the
iotlb mapping abstraction layer, and can / should work as a standalone
uAPI, so that QEMU at the destination may launch and pin all of the
guest's memory as needed without having to start the device, while
awaiting any incoming migration request. The problem, though, is that
there's no existing vhost uAPI that could properly serve as the vehicle
for that. SET_OWNER / SET_MEM_TABLE / RESET_OWNER seem a remote fit.
Any objection to introducing a new but clean vhost uAPI for pinning
guest pages, subject to the guest's life cycle?
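To illustrate why the point at which pinning happens matters for
downtime, here is a toy model only. The class, the function names and
the numbers are all made up for illustration; nothing here is an
existing vhost or QEMU interface, and real downtime has more components
than these two.

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Illustrative costs in seconds; both values are assumptions."""
    pin_s: float       # time to pin and map all guest memory
    transfer_s: float  # final state transfer while the guest is stopped


def downtime_pin_at_driver_ok(t: Timeline) -> float:
    # Pinning is bound to the virtio life cycle: it starts only after the
    # state has arrived and the device is started, so it adds to downtime.
    return t.transfer_s + t.pin_s


def downtime_pin_at_launch(t: Timeline) -> float:
    # Pinning is bound to the guest / qemu process life cycle: it was done
    # while the destination was waiting for the incoming migration.
    return t.transfer_s


if __name__ == "__main__":
    t = Timeline(pin_s=2.0, transfer_s=0.5)
    print("pin at DRIVER_OK:", downtime_pin_at_driver_ok(t))
    print("pin at launch:   ", downtime_pin_at_launch(t))
```

In this model the pinning cost is still paid in full, it is just moved
out of the window in which the guest is stopped.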
Another concern is the use_va stuff. Originally it was tagged at the
device level and made static at the time of device instantiation, which
is fine. But the others to come just find a new home in a per-group or
per-vq level struct. It is hard to tell whether or not pinning is
actually needed for the latter use_va friends, as they are essentially
tied to the virtio life cycle or feature negotiation, while the guest /
QEMU starts way earlier than that. Perhaps just ignore those sub-device
level use_va usages? Presumably !use_va at the device level is
sufficient to infer the need for pinning for the device?
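Put as a sketch, the rule I have in mind would look roughly like the
following. ToyDevice and needs_early_pinning are hypothetical names for
illustration only, not kernel structures or functions; the per-group /
per-vq flags are modelled as a plain dict.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyDevice:
    # Device-level flag, static from device instantiation.
    use_va: bool
    # Hypothetical per-group / per-vq flags; these would only be known
    # after feature negotiation, i.e. later than guest / QEMU launch.
    sub_use_va: Dict[str, bool] = field(default_factory=dict)


def needs_early_pinning(dev: ToyDevice) -> bool:
    # Only the device-level flag is consulted; sub-device level use_va
    # settings are deliberately ignored because they are not available
    # at the time the guest / QEMU process starts.
    return not dev.use_va


if __name__ == "__main__":
    print(needs_early_pinning(ToyDevice(use_va=False)))
    print(needs_early_pinning(ToyDevice(use_va=True, sub_use_va={"vq0": False})))
```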
Regards,
-Siwei
This has two main problems:
* At this moment the reset semantics forces the vdpa device to unmap
all the memory. So this change needs a vhost vdpa feature flag.
* This may increase the initialization time. Maybe we can delay it if
qemu is not the destination of a LM. Anyway I think this should be
done as an optimization on top.
Any ideas or comments in this regard?
Thanks!