> A migration compatibility interface has not been determined for vfio.
> We currently rely on the vendor drivers to provide their own internal
> validation and harmlessly reject migration from an incompatible device.
> It would be great if we could make progress on this, but it's a
> difficult problem, and one that I hope we can further address once we
> have a base level of migration support.
>
> It's great to revisit ideas, but proclaiming a uAPI is bad solely
> because the data transfer is opaque, without defining why that's bad,
That makes sense.

I feel what is missing from all of these discussions is a comparison
with an existing out-of-process solution, namely vhost-user. As a
result, I feel the proposals tend to forget some of the lessons learned
designing that interface.

In particular, I personally see cross-version and cross-vendor
migration as a litmus test. It is a hard problem, one that

1. I do not believe vendors will be motivated enough to solve by
   themselves
2. I don't believe QEMU will be able to add after the fact, because
   "supporting QEMU" will come to not imply any level of compatibility
   whatsoever.

That was a hard-learned lesson, and that's the reason I (and maybe
Jason, too) keep harping on it, not that it's so burningly important by
itself.

I think at this point we have an opportunity to make people document
their interfaces up to a point and also actually somewhat standardize
them, using upstream inclusion as a carrot. Some big vendors will
probably ignore it; small ones hopefully won't. After X years margins
become thin, vendors lose interest, and at that point we are glad we
have standards and documentation.

> evaluating the feasibility and implementation of defining a well
> specified data format rather than protocol, including cross-vendor
> support, or proposing any sort of alternative is not so helpful imo.

For example, a registry of supported device/vendor/subsystem tuples,
with a list of compatibility features and a documented migration data
format for each, maintained in QEMU, together with a handshake
validating against it, would document what is compatible with what.
That could then serve for debugging, validation, and also help push
people towards more standard interfaces.

That is just one idea.

> Note that we also migrate guest memory as opaque data; we don't require
> knowing the data structures it holds or how regions are used, we simply
> look for changes and transfer the new data.
> That's not so different
> from a vendor driver passing us a blob of data as "information it needs
> to replicate the device state at the target."

I don't really understand this argument. At the device level we know
exactly how each region is used: some are IO, some are RAM. In fact
one can migrate between systems released years apart.

-- MST
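
P.S. To make the registry idea above a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the IDs, feature names, format names, and the `handshake` function are hypothetical and not taken from QEMU or vfio. It only shows the shape of the check: both ends must be listed, must share a documented data format, and must both offer every feature the guest is using.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeviceId:
    """Hypothetical device/vendor/subsystem tuple used as the registry key."""
    vendor: int
    device: int
    subsystem: int


@dataclass(frozen=True)
class RegistryEntry:
    features: frozenset    # compatibility features the device offers
    data_format: str       # name of the documented migration data format
    format_version: int


# Made-up entries; a real registry would be maintained alongside the
# documentation of each migration data format.
DEV_A = DeviceId(0xAAAA, 0x0001, 0x0001)
DEV_B = DeviceId(0xBBBB, 0x0002, 0x0001)   # other vendor, same format
DEV_C = DeviceId(0xBBBB, 0x0010, 0x0001)   # different format

REGISTRY = {
    DEV_A: RegistryEntry(frozenset({"dirty-log", "msix"}), "example-fmt", 2),
    DEV_B: RegistryEntry(frozenset({"dirty-log", "msix", "sriov"}), "example-fmt", 2),
    DEV_C: RegistryEntry(frozenset({"dirty-log"}), "other-fmt", 1),
}


def handshake(src: DeviceId, dst: DeviceId, in_use: set) -> Optional[str]:
    """Return None if migration src -> dst may proceed, else a reason string.

    `in_use` is the set of features the guest currently relies on.
    """
    s, d = REGISTRY.get(src), REGISTRY.get(dst)
    if s is None or d is None:
        return "device not in registry"
    if (s.data_format, s.format_version) != (d.data_format, d.format_version):
        return "migration data format mismatch"
    missing_src = set(in_use) - s.features
    if missing_src:
        return "source lacks features: " + ", ".join(sorted(missing_src))
    missing_dst = set(in_use) - d.features
    if missing_dst:
        return "destination lacks features: " + ", ".join(sorted(missing_dst))
    return None
```

With these made-up entries, `handshake(DEV_A, DEV_B, {"msix"})` passes even though the vendors differ, because both entries name the same documented format, while `handshake(DEV_A, DEV_C, {"dirty-log"})` is rejected with a reason that can be logged for debugging.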