On Mon, Aug 23, 2021 at 06:05:07PM -0400, Michael S. Tsirkin wrote:
> On Mon, Aug 23, 2021 at 03:18:51PM -0400, Peter Xu wrote:
> > On Mon, Aug 23, 2021 at 02:49:12PM -0400, Eduardo Habkost wrote:
> > > On Wed, Aug 18, 2021 at 03:43:18PM -0400, Peter Xu wrote:
> > > > QEMU creates -device objects in order as specified by the user's
> > > > cmdline.  However that ordering may not be the ideal order.  For
> > > > example, some platform devices (vIOMMUs) may want to be created
> > > > earlier than most of the rest devices (e.g., vfio-pci, virtio).
> > > >
> > > > This patch orders the QemuOptsList of '-device's so they'll be
> > > > sorted first before kicking off the device realizations.  This will
> > > > allow the device realization code to be able to use APIs like
> > > > pci_device_iommu_address_space() correctly, because those functions
> > > > rely on the platform devices being realized.
> > > >
> > > > Now we rely on vmsd->priority which is defined as MigrationPriority
> > > > to provide the ordering, as either VM init and migration completes
> > > > will need such an ordering.  In the future we can move that priority
> > > > information out of vmsd.
> > > >
> > > > Signed-off-by: Peter Xu <pet...@redhat.com>
> > >
> > > Can we be 100% sure that changing the ordering of every single
> > > device being created won't affect guest ABI?  (I don't think we can)
> >
> > That's a good question, however I doubt whether there's any real-world
> > guest ABI for that.  As a developer, I normally specify cmdline
> > parameter in an adhoc way, so that I assume most parameters are not
> > sensitive to ordering and I can tune the ordering as wish.  I'm not sure
> > whether that's common for qemu users, I would expect so, but I may have
> > missed something that I'm not aware of.
> >
> > Per my knowledge the only "guest ABI" change is e.g. when we specify
> > "vfio-pci" to be before "intel-iommu": it'll be constantly broken before
> > this patchset, while after this series it'll be working.  It's just that
> > I don't think those "guest ABI" is necessary to be kept, and that's
> > exactly what I want to fix with the patchset..
> >
> > >
> > > How many device types in QEMU have non-default vmsd priority?
> >
> > Not so much; here's the list of priorities and the devices using it:
> >
> >        |--------------------+---------|
> >        | priority           | devices |
> >        |--------------------+---------|
> >        | MIG_PRI_IOMMU      |       3 |
> >        | MIG_PRI_PCI_BUS    |       7 |
> >        | MIG_PRI_VIRTIO_MEM |       1 |
> >        | MIG_PRI_GICV3_ITS  |       1 |
> >        | MIG_PRI_GICV3      |       1 |
> >        |--------------------+---------|
>
> iommu is probably ok. I think virtio mem is ok too,
> in that it is normally created by virtio-mem-pci ...

Hmm this reminded me whether virtio-mem-pci could have another devfn
allocated after being moved..

But frankly I still doubt whether we should guarantee that guest ABI on user
not specifying addr=XXX in pci device parameters - I feel like it's a burden
that we don't need to carry.

(Btw, trying to keep the order is one thing; declare it guest ABI would be
another thing to me)

>
> >
> > All the rest devices are using the default (0) priority.
> >
> > >
> > > Can we at least ensure devices with the same priority won't be
> > > reordered, just to be safe?  (qsort() doesn't guarantee that)
> > >
> > > If very few device types have non-default vmsd priority and
> > > devices with the same priority aren't reordered, the risk of
> > > compatibility breakage would be much smaller.
> >
> > I'm also wondering whether it's a good thing to break some guest ABI due
> > to this change, if possible.
> >
> > Let's imagine something breaks after applied, then the only reason
> > should be that qsort() changed the order of some same-priority devices
> > and it's not the same as user specified any more.  Then, does it also
> > means there's yet another ordering requirement that we didn't even
> > notice?
> >
> > I doubt whether that'll even happen (or I think there'll be report
> > already, as in qemu man page there's no requirement on parameter
> > ordering).  In all cases, instead of "keeping the same priority devices
> > in the same order as the user has specified", IMHO we should make the
> > broken devices to have different priorities so the ordering will be
> > guaranteed by qemu internal, rather than how user specified it.
>
> Well giving user control of guest ABI is a reasonable thing to do,
> it is realize order that users do not really care about.

Makes sense.

>
> I guess we could move pci slot allocation out of realize
> so it does not depend on realize order?

Yes that sounds like another approach, but it seems to require more changes.

Thanks,

-- 
Peter Xu
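
For reference, here is a rough sketch of what "devices with the same priority
won't be reordered" could look like while still using qsort(): record each
option's command-line position and use it as a tie-breaker in the comparator.
This is not the code from the patch. DeviceOpt, device_opt_cmp() and
sort_device_opts() are hypothetical names made up for illustration, and
"higher priority is realized first" is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one parsed -device option; not a QEMU type. */
typedef struct {
    const char *driver;    /* e.g. "vfio-pci" */
    int priority;          /* assumed: higher means "realize earlier" */
    size_t cmdline_index;  /* position on the command line (tie-breaker) */
} DeviceOpt;

/* Higher priority first; equal priorities keep command-line order. */
static int device_opt_cmp(const void *a, const void *b)
{
    const DeviceOpt *x = a;
    const DeviceOpt *y = b;

    if (x->priority != y->priority) {
        return (x->priority < y->priority) - (x->priority > y->priority);
    }
    return (x->cmdline_index > y->cmdline_index) -
           (x->cmdline_index < y->cmdline_index);
}

/*
 * Record the original positions, then sort.  Because no two elements
 * compare equal, the result no longer depends on qsort() being stable.
 */
static void sort_device_opts(DeviceOpt *opts, size_t n)
{
    if (n == 0) {
        return;
    }
    assert(opts != NULL);
    for (size_t i = 0; i < n; i++) {
        opts[i].cmdline_index = i;
    }
    qsort(opts, n, sizeof(*opts), device_opt_cmp);
}
```

With this kind of comparator, only devices with a non-default priority move
relative to the user's ordering; all default-priority devices stay in the
order given on the command line.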