On Wed, Oct 20, 2021 at 03:44:08PM +0200, David Hildenbrand wrote:
> On 18.08.21 21:42, Peter Xu wrote:
> > This is a long pending issue that we haven't fixed.  The issue is that
> > in QEMU we have an implicit device ordering requirement when realizing,
> > otherwise some of the devices may not work properly.
> >
> > The initial requirement comes from when vfio-pci started to work with
> > vIOMMUs.  To make sure vfio-pci will get the correct DMA address space,
> > the vIOMMU device needs to be created before vfio-pci, otherwise
> > vfio-pci will stop working when the guest enables the vIOMMU and the
> > device at the same time.
> >
> > AFAIU Libvirt should have code that guarantees that.  QEMU cmdline
> > users need to pay attention or things will stop working at some point.
> >
> > Recently there's a growing and similar requirement on vDPA.  It's not a
> > hard requirement so far, but vDPA has patches that try to work around
> > this issue.
> >
> > This patchset allows us to realize the devices in an order where e.g.
> > platform devices are created first (bus devices, IOMMU, etc.), then the
> > rest of the normal devices.  It's done simply by ordering the
> > QemuOptsList of "device" entries before realization.  The priority so
> > far comes from the migration priorities, which could be a little bit
> > odd, but that's really about the same problem and we can clean that
> > part up in the future.
> >
> > Libvirt can still keep its ordering for sure so old QEMU will still
> > work, however that won't be needed for new QEMUs after this patchset,
> > so with the new binary we should be able to specify the qemu cmdline as
> > we wish for '-device'.
> >
> > Logically this should also work for vDPA, and the workaround code can
> > be replaced with more straightforward approaches.
> >
> > Please review, thanks.
>
> Hi Peter, looks like I have another use case:
>
> vhost devices can heavily restrict the number of available memslots:
> e.g., upstream KVM ~64k, vhost-user usually 32 (!).  With virtio-mem
> intending to make use of multiple memslots [1], auto-detecting how many
> to use based on the currently available memslots when plugging and
> realizing the virtio-mem device, realizing vhost devices (especially
> vhost-user devices) after virtio-mem devices can similarly result in
> issues: when trying to realize the vhost device with restricted
> memslots, QEMU will bail out.
>
> So similarly, we'd want to realize any vhost-* before any virtio-mem
> device.
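(For context, the ordering idea described in the quoted cover letter boils
down to something like the sketch below: tag each -device entry with a
realize priority and sort before realizing.  This is only an illustration
with made-up names such as realize_priority(); the actual series derives
the order from the existing migration priorities rather than a helper like
this.)

/*
 * Illustrative sketch only -- not the actual patch.  Instead of realizing
 * -device entries strictly in command line order, sort them by a priority
 * first (platform/IOMMU devices high, everything else low), so e.g. a
 * vIOMMU is realized before vfio-pci regardless of where it appears on
 * the command line.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct DeviceOpt {
    const char *driver;
    int priority;               /* higher value == realized earlier */
} DeviceOpt;

/* Hypothetical priority lookup; the series derives this from the
 * existing migration priorities instead. */
static int realize_priority(const char *driver)
{
    if (strstr(driver, "iommu")) {
        return 10;              /* platform devices first */
    }
    if (strncmp(driver, "vhost-", 6) == 0) {
        return 5;               /* e.g. vhost-* before virtio-mem */
    }
    return 0;                   /* everything else keeps cmdline order */
}

static int cmp_priority(const void *a, const void *b)
{
    const DeviceOpt *da = a, *db = b;
    return db->priority - da->priority;    /* descending */
}

int main(void)
{
    DeviceOpt opts[] = {
        { "virtio-mem-pci" }, { "vfio-pci" },
        { "vhost-user-blk-pci" }, { "intel-iommu" },
    };
    size_t n = sizeof(opts) / sizeof(opts[0]);

    for (size_t i = 0; i < n; i++) {
        opts[i].priority = realize_priority(opts[i].driver);
    }
    /* qsort() is not stable; the real code would need a stable sort to
     * keep the relative order of same-priority devices. */
    qsort(opts, n, sizeof(opts[0]), cmp_priority);

    for (size_t i = 0; i < n; i++) {
        printf("realize %s\n", opts[i].driver); /* stand-in for qdev realize */
    }
    return 0;
}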
Ordering virtio-mem vs vhost-* devices doesn't feel like a good solution
to this problem.  E.g. if you start a guest with several vhost-* devices
and virtio-mem then auto-decides to use all/most of the remaining
memslots, we've now surely broken the ability to hotplug more vhost-*
devices at runtime, by not leaving any memslots for them.

I think virtio-mem configuration needs to be stable in its memslot usage
regardless of how many other types of devices are present, and not
auto-adjust how many it consumes.

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
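(To make the hotplug concern above concrete, a toy sketch of the trade-off;
the numbers and the fixed "memslots" knob are hypothetical illustrations,
not an existing QEMU property: an auto-sizing virtio-mem device consumes
whatever is free at realize time, so the result depends on device order and
can leave nothing for later vhost-* hotplug, while a fixed per-device count
keeps the footprint stable.)

/* Toy illustration only; numbers and names are hypothetical. */
#include <stdio.h>

#define VHOST_USER_MEMSLOT_LIMIT 32   /* typical vhost-user backend limit */

/* Auto-sizing: consume whatever is free at realize time. */
static int virtio_mem_memslots_auto(int used)
{
    return VHOST_USER_MEMSLOT_LIMIT - used;
}

/* Fixed: the user configures the count; it never depends on ordering. */
static int virtio_mem_memslots_fixed(int requested)
{
    return requested;
}

int main(void)
{
    int used_by_vhost = 10;   /* memslots already taken by vhost-* devices */

    /* Auto: 22 slots consumed, 0 left -> later vhost-* hotplug fails. */
    printf("auto : virtio-mem takes %d, %d left for hotplug\n",
           virtio_mem_memslots_auto(used_by_vhost),
           VHOST_USER_MEMSLOT_LIMIT - used_by_vhost
               - virtio_mem_memslots_auto(used_by_vhost));

    /* Fixed: 8 slots consumed, 14 left regardless of realize order. */
    printf("fixed: virtio-mem takes %d, %d left for hotplug\n",
           virtio_mem_memslots_fixed(8),
           VHOST_USER_MEMSLOT_LIMIT - used_by_vhost
               - virtio_mem_memslots_fixed(8));
    return 0;
}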