On Mon, Oct 19, 2020 at 08:39:03AM +0000, Liu, Yi L wrote:
> Hi Jason,
>
> Good to see your response.
Ah, I was away

> > > > Second, IOMMU nested translation is a per IOMMU domain
> > > > capability. Since IOMMU domains are managed by VFIO/VDPA
> > > > (alloc/free domain, attach/detach device, set/get domain attribute,
> > > > etc.), reporting/enabling the nesting capability is an natural
> > > > extension to the domain uAPI of existing passthrough frameworks.
> > > > Actually, VFIO already includes a nesting enable interface even
> > > > before this series. So it doesn't make sense to generalize this uAPI
> > > > out.
> >
> > The subsystem that obtains an IOMMU domain for a device would have to
> > register it with an open FD of the '/dev/sva'. That is the connection
> > between the two subsystems. It would be some simple kernel internal
> > stuff:
> >
> >   sva = get_sva_from_file(fd);
>
> Is this fd provided by userspace? I suppose the /dev/sva has a set of uAPIs
> which will finally program page table to host iommu driver. As far as I know,
> it's weird for VFIO user. Why should VFIO user connect to a /dev/sva fd after
> it sets a proper iommu type to the opened container. VFIO container already
> stands for an iommu context with which userspace could program page mapping
> to host iommu.

Again the point is to dis-aggregate the vIOMMU related stuff from VFIO
so it can be shared between more subsystems that need it. I'm sure
there will be some weird overlaps because we can't delete any of the
existing VFIO APIs, but that should not be a blocker.

Having VFIO run in a mode where '/dev/sva' provides all the IOMMU
handling is a possible path.

If your plan is to just opencode everything into VFIO then I don't see
how VDPA will work well, and if proper in-kernel abstractions are built
I fail to see how routing some of it through userspace is a fundamental
problem.

> >   sva_register_device_to_pasid(sva, pasid, pci_device, iommu_domain);
>
> So this is supposed to be called by VFIO/VDPA to register the info to
> /dev/sva, right?
> And in dev/sva, it will also maintain the device/iommu_domain and pasid
> info? will it be duplicated with VFIO/VDPA?

Each part needs to have the information it needs?

> > > > Moreover, mapping page fault to subdevice requires
> > > > pre-registering subdevice fault data to IOMMU layer when binding
> > > > guest page table, while such fault data can be only retrieved from
> > > > parent driver through VFIO/VDPA.
> >
> > Not sure what this means, page fault should be tied to the PASID, any
> > hookup needed for that should be done in-kernel when the device is
> > connected to the PASID.
>
> you may refer to the page request reporting chapter of the VT-d spec.
> Page request is reported to software together with the requestor id of
> the device. For the page request injects to guest, it should have the
> device info.

Whoever provides the vIOMMU emulation and relays the page fault to the
guest has to translate the RID - what does that have to do with VFIO?

How will VDPA provide the vIOMMU emulation?

Jason