On 2/27/26 23:04, Jason Gunthorpe wrote:
> On Fri, Feb 27, 2026 at 01:52:15PM -0800, Alex Mastro wrote:
>> On Fri, Feb 27, 2026 at 03:48:07PM -0400, Jason Gunthorpe wrote:
>>>>> I actually would like to go the other way and have VFIO always have a
>>>>> DMABUF under the VMA's it mmaps because that will make it easy to
>>>>> finish the type1 emulation which requires finding dmabufs for the
>>>>> VMAs.
>>>
>>> This is a still better idea since it avoid duplicating the VMA flow
>>> into two parts..
>>
>> I suppose this would also compose with your idea to use dma-buf for
>> iommufd_compat support of VFIO_IOMMU_MAP_DMA of vfio device fd-backed mmap()s
>> [1]? Instead of needing to materialize a new dma-buf, you could use the
>> existing backing one?
>
> Yeah, that too
>
> I think it is a fairly easy progression:
>
> 1) mmap_prepare() allocates a new dmabuf file * and sticks it in
>    desc->vm_file. Rework so all the vma_ops are using vm_file that is
>    a dmabuf. The allocated dmabuf has a singleton range
Interesting approach to fix this, but I would suggest something even simpler:

Use the same structure as base class for the VFIO and DMA-buf file for your
vma->vm_file->private_data object.

The DMA-buf file actually contains the real ranges exposed by it and points to
the exporting VFIO, while the one for the VFIO is just a dummy covering the
whole range and pointing to itself.

This way you should be able to use the same vm_operations_struct for VMAs
mapped through both DMA-buf and the VFIO file descriptors.

Independent of how you implement this, just one additional warning: huge_fault
has caused a number of really hard to debug problems on x86.

As far as I know, the background is that on x86 pte_special() only works on
true leaf PTEs, but not on pmd/pud entries.

That in turn results in some nasty surprises when your PFNs are potentially
backed by struct pages, e.g. for direct I/O. For example, on the resulting
mmap() get_user_pages_fast() works, but get_user_pages() doesn't.

I hope that those problems aren't applicable here, but if they are, Thomas
from the Intel XE team can give you more details on that stuff.

Regards,
Christian.

> 2) Teach the fault handlers to support full range semantics
> 3) Use dmabuf revoke variables/etc in the mmap fault handlers
> 4) Move the address space from the vfio to the dmabuf
> 5) Allow mmaping the dmabuf fd directly which is now only a couple lines
>
> I forget how all the different mmap implementations in vfio interact
> though - but I think the above is good for vfio-pci
>
> Jason
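
To make the base class suggestion above a bit more concrete, here is a rough
userspace model of it. This is only a sketch and not kernel code: struct
map_base, struct map_range and all three helpers are hypothetical names made up
for illustration, and the real object would hang off
vma->vm_file->private_data. It only shows how one lookup routine could serve
both the VFIO dummy (whole range, exporter pointing to itself) and the DMA-buf
(real ranges, exporter pointing to the VFIO object).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one contiguous window into the exporting device. */
struct map_range {
	uint64_t dev_offset;	/* start offset inside the device region */
	uint64_t len;		/* length in bytes */
};

/* Hypothetical common base, standing in for vm_file->private_data. */
struct map_base {
	struct map_base *exporter;	/* VFIO object that owns the memory */
	const struct map_range *ranges;	/* ranges exposed through this file */
	size_t nr_ranges;
};

/* VFIO file: a dummy single range covering everything, pointing to itself. */
static void vfio_base_init(struct map_base *b, struct map_range *whole,
			   uint64_t size)
{
	whole->dev_offset = 0;
	whole->len = size;
	b->exporter = b;
	b->ranges = whole;
	b->nr_ranges = 1;
}

/* DMA-buf file: the real exported ranges, pointing to the exporting VFIO. */
static void dmabuf_base_init(struct map_base *b, struct map_base *vfio,
			     const struct map_range *ranges, size_t nr)
{
	b->exporter = vfio;
	b->ranges = ranges;
	b->nr_ranges = nr;
}

/*
 * Shared lookup, as a common fault handler might use it: translate an
 * offset into the mapped file to an offset inside the exporting device.
 * The ranges are treated as laid out back to back in the file.
 * Returns 0 on success, -1 if the offset is outside all ranges.
 */
static int map_base_resolve(const struct map_base *b, uint64_t file_off,
			    uint64_t *dev_off)
{
	for (size_t i = 0; i < b->nr_ranges; i++) {
		if (file_off < b->ranges[i].len) {
			*dev_off = b->ranges[i].dev_offset + file_off;
			return 0;
		}
		file_off -= b->ranges[i].len;
	}
	return -1;
}
```

Since both file types carry the same base object, the fault path never needs
to know which kind of file descriptor the VMA came from. It just resolves the
offset and follows the exporter pointer.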
