On 2/27/26 23:04, Jason Gunthorpe wrote:
> On Fri, Feb 27, 2026 at 01:52:15PM -0800, Alex Mastro wrote:
>> On Fri, Feb 27, 2026 at 03:48:07PM -0400, Jason Gunthorpe wrote:
>>>>> I actually would like to go the other way and have VFIO always have a
>>>>> DMABUF under the VMA's it mmaps because that will make it easy to
>>>>> finish the type1 emulation which requires finding dmabufs for the
>>>>> VMAs.
>>>
>>> This is still a better idea since it avoids duplicating the VMA flow
>>> into two parts.
>>
>> I suppose this would also compose with your idea to use dma-buf for
>> iommufd_compat support of VFIO_IOMMU_MAP_DMA of vfio device fd-backed mmap()s
>> [1]? Instead of needing to materialize a new dma-buf, you could use the
>> existing backing one?
> 
> Yeah, that too
> 
> I think it is a fairly easy progression:
> 
> 1) mmap_prepare() allocates a new dmabuf file * and sticks it in
>    desc->vm_file. Rework so all the vma_ops are using vm_file that is
>    a dmabuf. The allocated dmabuf has a singleton range

Interesting approach to fix this, but I would suggest something even simpler:

Use the same structure as the base class of the vma->vm_file->private_data
object for both the VFIO file and the DMA-buf file.

The object for the DMA-buf file contains the real ranges it exposes and points
to the exporting VFIO, while the one for the VFIO file is just a dummy covering
the whole range and pointing to itself.

This way you should be able to use the same vm_operations_struct for VMAs 
mapped through both DMA-buf and the VFIO file descriptors.
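
To illustrate the shared base object, here is a rough userspace model in C. It
is only a sketch: every name in it (model_map_base, model_vfio_file,
model_dmabuf_file, model_translate and so on) is made up for illustration and
none of it is existing VFIO or DMA-buf code. It assumes that a shared handler
only needs the exporter pointer and a list of ranges to turn an offset in the
mapping into an offset in the device region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the exporting VFIO device. */
struct model_device {
	uint64_t region_size;
};

/* One contiguous piece of the device region exposed through a mapping. */
struct model_range {
	uint64_t dev_offset;
	uint64_t len;
};

/* Common base: what vma->vm_file->private_data would point to in both cases. */
struct model_map_base {
	struct model_device *exporter;
	const struct model_range *ranges;
	size_t nr_ranges;
};

/* VFIO file: one dummy range covering everything, exporter is itself. */
struct model_vfio_file {
	struct model_map_base base;
	struct model_device dev;
	struct model_range whole;
};

/* DMA-buf file: the real ranges, exporter is the VFIO device. */
struct model_dmabuf_file {
	struct model_map_base base;
	struct model_range ranges[2];
};

static void model_vfio_file_init(struct model_vfio_file *f, uint64_t size)
{
	f->dev.region_size = size;
	f->whole.dev_offset = 0;
	f->whole.len = size;
	f->base.exporter = &f->dev;
	f->base.ranges = &f->whole;
	f->base.nr_ranges = 1;
}

static void model_dmabuf_file_init(struct model_dmabuf_file *f,
				   struct model_device *exporter,
				   struct model_range r0,
				   struct model_range r1)
{
	f->ranges[0] = r0;
	f->ranges[1] = r1;
	f->base.exporter = exporter;
	f->base.ranges = f->ranges;
	f->base.nr_ranges = 2;
}

/*
 * Shared lookup used for both kinds of file: walk the ranges in order and
 * translate an offset in the mapping into an offset in the device region.
 * Returns 0 on success and -1 if the offset is outside all ranges.
 */
static int model_translate(const struct model_map_base *base,
			   uint64_t map_off, uint64_t *dev_off)
{
	for (size_t i = 0; i < base->nr_ranges; i++) {
		if (map_off < base->ranges[i].len) {
			*dev_off = base->ranges[i].dev_offset + map_off;
			return 0;
		}
		map_off -= base->ranges[i].len;
	}
	return -1;
}
```

Since model_translate() only ever sees the base object, the same handler
serves mappings created through either file descriptor, which is the point of
sharing the structure.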


Independent of how you implement this, just one additional warning: huge_fault
has caused a number of really hard-to-debug problems on x86.

As far as I know, the background is that on x86 pte_special() only works on
true leaf PTEs, but not on PMDs/PUDs.

That in turn results in some nasty surprises when your PFNs are potentially
backed by struct pages, e.g. for direct I/O. For example, on the resulting
mmap() get_user_pages_fast() works, but get_user_pages() doesn't.

I hope that those problems aren't applicable here, but if they are, Thomas
from the Intel XE team can give you more details on that stuff.

Regards,
Christian.

> 2) Teach the fault handlers to support full range semantics
> 3) Use dmabuf revoke variables/etc in the mmap fault handlers
> 4) Move the address space from the vfio to the dmabuf
> 5) Allow mmaping the dmabuf fd directly which is now only a couple lines
> 
> I forget how all the different mmap implementations in vfio interact
> though - but I think the above is good for vfio-pci
> 
> Jason
