Hi Christian,

> Subject: Re: [PATCH v4 1/5] PCI/P2PDMA: Don't enforce ACS check for device
> functions of Intel GPUs
> 
> >>
> >> On 23.09.25 15:38, Jason Gunthorpe wrote:
> >>> On Tue, Sep 23, 2025 at 03:28:53PM +0200, Christian König wrote:
> >>>> On 23.09.25 15:12, Jason Gunthorpe wrote:
> >>>>>> When you want to communicate addresses in a device specific address
> >>>>>> space you need a device specific type for that and not abuse
> >>>>>> phys_addr_t.
> >>>>>
> >>>>> I'm not talking about abusing phys_addr_t, I'm talking about putting a
> >>>>> legitimate CPU address in there.
> >>>>>
> >>>>> You can argue it is a hack in Xe to reverse engineer the VRAM offset
> >>>>> from a CPU physical, and I would be sympathetic, but it does allow
> >>>>> VFIO to be general, not specialized to Xe.
> >>>>
> >>>> No, exactly that doesn't work for all use cases. That's why I'm
> >>>> pushing back so hard on using phys_addr_t or CPU addresses.
> >>>>
> >>>> See, the CPU address is only temporarily valid because the VF BAR is
> >>>> only a window into the device memory.
> >>>
> >>> I know, generally yes.
> >>>
> >>> But there should be no way that a VFIO VF driver in the hypervisor
> >>> knows what is currently mapped to the VF's BAR. The only way I can
> >>> make sense of what Xe is doing here is if the VF BAR is a static
> >>> aperture of the VRAM..
> >>>
> >>> Would be nice to know the details.
> >>
> >> Yeah, that's why I asked how VFIO gets the information about which parts
> >> of its BAR should be part of the DMA-buf?
> >>
> >> That would be really interesting to know.
> > As Jason guessed, we are relying on the GPU VF being a Large BAR
> > device here. In other words, as you suggested, this will not work if the
> > VF BAR size is not as big as its actual VRAM portion. We can certainly add
> > this check but we have not seen either the GPU PF or VF getting detected
> > as Small BAR devices in various test environments.
> >
> > So, given the above, once a VF device is bound to the vfio-pci driver and
> > assigned to a Guest VM (launched via Qemu), Qemu's vfio layer maps all of
> > the VF's resources including the BARs. This mapping info (specifically the
> > HVA) is leveraged (by Qemu) to identify the offset at which the Guest VM's
> > buffer is located (in the BAR), and this info is then provided to the
> > vfio-pci kernel driver which finally creates the dmabuf (with BAR addresses).
> 
> In that case I strongly suggest adding a private DMA-buf interface for the
> DMA-bufs exported by vfio-pci which returns which BAR and offset the DMA-buf
> represents.
Does this private dmabuf interface already exist or does it need to be created
from the ground up?

If it already exists, could you please share an example/reference of how you
have used it with amdgpu or other drivers?

If it doesn't exist, I was wondering if it should be based on any particular
best practices/ideas (or design patterns) that already exist in other drivers?

> 
> Ideally using the same structure Qemu used to provide the offset to the
> vfio-pci driver, but not a must have.
> 
> This way the driver for the GPU PF (Xe) can leverage this interface, validate
> that the DMA-buf comes from a VF it feels responsible for, and do the math to
> figure out which parts of the VRAM need to be accessed to scan out the
> picture.
Sounds good. This is definitely a viable path forward, and it looks like we
are all in agreement on this idea.

I guess we can start exploring how to implement the private dmabuf interface
mechanism right away.
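
To get the discussion started, here is a very rough sketch of what I imagine
such an interface could look like. All of the names below (struct
vfio_dma_buf_region, vfio_pci_dma_buf_get_region(), etc.) are made up purely
for illustration and do not exist today; they just capture the idea of the
exporter handing the BAR index and offset to a trusted in-kernel importer:

/*
 * Hypothetical private interface, sketch only -- none of these symbols
 * exist today. It would live in a vfio-pci header (types from
 * <linux/types.h>, <linux/dma-buf.h>, <linux/pci.h>).
 */
struct vfio_dma_buf_region {
	u32	bar;		/* BAR index the dma-buf was carved out of */
	u64	offset;		/* offset into that BAR, in bytes */
	u64	length;		/* length of the region, in bytes */
};

/*
 * Returns 0 and fills in @pdev and @region if @dmabuf was exported by
 * vfio-pci, or a negative errno otherwise. The importer (e.g. Xe) can
 * then check that @pdev is one of the VFs it is responsible for before
 * trusting the addresses, and translate (bar, offset) into the VF's
 * VRAM allocation, as you described above.
 */
int vfio_pci_dma_buf_get_region(struct dma_buf *dmabuf,
				struct pci_dev **pdev,
				struct vfio_dma_buf_region *region);

And if we reuse the same structure that Qemu passes to the vfio-pci driver
when creating the dmabuf, as you suggested, the exporter would only need to
stash that info at export time and return it here.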

Thanks,
Vivek

> 
> This way this private vfio-pci interface can also be used by iommufd for
> example.
> 
> Regards,
> Christian.
> 
> >
> > Thanks,
> > Vivek
> >
> >>
> >> Regards,
> >> Christian.
> >>
> >>>
> >>>> What Simona agreed on is exactly what I proposed as well, that you
> >>>> get a private interface for exactly that use case.
> >>>
> >>> A "private" interface to exchange phys_addr_t between at least
> >>> VFIO/KVM/iommufd - sure no complaint with that.
> >>>
> >>> Jason
> >
