On Mon, Jun 01, 2026 at 08:17:15PM +0200, Christian König wrote:
> On 6/1/26 19:47, Jason Gunthorpe wrote:
> > On Mon, Jun 01, 2026 at 11:59:55AM +0200, Christian König wrote:
> >>>> When you have a complete open source driver stack which utilizes
> >>>> VFIO passthrough as the interface to communicate with the kernel
> >>>> drivers then we can eventually talk about that.
> >>>
> >>> That decision is not up to dmabuf
> >>
> >> Yes it is. This is the DMA-buf API which is added here.
> >
> > It is a DMA-buf kernel API that is added, I think it is overreaching
> > to try to veto a VFIO uAPI that calls it..
>
> Well as long as that is a private interface between VFIO and mlx5 I
> have no objection at all.

Well, as you know, we are using dmabuf to mediate many of these
connections now. I don't mind a "private" interface as a starting
point, but it does need to be discoverable and negotiated without
weird module dependencies or symbol_gets.

> But when it starts to affect DMA-buf I need to make sure that it
> works for everybody. And without even being able to test it that
> becomes really tricky.

They should have an argument for how it can be used for CPU backed
memory, IMHO.

> > This exposes a PCI SIG defined TPH capability in a reasonable simple
> > VFIO uAPI that can be re-used by any other device that happens to
> > support TPH on inbound MMIO. The uAPI has sensible general semantics
> > based around the PCI spec.
>
> That it's implementing an official PCI spec is a good argument.
>
> But on the other hand looking at the spec it's not really specifying
> much since everything is architecture specific.

Yeah, the spec doesn't say what TPH does when it is received. It is
intended as an opaque channel between the source and target. Even on
the CPU DRAM side we make an opaque call into ACPI and the BIOS
returns back the right value to use for the CPU.

The whole thing is aggressively opaque as to what the values mean to
any particular device.

So I don't have an issue with VFIO supplying a value for MMIO it owns,
it fits the general architecture.

> > Anyone can repeat the demonstration Meta outlined in their cover
> > letter: Use this new VFIO uAPI, import the DMABUF to mlx5, use a PCI
> > analyzer and you will see the PCI SIG defined TPH bits set the way the
> > VFIO uAPI says they should be set.
> >
> > There is nothing uniquely tied to Meta's device here, or unusable by
> > someone else's devices. Arguably this is actually a mlx5 feature to
> > allow VFIO to control its TPH generation HW.
>
> Would it be possible to demonstrate the functionality with some FPGA
> implementing an PCIe endpoint?

Sure, you don't even need a special endpoint, any endpoint that
doesn't explode when it receives a TPH is fine to illustrate that mlx5
is emitting it correctly. An FPGA reference board with an out of the
box PCIe IP demo is likely entirely sufficient, and you can use an
FPGA logic analyzer to inspect the packets.

Though keep in mind mlx5 is formally supporting TPH in a growing
number of kernel contexts, so we do test and verify our device is
working properly as an initiator. So I wouldn't advocate anyone
actually use their time on FPGA :)

> Doesn't needs to be anything funky, just the ability to exercise
> this for basically everybody who can spend a few $ on the HW.

Topologically you also probably need a PCIe switch as the CPU P2P
likely discards the header.

Jason
