On Thu, Jan 29, 2026 at 02:57:31PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 28, 2026 at 12:24:25PM -0800, Matthew Brost wrote:
> > On Wed, Jan 28, 2026 at 03:35:09PM -0400, Jason Gunthorpe wrote:
> > > On Wed, Jan 28, 2026 at 10:42:53AM -0800, Matthew Brost wrote:
> > > > Yes, this is exactly what I envision here. First, let me explain the
> > > > possible addressing modes on the UAL fabric:
> > > > 
> > > >  - Physical (akin to IOMMU passthrough)
> > > >  - Virtual (akin to IOMMU enabled)
> > > > 
> > > > Physical mode is straightforward — resolve the PFN to a cross-device
> > > > physical address, then install it into the initiator’s page tables along
> > > > with a bit indicating routing over the network. In this mode, the vfuncs
> > > > here are basically NOPs.
> > > 
> > > Ugh, of course they would invent something so complicated.
> > 
> > Why wouldn't we... But conceptually it's really fairly close to
> > IOMMU passthrough vs. enabled.
> 
> Why do you need address virtualization on the scale up fabric :( I can
> see access control but full virtualization sounds like overkill,
> especially considering how slow it will necessarily be compared to the
> fabric itself.
> 
> We are already in a world where even PCI can't manage untranslated
> requests and a scale up fabric with 3TB/sec of bandwidth is somehow
> going to have address translation too? Doesn't seem reasonable.
> 

I don’t design hardware…

But inter-OS security wants virtualization. In practice, intra-OS (what
we’re talking about here) should always be physical, but it doesn’t have
to be. Thus, IMO, any common API we come up with should support all
conceivable addressing modes that might be implemented.
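
To make that concrete, here is roughly the shape I have in mind. Every
name below is made up purely to illustrate; nothing like this exists
anywhere yet:

#include <linux/types.h>

struct ual_device;

/*
 * Hypothetical sketch: a common API would carry the addressing mode
 * explicitly, so a physical-mode fabric can make map/unmap near-NOPs
 * while a virtual-mode fabric does real translation behind the same
 * hooks.
 */
enum ual_addr_mode {
	UAL_ADDR_PHYSICAL,	/* akin to IOMMU passthrough */
	UAL_ADDR_VIRTUAL,	/* akin to IOMMU enabled */
};

struct ual_device_ops {
	/* Resolve a local PFN to an address routable on the fabric. */
	u64 (*map_pfn)(struct ual_device *ual, unsigned long pfn);
	void (*unmap)(struct ual_device *ual, u64 fabric_addr);
};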

> > > I'm not convinced this should be hidden inside DRM. The DMA API is the
> > 
> > Well, what I’m suggesting isn’t in DRM. A UAL API would be its own
> > layer, much like the DMA API. Of course we could stick this in the DMA
> > API and make it high-speed-fabric-generic, etc., but I do think the
> > fabric functions would have their own signatures and semantics (see my
> > explanation around device_ual_alloc reclaim rules, what locks it is
> > allowed to take, etc.).
> 
> DMA API is already bus agnostic, I think there is no issue to plug in
> a ualink_device or whatever under there and make it do something

I have thought about this, which is why our idea was to roughly duplicate
the DMA API and layer it in almost exactly the same way. My only concern
is the semantics.

dma_iova_alloc() ← This is reclaim-safe currently, AFAIK.

ual_iova_alloc() ← If this allocates GPU memory for page tables, it is
basically impossible to make reclaim-safe (i.e., callable under a
notifier lock) or to keep free of dma-resv locks (i.e., callable from
map_dma_buf) without subsystem-level rewrites in DRM for allocating
memory and driver-level rewrites of the bind code for Xe, Nouveau
(likely Nova), and AMDGPU.

Then, of course, dma_addr_t would mean something entirely different from
its original intent.
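
To put a signature on that concern (ual_iova_alloc() and its arguments
are hypothetical, just mirroring the DMA API shape):

/*
 * Hypothetical: mirrors the DMA API shape but not its semantics. If
 * the implementation must allocate GPU memory for fabric page tables,
 * it cannot run under a notifier lock (reclaim context) or while
 * holding dma-resv (map_dma_buf), which is exactly where DRM drivers
 * would want to call it.
 */
int ual_iova_alloc(struct ual_device *ual, size_t size, u64 *fabric_iova);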

If we can work something out here, then yes, maybe we can just use the
DMA API, as I believe it should work aside from the semantic changes and
perhaps minor tweaks to go from struct page -> physical address over the
network.
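
Roughly, the per-page step would go from the first line below to
something like the second (ual_map_pfn() is made up):

/* Today: struct page -> bus address for a local device via the DMA API. */
dma_addr_t addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);

/* UAL: PFN -> physical address routed over the network (hypothetical). */
u64 fabric_addr = ual_map_pfn(ual, page_to_pfn(page));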

> sensible, and it would be *particularly* easy if the address
> translation can slot in as an attached iommu.

I'm out of my depth on the IOMMU layer so I can't really comment.

Matt

> 
> Jason
