On Mon, 2025-10-27 at 18:14 -0700, Matthew Brost wrote:
> On Sat, Oct 25, 2025 at 02:04:10PM +0200, Thomas Hellström wrote:
> > Mimic the dma-buf method using dma_[map|unmap]_resource to map
> > for pcie-p2p dma.
> > 
> > There is ongoing work upstream to sort out how this should best
> > be done. One proposed method is to add an additional
> > pci_p2p_dma_pagemap aliasing the device_private pagemap and to
> > use the corresponding pci_p2p_dma_pagemap page as input for
> > dma_map_page(). However, that would double the memory and
> > latency needed to set up the drm_pagemap, and given the huge
> > amount of memory present on modern GPUs, that would really not
> > work. Hence the simple approach used in this patch.
> > 
> > Signed-off-by: Thomas Hellström <[email protected]>
> > ---
> >  drivers/gpu/drm/xe/xe_svm.c | 44
> > ++++++++++++++++++++++++++++++++++---
> >  drivers/gpu/drm/xe/xe_svm.h |  1 +
> >  2 files changed, 42 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index 9dd96dad2cca..9814f95cb212 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -3,6 +3,8 @@
> >   * Copyright © 2024 Intel Corporation
> >   */
> >  
> > +#include <linux/pci-p2pdma.h>
> > +
> >  #include <drm/drm_drv.h>
> >  #include <drm/drm_managed.h>
> >  #include <drm/drm_pagemap.h>
> > @@ -442,6 +444,24 @@ static u64 xe_page_to_dpa(struct page *page)
> >     return dpa;
> >  }
> >  
> > +static u64 xe_page_to_pcie(struct page *page)
> > +{
> > +   struct xe_pagemap *xpagemap = xe_page_to_pagemap(page);
> > +   struct xe_vram_region *vr = xe_pagemap_to_vr(xpagemap);
> > +   u64 hpa_base = xpagemap->hpa_base;
> > +   u64 ioaddr;
> > +   u64 pfn = page_to_pfn(page);
> > +   u64 offset;
> > +
> > +   xe_assert(vr->xe, is_device_private_page(page));
> > +   xe_assert(vr->xe, (pfn << PAGE_SHIFT) >= hpa_base);
> > +
> > +   offset = (pfn << PAGE_SHIFT) - hpa_base;
> > +   ioaddr = vr->io_start + offset;
> > +
> > +   return ioaddr;
> > +}
> > +
> >  enum xe_svm_copy_dir {
> >     XE_SVM_COPY_TO_VRAM,
> >     XE_SVM_COPY_TO_SRAM,
> > @@ -793,7 +813,10 @@ static bool xe_has_interconnect(struct drm_pagemap_peer *peer1,
> >     struct device *dev1 = xe_peer_to_dev(peer1);
> >     struct device *dev2 = xe_peer_to_dev(peer2);
> >  
> > -   return dev1 == dev2;
> > +   if (dev1 == dev2)
> > +           return true;
> > +
> > +   return pci_p2pdma_distance(to_pci_dev(dev1), dev2, true) >= 0;
> >  }
> >  
> >  static DRM_PAGEMAP_OWNER_LIST_DEFINE(xe_owner_list);
> > @@ -1530,13 +1553,27 @@ xe_drm_pagemap_device_map(struct drm_pagemap *dpagemap,
> 
> This relates to my comment here [1]. Perhaps this is where we should
> build in the “map for atomic” logic and route it through get_pages? I
> think that could work quite nicely and wouldn’t require an additional
> “do these two page maps support atomics?” check.
> What do you think?

I think initially we should just make sure we use a local dpagemap for
atomics. When we implement atomic-capable fast interconnects we could
look at something like this. In the migrate code we could easily
migrate whatever is needed for atomic support. For get_pages(), we also
need a check, but wouldn't it then be sufficient to verify that all
pages in a range come from the same dpagemap, like we do today, and in
addition verify that the dpagemap in question supports atomic
operations?

/Thomas


> 
> Matt
> 
> [1]
> https://patchwork.freedesktop.org/patch/683511/?series=156525&rev=1#comment_1255409
> 
> >             addr = xe_page_to_dpa(page);
> >             prot = XE_INTERCONNECT_VRAM;
> >     } else {
> > -           addr = DMA_MAPPING_ERROR;
> > -           prot = 0;
> > +           addr = dma_map_resource(dev,
> > +                                   xe_page_to_pcie(page),
> > +                                   PAGE_SIZE << order, dir,
> > +                                   DMA_ATTR_SKIP_CPU_SYNC);
> > +           prot = XE_INTERCONNECT_P2P;
> >     }
> >  
> >     return drm_pagemap_addr_encode(addr, prot, order, dir);
> >  }
> >  
> > +static void xe_drm_pagemap_device_unmap(struct drm_pagemap *dpagemap,
> > +                                   struct device *dev,
> > +                                   struct drm_pagemap_addr addr)
> > +{
> > +   if (addr.proto != XE_INTERCONNECT_P2P)
> > +           return;
> > +
> > +   dma_unmap_resource(dev, addr.addr, PAGE_SIZE << addr.order,
> > +                      addr.dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +}
> > +
> >  static void xe_pagemap_destroy_work(struct work_struct *work)
> >  {
> >     struct xe_pagemap *xpagemap = container_of(work, typeof(*xpagemap), destroy_work);
> > @@ -1573,6 +1610,7 @@ static void xe_pagemap_destroy(struct drm_pagemap *dpagemap, bool from_atomic_or
> >  
> >  static const struct drm_pagemap_ops xe_drm_pagemap_ops = {
> >     .device_map = xe_drm_pagemap_device_map,
> > +   .device_unmap = xe_drm_pagemap_device_unmap,
> >     .populate_mm = xe_drm_pagemap_populate_mm,
> >     .destroy = xe_pagemap_destroy,
> >  };
> > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > index 7cd7932f56c8..f5ed48993b6d 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.h
> > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > @@ -13,6 +13,7 @@
> >  #include <drm/drm_pagemap_util.h>
> >  
> >  #define XE_INTERCONNECT_VRAM DRM_INTERCONNECT_DRIVER
> > +#define XE_INTERCONNECT_P2P (XE_INTERCONNECT_VRAM + 1)
> >  
> >  struct drm_device;
> >  struct drm_file;
> > -- 
> > 2.51.0
> > 
