On Mon, 2025-10-27 at 18:14 -0700, Matthew Brost wrote:
> On Sat, Oct 25, 2025 at 02:04:10PM +0200, Thomas Hellström wrote:
> > Mimic the dma-buf method using dma_[map|unmap]_resource to map
> > for pcie-p2p dma.
> > 
> > There's an ongoing area of work upstream to sort out how this best
> > should be done. One method proposed is to add an additional
> > pci_p2p_dma_pagemap aliasing the device_private pagemap and use
> > the corresponding pci_p2p_dma_pagemap page as input for
> > dma_map_page(). However, that would incur double the amount of
> > memory and latency to set up the drm_pagemap, and given the huge
> > amount of memory present on modern GPUs, that would really not work.
> > Hence the simple approach used in this patch.
> > 
> > Signed-off-by: Thomas Hellström <[email protected]>
> > ---
> >  drivers/gpu/drm/xe/xe_svm.c | 44 ++++++++++++++++++++++++++++++++++---
> >  drivers/gpu/drm/xe/xe_svm.h |  1 +
> >  2 files changed, 42 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/xe/xe_svm.c b/drivers/gpu/drm/xe/xe_svm.c
> > index 9dd96dad2cca..9814f95cb212 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.c
> > +++ b/drivers/gpu/drm/xe/xe_svm.c
> > @@ -3,6 +3,8 @@
> >   * Copyright © 2024 Intel Corporation
> >   */
> >  
> > +#include <linux/pci-p2pdma.h>
> > +
> >  #include <drm/drm_drv.h>
> >  #include <drm/drm_managed.h>
> >  #include <drm/drm_pagemap.h>
> > @@ -442,6 +444,24 @@ static u64 xe_page_to_dpa(struct page *page)
> >  	return dpa;
> >  }
> >  
> > +static u64 xe_page_to_pcie(struct page *page)
> > +{
> > +	struct xe_pagemap *xpagemap = xe_page_to_pagemap(page);
> > +	struct xe_vram_region *vr = xe_pagemap_to_vr(xpagemap);
> > +	u64 hpa_base = xpagemap->hpa_base;
> > +	u64 ioaddr;
> > +	u64 pfn = page_to_pfn(page);
> > +	u64 offset;
> > +
> > +	xe_assert(vr->xe, is_device_private_page(page));
> > +	xe_assert(vr->xe, (pfn << PAGE_SHIFT) >= hpa_base);
> > +
> > +	offset = (pfn << PAGE_SHIFT) - hpa_base;
> > +	ioaddr = vr->io_start + offset;
> > +
> > +	return ioaddr;
> > +}
> > +
> >  enum xe_svm_copy_dir {
> >  	XE_SVM_COPY_TO_VRAM,
> >  	XE_SVM_COPY_TO_SRAM,
> > @@ -793,7 +813,10 @@ static bool xe_has_interconnect(struct drm_pagemap_peer *peer1,
> >  	struct device *dev1 = xe_peer_to_dev(peer1);
> >  	struct device *dev2 = xe_peer_to_dev(peer2);
> >  
> > -	return dev1 == dev2;
> > +	if (dev1 == dev2)
> > +		return true;
> > +
> > +	return pci_p2pdma_distance(to_pci_dev(dev1), dev2, true) >= 0;
> >  }
> >  
> >  static DRM_PAGEMAP_OWNER_LIST_DEFINE(xe_owner_list);
> > @@ -1530,13 +1553,27 @@ xe_drm_pagemap_device_map(struct drm_pagemap *dpagemap,
> 
> This relates to my comment here [1]. Perhaps this is where we should
> build in the “map for atomic” logic and route it through get_pages? I
> think that could work quite nicely and wouldn’t require an additional
> “do these two page maps support atomics?” check.
> 
> What do you think?

I think initially we should just make sure we use a local dpagemap for
atomics. When we implement atomic-supporting fast interconnects we
could look at something like this. In the migrate code we could easily
migrate whatever is needed for atomic support.

For get_pages(), we also need a check, but wouldn't it then be
sufficient to verify that all pages in a range come from the same
dpagemap, as we do today, and in addition verify that the dpagemap in
question supports atomic operations?

/Thomas

> 
> Matt
> 
> [1]
> https://patchwork.freedesktop.org/patch/683511/?series=156525&rev=1#comment_1255409
> 
> >  		addr = xe_page_to_dpa(page);
> >  		prot = XE_INTERCONNECT_VRAM;
> >  	} else {
> > -		addr = DMA_MAPPING_ERROR;
> > -		prot = 0;
> > +		addr = dma_map_resource(dev,
> > +					xe_page_to_pcie(page),
> > +					PAGE_SIZE << order, dir,
> > +					DMA_ATTR_SKIP_CPU_SYNC);
> > +		prot = XE_INTERCONNECT_P2P;
> >  	}
> >  
> >  	return drm_pagemap_addr_encode(addr, prot, order, dir);
> >  }
> >  
> > +static void xe_drm_pagemap_device_unmap(struct drm_pagemap *dpagemap,
> > +					struct device *dev,
> > +					struct drm_pagemap_addr addr)
> > +{
> > +	if (addr.proto != XE_INTERCONNECT_P2P)
> > +		return;
> > +
> > +	dma_unmap_resource(dev, addr.addr, PAGE_SIZE << addr.order,
> > +			   addr.dir, DMA_ATTR_SKIP_CPU_SYNC);
> > +}
> > +
> >  static void xe_pagemap_destroy_work(struct work_struct *work)
> >  {
> >  	struct xe_pagemap *xpagemap = container_of(work, typeof(*xpagemap), destroy_work);
> > @@ -1573,6 +1610,7 @@ static void xe_pagemap_destroy(struct drm_pagemap *dpagemap, bool from_atomic_or
> >  
> >  static const struct drm_pagemap_ops xe_drm_pagemap_ops = {
> >  	.device_map = xe_drm_pagemap_device_map,
> > +	.device_unmap = xe_drm_pagemap_device_unmap,
> >  	.populate_mm = xe_drm_pagemap_populate_mm,
> >  	.destroy = xe_pagemap_destroy,
> >  };
> > diff --git a/drivers/gpu/drm/xe/xe_svm.h b/drivers/gpu/drm/xe/xe_svm.h
> > index 7cd7932f56c8..f5ed48993b6d 100644
> > --- a/drivers/gpu/drm/xe/xe_svm.h
> > +++ b/drivers/gpu/drm/xe/xe_svm.h
> > @@ -13,6 +13,7 @@
> >  #include <drm/drm_pagemap_util.h>
> >  
> >  #define XE_INTERCONNECT_VRAM	DRM_INTERCONNECT_DRIVER
> > +#define XE_INTERCONNECT_P2P	(XE_INTERCONNECT_VRAM + 1)
> > 
> >  struct drm_device;
> >  struct drm_file;
> > -- 
> > 2.51.0
> > 
