On Tue, Jan 27, 2026 at 04:48:34PM -0800, Matthew Brost wrote:
> The dma-map IOVA alloc, link, and sync APIs perform significantly better
> than dma-map / dma-unmap, as they avoid costly IOMMU synchronizations.
> This difference is especially noticeable when mapping a 2MB region in
> 4KB pages.
> 
> Use the IOVA alloc, link, and sync APIs for DRM pagemap, which create DMA
> mappings between the CPU and GPU for copying data.
> 
> Signed-off-by: Matthew Brost <[email protected]>
> ---
>  drivers/gpu/drm/drm_pagemap.c | 121 +++++++++++++++++++++++++++-------
>  1 file changed, 96 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_pagemap.c b/drivers/gpu/drm/drm_pagemap.c
> index 4b79d4019453..b928c89f4bd1 100644
> --- a/drivers/gpu/drm/drm_pagemap.c
> +++ b/drivers/gpu/drm/drm_pagemap.c
> @@ -287,6 +287,7 @@ drm_pagemap_migrate_map_device_pages(struct device *dev,
>   * @migrate_pfn: Array of page frame numbers of system pages or peer pages to map.
>   * @npages: Number of system pages or peer pages to map.
>   * @dir: Direction of data transfer (e.g., DMA_BIDIRECTIONAL)
> + * @state: DMA IOVA state for mapping.
>   *
>   * This function maps pages of memory for migration usage in GPU SVM. It
>   * iterates over each page frame number provided in @migrate_pfn, maps the
> @@ -300,26 +301,79 @@ drm_pagemap_migrate_map_system_pages(struct device *dev,
>                                    struct drm_pagemap_addr *pagemap_addr,
>                                    unsigned long *migrate_pfn,
>                                    unsigned long npages,
> -                                  enum dma_data_direction dir)
> +                                  enum dma_data_direction dir,
> +                                  struct dma_iova_state *state)
>  {
> -     unsigned long i;
> +     struct page *dummy_page = NULL;
> +     unsigned long i, psize;
> +     bool try_alloc = false;
>  
>       for (i = 0; i < npages;) {
>               struct page *page = migrate_pfn_to_page(migrate_pfn[i]);
> -             dma_addr_t dma_addr;
> -             struct folio *folio;
> +             dma_addr_t dma_addr = -1;
>               unsigned int order = 0;
>  
> -             if (!page)
> -                     goto next;
> +             if (!page) {
> +                     if (!dummy_page)
> +                             goto next;
>  
> -             WARN_ON_ONCE(is_device_private_page(page));
> -             folio = page_folio(page);
> -             order = folio_order(folio);
> +                     page = dummy_page;

Why is this dummy_page required? Is it there to avoid holes in the IOVA
space? If so, what necessitates filling those holes? You can map less
than the allocated IOVA range, and the dma_iova_*() API can handle it.
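
Roughly what I have in mind, as an untested sketch: link only the
populated pfns and leave the unpopulated entries as holes in the IOVA
range. Order-0 pages only, error unwinding, the non-IOVA fallback and
filling pagemap_addr[] are all left out, and I'm going from memory on
the dma_iova_*() signatures, so double-check against dma-mapping.h:

	/* Untested sketch: holes in migrate_pfn[] are simply never linked */
	if (dma_iova_try_alloc(dev, state, 0, npages << PAGE_SHIFT)) {
		int err = 0;

		for (i = 0; i < npages; ++i) {
			struct page *page = migrate_pfn_to_page(migrate_pfn[i]);

			/* Unpopulated entry: leave a hole, nothing to link */
			if (!page)
				continue;

			err = dma_iova_link(dev, state, page_to_phys(page),
					    i << PAGE_SHIFT, PAGE_SIZE, dir, 0);
			if (err)
				break;
		}

		if (!err)
			err = dma_iova_sync(dev, state, 0, npages << PAGE_SHIFT);
	}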

Thanks
