[PATCH v10 22/30] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a
non-contiguous memory buffer, used commonly in the DRM and graphics
subsystems. It consists of a scatterlist with memory pages and DMA
addresses (sgl entry), as well as the number of scatterlist entries:
CPU pages (orig_nents entry) and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and
orig_nents entries, calling DMA-mapping functions with a wrong number
of entries or ignoring the number of mapped entries returned by the
dma_map_sg() function.

To avoid such issues, let's use the common dma-mapping wrappers
operating directly on the struct sg_table objects and use scatterlist
page iterators where possible. This, almost always, hides references to
the nents and orig_nents entries, making the code robust, easier to
follow and copy/paste safe.

Signed-off-by: Marek Szyprowski
Acked-by: Juergen Gross
---
 drivers/xen/gntdev-dmabuf.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index b1b6eebafd5d..4c13cbc99896 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 
 	if (sgt) {
 		if (gntdev_dmabuf_attach->dir != DMA_NONE)
-			dma_unmap_sg_attrs(attach->dev, sgt->sgl,
-					   sgt->nents,
-					   gntdev_dmabuf_attach->dir,
-					   DMA_ATTR_SKIP_CPU_SYNC);
+			dma_unmap_sgtable(attach->dev, sgt,
+					  gntdev_dmabuf_attach->dir,
+					  DMA_ATTR_SKIP_CPU_SYNC);
 		sg_free_table(sgt);
 	}
 
@@ -288,8 +287,8 @@ dmabuf_exp_ops_map_dma_buf(struct dma_buf_attachment *attach,
 	sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages,
 				  gntdev_dmabuf->nr_pages);
 	if (!IS_ERR(sgt)) {
-		if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
-				      DMA_ATTR_SKIP_CPU_SYNC)) {
+		if (dma_map_sgtable(attach->dev, sgt, dir,
+				    DMA_ATTR_SKIP_CPU_SYNC)) {
 			sg_free_table(sgt);
 			kfree(sgt);
 			sgt = ERR_PTR(-ENOMEM);
@@ -633,7 +632,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev,
 
 	/* Now convert sgt to array of pages and check for page validity. */
 	i = 0;
-	for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) {
+	for_each_sgtable_page(sgt, &sg_iter, 0) {
 		struct page *page = sg_page_iter_page(&sg_iter);
 		/*
 		 * Check if page is valid: this can happen if we are given
-- 
2.17.1
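For readers who have not worked with struct sg_table before, the pattern this
series converges on looks roughly like the sketch below. dma_map_sgtable(),
dma_unmap_sgtable() and for_each_sgtable_page() are the upstream helpers the
patch above switches to; the surrounding driver function and
do_something_with() are hypothetical placeholders, so treat this as an
illustration only, not code from gntdev itself.

	#include <linux/dma-mapping.h>
	#include <linux/scatterlist.h>

	static int example_map(struct device *dev, struct sg_table *sgt,
			       enum dma_data_direction dir)
	{
		struct sg_page_iter piter;
		int ret;

		/*
		 * The wrapper stores the number of DMA-mapped entries in
		 * sgt->nents and keeps the CPU page count in sgt->orig_nents,
		 * so neither count is passed around by hand any more.
		 */
		ret = dma_map_sgtable(dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
		if (ret)
			return ret;	/* 0 on success, negative errno on failure */

		/* CPU-page walks implicitly iterate over orig_nents entries. */
		for_each_sgtable_page(sgt, &piter, 0)
			do_something_with(sg_page_iter_page(&piter));

		return 0;
	}

	static void example_unmap(struct device *dev, struct sg_table *sgt,
				  enum dma_data_direction dir)
	{
		/* Unmap takes the whole table; there is no nents argument to get wrong. */
		dma_unmap_sgtable(dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC);
	}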
[PATCH v10 21/30] drm: xen: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function
returns the number of the created entries in the DMA address space.
However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
dma_unmap_sg must be called with the original number of the entries
passed to the dma_map_sg().

struct sg_table is a common structure used for describing a
non-contiguous memory buffer, used commonly in the DRM and graphics
subsystems. It consists of a scatterlist with memory pages and DMA
addresses (sgl entry), as well as the number of scatterlist entries:
CPU pages (orig_nents entry) and DMA mapped pages (nents entry).

It turned out that it was a common mistake to misuse nents and
orig_nents entries, calling DMA-mapping functions with a wrong number
of entries or ignoring the number of mapped entries returned by the
dma_map_sg() function.

Fix the code to refer to proper nents or orig_nents entries. This driver
reports the number of the pages in the imported scatterlist, so it
should refer to sg_table->orig_nents entry.

Signed-off-by: Marek Szyprowski
Acked-by: Oleksandr Andrushchenko
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index 39ff95b75357..0e57c80058b2 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -216,7 +216,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev,
 		return ERR_PTR(ret);
 
 	DRM_DEBUG("Imported buffer of size %zu with nents %u\n",
-		  size, sgt->nents);
+		  size, sgt->orig_nents);
 
 	return &xen_obj->base;
 }
-- 
2.17.1
[PATCH v2 15/21] drm: xen: fix sg_table nents vs. orig_nents misuse
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c
index f0b85e0..ba4bdc5 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_gem.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c
@@ -215,7 +215,7 @@ struct drm_gem_object *
 		return ERR_PTR(ret);
 
 	DRM_DEBUG("Imported buffer of size %zu with nents %u\n",
-		  size, sgt->nents);
+		  size, sgt->orig_nents);
 
 	return &xen_obj->base;
 }
-- 
1.9.1
[PATCH v2 18/21] xen: gntdev: fix sg_table nents vs. orig_nents misuse
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. Adapt the code to obey those rules.

Signed-off-by: Marek Szyprowski
---
For more information, see '[PATCH v2 00/21] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/4/373
---
 drivers/xen/gntdev-dmabuf.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 75d3bb9..ed749fd 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -248,7 +248,7 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 	if (sgt) {
 		if (gntdev_dmabuf_attach->dir != DMA_NONE)
 			dma_unmap_sg_attrs(attach->dev, sgt->sgl,
-					   sgt->nents,
+					   sgt->orig_nents,
 					   gntdev_dmabuf_attach->dir,
 					   DMA_ATTR_SKIP_CPU_SYNC);
 		sg_free_table(sgt);
@@ -288,8 +288,10 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 	sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages,
 				  gntdev_dmabuf->nr_pages);
 	if (!IS_ERR(sgt)) {
-		if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
-				      DMA_ATTR_SKIP_CPU_SYNC)) {
+		sgt->nents = dma_map_sg_attrs(attach->dev, sgt->sgl,
+					      sgt->orig_nents, dir,
+					      DMA_ATTR_SKIP_CPU_SYNC);
+		if (!sgt->nents) {
 			sg_free_table(sgt);
 			kfree(sgt);
 			sgt = ERR_PTR(-ENOMEM);
@@ -625,7 +627,7 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count)
 
 	/* Now convert sgt to array of pages and check for page validity. */
 	i = 0;
-	for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) {
+	for_each_sg_page(sgt->sgl, &sg_iter, sgt->orig_nents, 0) {
 		struct page *page = sg_page_iter_page(&sg_iter);
 		/*
 		 * Check if page is valid: this can happen if we are given
-- 
1.9.1
[PATCH v3 16/25] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that dma_map_sg returns the
number of the created entries in the DMA address space. However the
subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be
called with the original number of the entries passed to dma_map_sg. The
sg_table->nents in turn holds the result of the dma_map_sg call as stated
in include/linux/scatterlist.h. A common mistake was to ignore the result
of the dma_map_sg function and not to use the sg_table->orig_nents at all.

To avoid such issues, let's use common dma-mapping wrappers operating
directly on the struct sg_table objects and adjust references to the
nents and orig_nents respectively.

Signed-off-by: Marek Szyprowski
---
For more information, see '[PATCH v3 00/25] DRM: fix struct sg_table nents
vs. orig_nents misuse' thread: https://lkml.org/lkml/2020/5/5/187
---
 drivers/xen/gntdev-dmabuf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 75d3bb9..4b22785 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -247,8 +247,7 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 
 	if (sgt) {
 		if (gntdev_dmabuf_attach->dir != DMA_NONE)
-			dma_unmap_sg_attrs(attach->dev, sgt->sgl,
-					   sgt->nents,
+			dma_unmap_sgtable_attrs(attach->dev, sgt,
 					   gntdev_dmabuf_attach->dir,
 					   DMA_ATTR_SKIP_CPU_SYNC);
 		sg_free_table(sgt);
@@ -288,7 +287,7 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf,
 	sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages,
 				  gntdev_dmabuf->nr_pages);
 	if (!IS_ERR(sgt)) {
-		if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir,
+		if (dma_map_sgtable_attrs(attach->dev, sgt, dir,
 				      DMA_ATTR_SKIP_CPU_SYNC)) {
 			sg_free_table(sgt);
 			kfree(sgt);
@@ -625,7 +624,7 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count)
 
 	/* Now convert sgt to array of pages and check for page validity. */
 	i = 0;
-	for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) {
+	for_each_sg_page(sgt->sgl, &sg_iter, sgt->orig_nents, 0) {
 		struct page *page = sg_page_iter_page(&sg_iter);
 		/*
 		 * Check if page is valid: this can happen if we are given
-- 
1.9.1
[PATCH v4 27/38] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. To avoid such issues, lets use a common dma-mapping wrappers operating directly on the struct sg_table objects and use scatterlist page iterators where possible. This, almost always, hides references to the nents and orig_nents entries, making the code robust, easier to follow and copy/paste safe. Signed-off-by: Marek Szyprowski --- For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse' thread: https://lore.kernel.org/dri-devel/20200512085710.14688-1-m.szyprow...@samsung.com/T/ --- drivers/xen/gntdev-dmabuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 75d3bb9..ba6cad8 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, if (sgt) { if (gntdev_dmabuf_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - gntdev_dmabuf_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); + dma_unmap_sgtable(attach->dev, sgt, + gntdev_dmabuf_attach->dir, + DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sgt); } @@ -288,8 +287,8 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages, gntdev_dmabuf->nr_pages); if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { + if (dma_map_sgtable(attach->dev, sgt, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { sg_free_table(sgt); kfree(sgt); sgt = ERR_PTR(-ENOMEM); @@ -625,7 +624,7 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count) /* Now convert sgt to array of pages and check for page validity. */ i = 0; - for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) { + for_each_sgtable_page(sgt, &sg_iter, 0) { struct page *page = sg_page_iter_page(&sg_iter); /* * Check if page is valid: this can happen if we are given -- 1.9.1
[PATCH v5 27/38] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. To avoid such issues, lets use a common dma-mapping wrappers operating directly on the struct sg_table objects and use scatterlist page iterators where possible. This, almost always, hides references to the nents and orig_nents entries, making the code robust, easier to follow and copy/paste safe. Signed-off-by: Marek Szyprowski --- For more information, see '[PATCH v5 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse' thread: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/ --- drivers/xen/gntdev-dmabuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 75d3bb9..ba6cad8 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, if (sgt) { if (gntdev_dmabuf_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - gntdev_dmabuf_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); + dma_unmap_sgtable(attach->dev, sgt, + gntdev_dmabuf_attach->dir, + DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sgt); } @@ -288,8 +287,8 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages, gntdev_dmabuf->nr_pages); if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { + if (dma_map_sgtable(attach->dev, sgt, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { sg_free_table(sgt); kfree(sgt); sgt = ERR_PTR(-ENOMEM); @@ -625,7 +624,7 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count) /* Now convert sgt to array of pages and check for page validity. */ i = 0; - for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) { + for_each_sgtable_page(sgt, &sg_iter, 0) { struct page *page = sg_page_iter_page(&sg_iter); /* * Check if page is valid: this can happen if we are given -- 1.9.1
[PATCH v9 23/32] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. To avoid such issues, lets use a common dma-mapping wrappers operating directly on the struct sg_table objects and use scatterlist page iterators where possible. This, almost always, hides references to the nents and orig_nents entries, making the code robust, easier to follow and copy/paste safe. Signed-off-by: Marek Szyprowski Acked-by: Juergen Gross --- drivers/xen/gntdev-dmabuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index b1b6eebafd5d..4c13cbc99896 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, if (sgt) { if (gntdev_dmabuf_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - gntdev_dmabuf_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); + dma_unmap_sgtable(attach->dev, sgt, + gntdev_dmabuf_attach->dir, + DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sgt); } @@ -288,8 +287,8 @@ dmabuf_exp_ops_map_dma_buf(struct dma_buf_attachment *attach, sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages, gntdev_dmabuf->nr_pages); if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { + if (dma_map_sgtable(attach->dev, sgt, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { sg_free_table(sgt); kfree(sgt); sgt = ERR_PTR(-ENOMEM); @@ -633,7 +632,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, /* Now convert sgt to array of pages and check for page validity. */ i = 0; - for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) { + for_each_sgtable_page(sgt, &sg_iter, 0) { struct page *page = sg_page_iter_page(&sg_iter); /* * Check if page is valid: this can happen if we are given -- 2.17.1
[PATCH v9 22/32] drm: xen: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. Fix the code to refer to proper nents or orig_nents entries. This driver reports the number of the pages in the imported scatterlist, so it should refer to sg_table->orig_nents entry. Signed-off-by: Marek Szyprowski Acked-by: Oleksandr Andrushchenko --- drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c index 39ff95b75357..0e57c80058b2 100644 --- a/drivers/gpu/drm/xen/xen_drm_front_gem.c +++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c @@ -216,7 +216,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev, return ERR_PTR(ret); DRM_DEBUG("Imported buffer of size %zu with nents %u\n", - size, sgt->nents); + size, sgt->orig_nents); return &xen_obj->base; } -- 2.17.1
[PATCH v5 39/38] drm: xen: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. Fix the code to refer to proper nents or orig_nents entries. This driver reports the number of the pages in the imported scatterlist, so it should refer to sg_table->orig_nents entry. Signed-off-by: Marek Szyprowski Acked-by: Oleksandr Andrushchenko --- For more information, see '[PATCH v5 00/38] DRM: fix struct sg_table nents vs. orig_nents misuse' thread: https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/ This patch has been resurrected on Oleksandr Andrushchenko request. The patch was a part of v2 patchset, but then I've dropped it as it only fixes the debug output, thus I didn't consider it so critical. --- drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c index f0b85e0..ba4bdc5 100644 --- a/drivers/gpu/drm/xen/xen_drm_front_gem.c +++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c @@ -215,7 +215,7 @@ struct drm_gem_object * return ERR_PTR(ret); DRM_DEBUG("Imported buffer of size %zu with nents %u\n", - size, sgt->nents); + size, sgt->orig_nents); return &xen_obj->base; } -- 1.9.1
[PATCH v6 24/36] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. To avoid such issues, lets use a common dma-mapping wrappers operating directly on the struct sg_table objects and use scatterlist page iterators where possible. This, almost always, hides references to the nents and orig_nents entries, making the code robust, easier to follow and copy/paste safe. Signed-off-by: Marek Szyprowski Acked-by: Juergen Gross --- drivers/xen/gntdev-dmabuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 75d3bb948bf3..ba6cad871568 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, if (sgt) { if (gntdev_dmabuf_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - gntdev_dmabuf_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); + dma_unmap_sgtable(attach->dev, sgt, + gntdev_dmabuf_attach->dir, + DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sgt); } @@ -288,8 +287,8 @@ dmabuf_exp_ops_map_dma_buf(struct dma_buf_attachment *attach, sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages, gntdev_dmabuf->nr_pages); if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { + if (dma_map_sgtable(attach->dev, sgt, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { sg_free_table(sgt); kfree(sgt); sgt = ERR_PTR(-ENOMEM); @@ -625,7 +624,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, /* Now convert sgt to array of pages and check for page validity. */ i = 0; - for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) { + for_each_sgtable_page(sgt, &sg_iter, 0) { struct page *page = sg_page_iter_page(&sg_iter); /* * Check if page is valid: this can happen if we are given -- 2.17.1
[PATCH v6 36/36] drm: xen: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. Fix the code to refer to proper nents or orig_nents entries. This driver reports the number of the pages in the imported scatterlist, so it should refer to sg_table->orig_nents entry. Signed-off-by: Marek Szyprowski Acked-by: Oleksandr Andrushchenko --- drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c index f0b85e094111..ba4bdc5590ea 100644 --- a/drivers/gpu/drm/xen/xen_drm_front_gem.c +++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c @@ -215,7 +215,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev, return ERR_PTR(ret); DRM_DEBUG("Imported buffer of size %zu with nents %u\n", - size, sgt->nents); + size, sgt->orig_nents); return &xen_obj->base; } -- 2.17.1
[PATCH v7 25/36] xen: gntdev: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. To avoid such issues, lets use a common dma-mapping wrappers operating directly on the struct sg_table objects and use scatterlist page iterators where possible. This, almost always, hides references to the nents and orig_nents entries, making the code robust, easier to follow and copy/paste safe. Signed-off-by: Marek Szyprowski Acked-by: Juergen Gross --- drivers/xen/gntdev-dmabuf.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 75d3bb948bf3..ba6cad871568 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -247,10 +247,9 @@ static void dmabuf_exp_ops_detach(struct dma_buf *dma_buf, if (sgt) { if (gntdev_dmabuf_attach->dir != DMA_NONE) - dma_unmap_sg_attrs(attach->dev, sgt->sgl, - sgt->nents, - gntdev_dmabuf_attach->dir, - DMA_ATTR_SKIP_CPU_SYNC); + dma_unmap_sgtable(attach->dev, sgt, + gntdev_dmabuf_attach->dir, + DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sgt); } @@ -288,8 +287,8 @@ dmabuf_exp_ops_map_dma_buf(struct dma_buf_attachment *attach, sgt = dmabuf_pages_to_sgt(gntdev_dmabuf->pages, gntdev_dmabuf->nr_pages); if (!IS_ERR(sgt)) { - if (!dma_map_sg_attrs(attach->dev, sgt->sgl, sgt->nents, dir, - DMA_ATTR_SKIP_CPU_SYNC)) { + if (dma_map_sgtable(attach->dev, sgt, dir, + DMA_ATTR_SKIP_CPU_SYNC)) { sg_free_table(sgt); kfree(sgt); sgt = ERR_PTR(-ENOMEM); @@ -625,7 +624,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, /* Now convert sgt to array of pages and check for page validity. */ i = 0; - for_each_sg_page(sgt->sgl, &sg_iter, sgt->nents, 0) { + for_each_sgtable_page(sgt, &sg_iter, 0) { struct page *page = sg_page_iter_page(&sg_iter); /* * Check if page is valid: this can happen if we are given -- 2.17.1
[PATCH v7 24/36] drm: xen: fix common struct sg_table related issues
The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() function returns the number of the created entries in the DMA address space. However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and dma_unmap_sg must be called with the original number of the entries passed to the dma_map_sg(). struct sg_table is a common structure used for describing a non-contiguous memory buffer, used commonly in the DRM and graphics subsystems. It consists of a scatterlist with memory pages and DMA addresses (sgl entry), as well as the number of scatterlist entries: CPU pages (orig_nents entry) and DMA mapped pages (nents entry). It turned out that it was a common mistake to misuse nents and orig_nents entries, calling DMA-mapping functions with a wrong number of entries or ignoring the number of mapped entries returned by the dma_map_sg() function. Fix the code to refer to proper nents or orig_nents entries. This driver reports the number of the pages in the imported scatterlist, so it should refer to sg_table->orig_nents entry. Signed-off-by: Marek Szyprowski Acked-by: Oleksandr Andrushchenko --- drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/xen_drm_front_gem.c index f0b85e094111..ba4bdc5590ea 100644 --- a/drivers/gpu/drm/xen/xen_drm_front_gem.c +++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c @@ -215,7 +215,7 @@ xen_drm_front_gem_import_sg_table(struct drm_device *dev, return ERR_PTR(ret); DRM_DEBUG("Imported buffer of size %zu with nents %u\n", - size, sgt->nents); + size, sgt->orig_nents); return &xen_obj->base; } -- 2.17.1
Re: [PATCH 09/30] mtd_blkdevs: use blk_mq_alloc_disk
D(&new->rq_list); > > - new->tag_set = kzalloc(sizeof(*new->tag_set), GFP_KERNEL); > - if (!new->tag_set) > - goto error3; > - > - new->rq = blk_mq_init_sq_queue(new->tag_set, &mtd_mq_ops, 2, > - BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING); > - if (IS_ERR(new->rq)) { > - ret = PTR_ERR(new->rq); > - new->rq = NULL; > - goto error4; > - } > - > if (tr->flush) > blk_queue_write_cache(new->rq, true, false); > > - new->rq->queuedata = new; > blk_queue_logical_block_size(new->rq, tr->blksize); > > blk_queue_flag_set(QUEUE_FLAG_NONROT, new->rq); > @@ -437,13 +433,13 @@ int add_mtd_blktrans_dev(struct mtd_blktrans_dev *new) > WARN_ON(ret); > } > return 0; > -error4: > + > +out_free_tag_set: > + blk_mq_free_tag_set(new->tag_set); > +out_kfree_tag_set: > kfree(new->tag_set); > -error3: > - put_disk(new->disk); > -error2: > +out_list_del: > list_del(&new->list); > -error1: > return ret; > } > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
Re: [PATCH 09/30] mtd_blkdevs: use blk_mq_alloc_disk
Hi Christoph, On 15.06.2021 17:58, Christoph Hellwig wrote: > On Tue, Jun 15, 2021 at 05:47:44PM +0200, Marek Szyprowski wrote: >> On 02.06.2021 08:53, Christoph Hellwig wrote: >>> Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue >>> allocation. >>> >>> Signed-off-by: Christoph Hellwig >> This patch landed in linux-next as commit 6966bb921def ("mtd_blkdevs: >> use blk_mq_alloc_disk"). It causes the following regression on my QEMU >> arm64 setup: > Please try the patch below: > > diff --git a/drivers/mtd/mtd_blkdevs.c b/drivers/mtd/mtd_blkdevs.c > index 5dc4c966ea73..6ce4bc57f919 100644 > --- a/drivers/mtd/mtd_blkdevs.c > +++ b/drivers/mtd/mtd_blkdevs.c > @@ -382,6 +382,7 @@ int add_mtd_blktrans_dev(struct mtd_blktrans_dev *new) > } > > new->disk = gd; > + new->rq = new->disk->queue; > gd->private_data = new; > gd->major = tr->major; > gd->first_minor = (new->devnum) << tr->part_bits; Right, this fixes the issue, thanks. Feel free to add: Reported-by: Marek Szyprowski Tested-by: Marek Szyprowski Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
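The one-line fix above works because blk_mq_alloc_disk() hands back a gendisk
that already owns its request_queue, so the driver must take the queue from
the disk instead of allocating one itself. A minimal sketch of the resulting
allocation pattern (field names follow the mtd_blktrans_dev code quoted in
this thread, error handling is trimmed, and the exact blk_mq_alloc_disk()
calling convention may differ between kernel versions):

	gd = blk_mq_alloc_disk(new->tag_set, new);	/* allocates disk + queue */
	if (IS_ERR(gd))
		goto out_free_tag_set;

	new->disk = gd;
	new->rq = gd->queue;		/* the queue now comes with the disk */

	if (tr->flush)
		blk_queue_write_cache(new->rq, true, false);
	blk_queue_logical_block_size(new->rq, tr->blksize);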
Re: exynos-mixer 14450000.mixer: [drm:exynos_drm_register_dma] *ERROR* Device 14450000.mixer lacks support for IOMMU
On 01.11.2023 03:50, Chuck Zmudzinski wrote:
> On 10/31/2023 7:45 PM, Stefano Stabellini wrote:
>> Unfortunately there is no easy solution.
>>
>> Do you know the version of the SMMU available on the platform?
> I am trying to discern, but I doubt we have v3 because we are
> working on a very old chromebook from 2012, and I am finding
> patches for smmv3 in Linux not starting until 2015. It is good to
> know about this option, though, for future work we might do on newer
> devices.

Just to clarify: the Exynos SMMU is custom hardware designed by Samsung;
it is not based on ARM's SMMU. Linux has a separate driver for it.

> ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Re: [PATCH v4 bpf-next 1/2] mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.
On 05.03.2024 04:05, Alexei Starovoitov wrote: > From: Alexei Starovoitov > > There are various users of get_vm_area() + ioremap_page_range() APIs. > Enforce that get_vm_area() was requested as VM_IOREMAP type and range > passed to ioremap_page_range() matches created vm_area to avoid > accidentally ioremap-ing into wrong address range. > > Reviewed-by: Christoph Hellwig > Signed-off-by: Alexei Starovoitov > --- This patch landed in today's linux-next as commit 3e49a866c9dc ("mm: Enforce VM_IOREMAP flag and range in ioremap_page_range."). Unfortunately it triggers the following warning on all my test machines with PCI bridges. Here is an example reproduced with QEMU and ARM64 'virt' machine: pci-host-generic 401000.pcie: host bridge /pcie@1000 ranges: pci-host-generic 401000.pcie: IO 0x003eff..0x003eff -> 0x00 pci-host-generic 401000.pcie: MEM 0x001000..0x003efe -> 0x001000 pci-host-generic 401000.pcie: MEM 0x80..0xff -> 0x80 [ cut here ] vm_area at addr fbfffe80 is not marked as VM_IOREMAP WARNING: CPU: 0 PID: 1 at mm/vmalloc.c:315 ioremap_page_range+0x8c/0x174 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc6+ #14694 Hardware name: linux,dummy-virt (DT) pstate: 6005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : ioremap_page_range+0x8c/0x174 lr : ioremap_page_range+0x8c/0x174 sp : 800083faba10 ... Call trace: ioremap_page_range+0x8c/0x174 pci_remap_iospace+0x74/0x88 devm_pci_remap_iospace+0x54/0xac devm_of_pci_bridge_init+0x160/0x1fc devm_pci_alloc_host_bridge+0xb4/0xd0 pci_host_common_probe+0x44/0x1a0 platform_probe+0x68/0xd8 really_probe+0x148/0x2b4 __driver_probe_device+0x78/0x12c driver_probe_device+0xdc/0x164 __driver_attach+0x9c/0x1ac bus_for_each_dev+0x74/0xd4 driver_attach+0x24/0x30 bus_add_driver+0xe4/0x1e8 driver_register+0x60/0x128 __platform_driver_register+0x28/0x34 gen_pci_driver_init+0x1c/0x28 do_one_initcall+0x74/0x2f4 kernel_init_freeable+0x28c/0x4dc kernel_init+0x24/0x1dc ret_from_fork+0x10/0x20 irq event stamp: 74360 hardirqs last enabled at (74359): [] console_unlock+0x120/0x12c hardirqs last disabled at (74360): [] el1_dbg+0x24/0x8c softirqs last enabled at (71258): [] __do_softirq+0x4a0/0x4e8 softirqs last disabled at (71245): [] do_softirq+0x10/0x1c ---[ end trace ]--- pci-host-generic 401000.pcie: error -22: failed to map resource [io 0x-0x] pci-host-generic 401000.pcie: Memory resource size exceeds max for 32 bits pci-host-generic 401000.pcie: ECAM at [mem 0x401000-0x401fff] for [bus 00-ff] pci-host-generic 401000.pcie: PCI host bridge to bus :00 pci_bus :00: root bus resource [bus 00-ff] pci_bus :00: root bus resource [mem 0x1000-0x3efe] pci_bus :00: root bus resource [mem 0x80-0xff] pci :00:00.0: [1b36:0008] type 00 class 0x06 conventional PCI endpoint It looks that PCI related code must be somehow adjusted for this change. 
> mm/vmalloc.c | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index d12a17fc0c17..f42f98a127d5 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -307,8 +307,21 @@ static int vmap_range_noflush(unsigned long addr, > unsigned long end, > int ioremap_page_range(unsigned long addr, unsigned long end, > phys_addr_t phys_addr, pgprot_t prot) > { > + struct vm_struct *area; > int err; > > + area = find_vm_area((void *)addr); > + if (!area || !(area->flags & VM_IOREMAP)) { > + WARN_ONCE(1, "vm_area at addr %lx is not marked as > VM_IOREMAP\n", addr); > + return -EINVAL; > + } > + if (addr != (unsigned long)area->addr || > + (void *)end != area->addr + get_vm_area_size(area)) { > + WARN_ONCE(1, "ioremap request [%lx,%lx) doesn't match vm_area > [%lx, %lx)\n", > + addr, end, (long)area->addr, > + (long)area->addr + get_vm_area_size(area)); > + return -ERANGE; > + } > err = vmap_range_noflush(addr, end, phys_addr, pgprot_nx(prot), >ioremap_max_page_shift); > flush_cache_vmap(addr, end); Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
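For context, the calling pattern the new checks in ioremap_page_range()
expect is an area explicitly created with VM_IOREMAP and a mapping that
covers exactly that area, which is what the PCI I/O-space path above appears
not to do. The following is a minimal sketch under those assumptions; the
map_window() helper and its caller are hypothetical, size is assumed to be
page-aligned, and error handling is kept short:

	#include <linux/io.h>
	#include <linux/vmalloc.h>

	static void __iomem *map_window(phys_addr_t phys, unsigned long size)
	{
		struct vm_struct *area;
		unsigned long vaddr;

		/* The area must be created with VM_IOREMAP ... */
		area = get_vm_area(size, VM_IOREMAP);
		if (!area)
			return NULL;
		vaddr = (unsigned long)area->addr;

		/* ... and the mapped range must match it exactly, else -ERANGE. */
		if (ioremap_page_range(vaddr, vaddr + size, phys,
				       pgprot_noncached(PAGE_KERNEL))) {
			free_vm_area(area);
			return NULL;
		}
		return (void __iomem *)vaddr;
	}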
Re: [PATCH v3 3/3] PCI/MSI: Convert pci_msi_ignore_mask to per MSI domain flag
On 25.03.2025 10:20, Thomas Gleixner wrote: > On Tue, Mar 25 2025 at 09:11, Thomas Gleixner wrote: >> On Mon, Mar 24 2025 at 20:18, Roger Pau Monné wrote: >>> On Mon, Mar 24, 2025 at 07:58:14PM +0100, Daniel Gomez wrote: >>>> The issue is that info appears to be uninitialized. So, this worked for me: >>> Indeed, irq_domain->host_data is NULL, there's no msi_domain_info. As >>> this is x86, I was expecting x86 ot always use >>> x86_init_dev_msi_info(), but that doesn't seem to be the case. I >>> would like to better understand this. >> Indeed. On x86 this should not happen at all. On architectures, which do >> not use (hierarchical) interrupt domains, it will return NULL. >> >> So I really want to understand why this happens on x86 before such a >> "fix" is deployed. > So after staring at it some more it's clear. Without XEN, the domain > returned is the MSI parent domain, which is the vector domain in that > setup. That does not have a domain info set. But on legacy architectures > there is not even a domain. > > It's really wonderful that we have a gazillion ways to manage the > backends of PCI/MSI > > So none of the suggested pointer checks will cover it correctly. Though > there is already a function which allows to query MSI domain flags > independent of the underlying insanity. Sorry for not catching it in > review. > > Untested patch below. This fixes the panic observed on ARM64 RK3568-based Odroid-M1 board (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts) on next-20250325. Thanks! Feel free to add to the final patch: Tested-by: Marek Szyprowski > > Thanks, > > tglx > --- > drivers/pci/msi/msi.c | 18 ++ > 1 file changed, 6 insertions(+), 12 deletions(-) > > --- a/drivers/pci/msi/msi.c > +++ b/drivers/pci/msi/msi.c > @@ -285,8 +285,6 @@ static void pci_msi_set_enable(struct pc > static int msi_setup_msi_desc(struct pci_dev *dev, int nvec, > struct irq_affinity_desc *masks) > { > - const struct irq_domain *d = dev_get_msi_domain(&dev->dev); > - const struct msi_domain_info *info = d->host_data; > struct msi_desc desc; > u16 control; > > @@ -297,7 +295,7 @@ static int msi_setup_msi_desc(struct pci > /* Lies, damned lies, and MSIs */ > if (dev->dev_flags & PCI_DEV_FLAGS_HAS_MSI_MASKING) > control |= PCI_MSI_FLAGS_MASKBIT; > - if (info->flags & MSI_FLAG_NO_MASK) > + if (pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY)) > control &= ~PCI_MSI_FLAGS_MASKBIT; > > desc.nvec_used = nvec; > @@ -605,20 +603,18 @@ static void __iomem *msix_map_region(str >*/ > void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc) > { > - const struct irq_domain *d = dev_get_msi_domain(&dev->dev); > - const struct msi_domain_info *info = d->host_data; > - > desc->nvec_used = 1; > desc->pci.msi_attrib.is_msix= 1; > desc->pci.msi_attrib.is_64 = 1; > desc->pci.msi_attrib.default_irq= dev->irq; > desc->pci.mask_base = dev->msix_base; > - desc->pci.msi_attrib.can_mask = !(info->flags & > MSI_FLAG_NO_MASK) && > - > !desc->pci.msi_attrib.is_virtual; > > - if (desc->pci.msi_attrib.can_mask) { > + > + if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY) && > + !desc->pci.msi_attrib.is_virtual) { > void __iomem *addr = pci_msix_desc_addr(desc); > > + desc->pci.msi_attrib.can_mask = true; > desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL); > } > } > @@ -715,8 +711,6 @@ static int msix_setup_interrupts(struct > static int msix_capability_init(struct pci_dev *dev, struct msix_entry > *entries, > int nvec, struct irq_affinity *affd) > { > - const struct irq_domain *d = 
dev_get_msi_domain(&dev->dev); > - const struct msi_domain_info *info = d->host_data; > int ret, tsize; > u16 control; > > @@ -747,7 +741,7 @@ static int msix_capability_init(struct p > /* Disable INTX */ > pci_intx_for_msi(dev, 0); > > - if (!(info->flags & MSI_FLAG_NO_MASK)) { > + if (!pci_msi_domain_supports(dev, MSI_FLAG_NO_MASK, DENY_LEGACY)) { > /* >* Ensure that all table entries are masked to prevent >* stale entries from firing in a crash kernel. > > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
Re: [PATCH v1 00/16] dma-mapping: migrate to physical address-based API
On 07.08.2025 16:19, Jason Gunthorpe wrote: > On Mon, Aug 04, 2025 at 03:42:34PM +0300, Leon Romanovsky wrote: >> Changelog: >> v1: >> * Added new DMA_ATTR_MMIO attribute to indicate >> PCI_P2PDMA_MAP_THRU_HOST_BRIDGE path. >> * Rewrote dma_map_* functions to use thus new attribute >> v0: https://lore.kernel.org/all/cover.1750854543.git.l...@kernel.org/ >> >> >> This series refactors the DMA mapping to use physical addresses >> as the primary interface instead of page+offset parameters. This >> change aligns the DMA API with the underlying hardware reality where >> DMA operations work with physical addresses, not page structures. > Lets elaborate this as Robin asked: > > This series refactors the DMA mapping API to provide a phys_addr_t > based, and struct-page free, external API that can handle all the > mapping cases we want in modern systems: > > - struct page based cachable DRAM > - struct page MEMORY_DEVICE_PCI_P2PDMA PCI peer to peer non-cachable MMIO > - struct page-less PCI peer to peer non-cachable MMIO > - struct page-less "resource" MMIO > > Overall this gets much closer to Matthew's long term wish for > struct-pageless IO to cachable DRAM. The remaining primary work would > be in the mm side to allow kmap_local_pfn()/phys_to_virt() to work on > phys_addr_t without a struct page. > > The general design is to remove struct page usage entirely from the > DMA API inner layers. For flows that need to have a KVA for the > physical address they can use kmap_local_pfn() or phys_to_virt(). This > isolates the struct page requirements to MM code only. Long term all > removals of struct page usage are supporting Matthew's memdesc > project which seeks to substantially transform how struct page works. > > Instead make the DMA API internals work on phys_addr_t. Internally > there are still dedicated 'page' and 'resource' flows, except they are > now distinguished by a new DMA_ATTR_MMIO instead of by callchain. Both > flows use the same phys_addr_t. > > When DMA_ATTR_MMIO is specified things work similar to the existing > 'resource' flow. kmap_local_pfn(), phys_to_virt(), phys_to_page(), > pfn_valid(), etc are never called on the phys_addr_t. This requires > rejecting any configuration that would need swiotlb. CPU cache > flushing is not required, and avoided, as ATTR_MMIO also indicates the > address have no cachable mappings. This effectively removes any > DMA API side requirement to have struct page when DMA_ATTR_MMIO is > used. > > In the !DMA_ATTR_MMIO mode things work similarly to the 'page' flow, > except on the common path of no cache flush, no swiotlb it never > touches a struct page. When cache flushing or swiotlb copying > kmap_local_pfn()/phys_to_virt() are used to get a KVA for CPU > usage. This was already the case on the unmap side, now the map side > is symmetric. > > Callers are adjusted to set DMA_ATTR_MMIO. Existing 'resource' users > must set it. The existing struct page based MEMORY_DEVICE_PCI_P2PDMA > path must also set it. This corrects some existing bugs where iommu > mappings for P2P MMIO were improperly marked IOMMU_CACHE. > > Since ATTR_MMIO is made to work with all the existing DMA map entry > points, particularly dma_iova_link(), this finally allows a way to use > the new DMA API to map PCI P2P MMIO without creating struct page. The > VFIO DMABUF series demonstrates how this works. This is intended to > replace the incorrect driver use of dma_map_resource() on PCI BAR > addresses. > > This series does the core code and modern flows. 
A followup series > will give the same treatement to the legacy dma_ops implementation. Thanks for the elaborate description, that's something that was missing in the previous attempt. I read again all the previous discussion and this explanation and there are still two things that imho needs more clarification. First - basing the API on the phys_addr_t. Page based API had the advantage that it was really hard to abuse it and call for something that is not 'a normal RAM'. I initially though that phys_addr_t based API will somehow simplify arch specific implementation, as some of them indeed rely on phys_addr_t internally, but I missed other things pointed by Robin. Do we have here any alternative? Second - making dma_map_phys() a single API to handle all cases. Do we really need such single function to handle all cases? To handle P2P case, the caller already must pass DMA_ATTR_MMIO, so it must somehow keep such information internally. Cannot it just call existing dma_map_resource(), so there will be clear distinction between these 2 cases (DMA to RAM and P2P DMA)? Do we need additional check for DMA_ATTR_MMIO for every typical DMA user? I know that branching is cheap, but this will probably increase code size for most of the typical users for no reason. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
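To make the caller-visible model described above concrete, here is a minimal
sketch. dma_map_phys() and DMA_ATTR_MMIO are the interfaces introduced by
this series (the call signature matches the hunks quoted later in this
thread); the map_region() helper and its is_p2p_mmio argument are
hypothetical:

	#include <linux/dma-mapping.h>

	static dma_addr_t map_region(struct device *dev, phys_addr_t paddr,
				     size_t len, enum dma_data_direction dir,
				     bool is_p2p_mmio)
	{
		/*
		 * DMA_ATTR_MMIO: no struct page, no CPU cache maintenance and
		 * no swiotlb bouncing; without it, the usual cacheable-DRAM
		 * path is taken.
		 */
		unsigned long attrs = is_p2p_mmio ? DMA_ATTR_MMIO : 0;

		return dma_map_phys(dev, paddr, len, dir, attrs);
	}

Judging from the unmap hunks quoted later in the thread, the matching
dma_unmap_phys() call has to be made with the same attrs, since the
DMA_ATTR_MMIO path also skips cache maintenance and swiotlb on unmap.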
Re: [PATCH v4 03/16] dma-debug: refactor to use physical addresses for page mapping
, dir, addr, attrs); > > return addr; > } > @@ -194,7 +194,7 @@ void dma_unmap_page_attrs(struct device *dev, dma_addr_t > addr, size_t size, > else > ops->unmap_page(dev, addr, size, dir, attrs); > trace_dma_unmap_page(dev, addr, size, dir, attrs); > - debug_dma_unmap_page(dev, addr, size, dir); > + debug_dma_unmap_phys(dev, addr, size, dir); > } > EXPORT_SYMBOL(dma_unmap_page_attrs); > > @@ -712,7 +712,8 @@ struct page *dma_alloc_pages(struct device *dev, size_t > size, > if (page) { > trace_dma_alloc_pages(dev, page_to_virt(page), *dma_handle, > size, dir, gfp, 0); > - debug_dma_map_page(dev, page, 0, size, dir, *dma_handle, 0); > + debug_dma_map_phys(dev, page_to_phys(page), size, dir, > +*dma_handle, 0); > } else { > trace_dma_alloc_pages(dev, NULL, 0, size, dir, gfp, 0); > } > @@ -738,7 +739,7 @@ void dma_free_pages(struct device *dev, size_t size, > struct page *page, > dma_addr_t dma_handle, enum dma_data_direction dir) > { > trace_dma_free_pages(dev, page_to_virt(page), dma_handle, size, dir, 0); > - debug_dma_unmap_page(dev, dma_handle, size, dir); > + debug_dma_unmap_phys(dev, dma_handle, size, dir); > __dma_free_pages(dev, size, page, dma_handle, dir); > } > EXPORT_SYMBOL_GPL(dma_free_pages); Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
Re: [PATCH v4 00/16] dma-mapping: migrate to physical address-based API
On 29.08.2025 15:16, Jason Gunthorpe wrote:
> On Tue, Aug 19, 2025 at 08:36:44PM +0300, Leon Romanovsky wrote:
>
>> This series does the core code and modern flows. A followup series
>> will give the same treatment to the legacy dma_ops implementation.
> I took a quick check over this to see that it is sane. I think using
> phys is an improvement for most of the dma_ops implementations.
>
>    arch/sparc/kernel/pci_sun4v.c
>    arch/sparc/kernel/iommu.c
>      Uses __pa to get phys from the page, never touches page
>
>    arch/alpha/kernel/pci_iommu.c
>    arch/sparc/mm/io-unit.c
>    drivers/parisc/ccio-dma.c
>    drivers/parisc/sba_iommu.c
>      Does page_address() and later does __pa on it. Doesn't touch struct page
>
>    arch/x86/kernel/amd_gart_64.c
>    drivers/xen/swiotlb-xen.c
>    arch/mips/jazz/jazzdma.c
>      Immediately does page_to_phys(), never touches struct page
>
>    drivers/vdpa/vdpa_user/vduse_dev.c
>      Does page_to_phys() to call iommu_map()
>
>    drivers/xen/grant-dma-ops.c
>      Does page_to_pfn() and nothing else
>
>    arch/powerpc/platforms/ps3/system-bus.c
>      This is a maze but I think it wants only phys and the virt is only
>      used for debug prints.
>
> The above all never touch a KVA and just want a phys_addr_t.
>
> The below are touching the KVA somehow:
>
>    arch/sparc/mm/iommu.c
>    arch/arm/mm/dma-mapping.c
>      Uses page_address to cache flush, would be happy with phys_to_virt()
>      and a PhysHighMem()
>
>    arch/powerpc/kernel/dma-iommu.c
>    arch/powerpc/platforms/pseries/vio.c
>      Uses iommu_map_page() which wants phys_to_virt(), doesn't touch
>      struct page
>
>    arch/powerpc/platforms/pseries/ibmebus.c
>      Returns phys_to_virt() as dma_addr_t.
>
> The two PPC ones are weird, I didn't figure out how that was working..
>
> It would be easy to make map_phys patches for about half of these, in
> the first grouping. Doing so would also grant those arches
> map_resource capability.
>
> Overall I didn't think there was any reduction in maintainability in
> these places. Most are improvements eliminating code, and some are
> just switching to phys_to_virt() from page_address(), which we could
> further guard with DMA_ATTR_MMIO and a check for highmem.

Thanks for this summary. However, I would still like to get an answer to
a simple question: why can't all this work be replaced by a simple use of
dma_map_resource()? I've checked the most advertised use case in
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=dmabuf-vfio
and I still don't see the reason why it cannot be based on the
dma_map_resource() API. I'm aware of the little asymmetry of the client
calls in such a case, indeed it is not pretty, but this should work even
now:

	phys = phys_vec[i].paddr;
	if (is_mmio)
		dma_map_resource(phys, len, ...);
	else
		dma_map_page(phys_to_page(phys), offset_in_page(phys), ...);

What did I miss? I'm not against this rework, but I would really like to
know the rationale.

I know that the 2-step dma-mapping API also uses phys addresses and this
is the same direction. This patchset focuses only on the dma_map_page ->
dma_map_phys rework. There are also other interfaces, like
dma_alloc_pages(), and nothing has been proposed for them so far.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Re: [PATCH v5 07/16] dma-mapping: convert dma_direct_*map_page to be phys_addr_t based
sg->length, dir, attrs); > if (sg->dma_address == DMA_MAPPING_ERROR) { > ret = -EIO; > goto out_unmap; > diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h > index d2c0b7e632fc..3f4792910604 100644 > --- a/kernel/dma/direct.h > +++ b/kernel/dma/direct.h > @@ -80,42 +80,57 @@ static inline void dma_direct_sync_single_for_cpu(struct > device *dev, > arch_dma_mark_clean(paddr, size); > } > > -static inline dma_addr_t dma_direct_map_page(struct device *dev, > - struct page *page, unsigned long offset, size_t size, > - enum dma_data_direction dir, unsigned long attrs) > +static inline dma_addr_t dma_direct_map_phys(struct device *dev, > + phys_addr_t phys, size_t size, enum dma_data_direction dir, > + unsigned long attrs) > { > - phys_addr_t phys = page_to_phys(page) + offset; > - dma_addr_t dma_addr = phys_to_dma(dev, phys); > + dma_addr_t dma_addr; > > if (is_swiotlb_force_bounce(dev)) { > - if (is_pci_p2pdma_page(page)) > - return DMA_MAPPING_ERROR; > + if (attrs & DMA_ATTR_MMIO) > + goto err_overflow; > + > return swiotlb_map(dev, phys, size, dir, attrs); > } > > - if (unlikely(!dma_capable(dev, dma_addr, size, true)) || > - dma_kmalloc_needs_bounce(dev, size, dir)) { > - if (is_pci_p2pdma_page(page)) > - return DMA_MAPPING_ERROR; > - if (is_swiotlb_active(dev)) > - return swiotlb_map(dev, phys, size, dir, attrs); > - > - dev_WARN_ONCE(dev, 1, > - "DMA addr %pad+%zu overflow (mask %llx, bus limit > %llx).\n", > - &dma_addr, size, *dev->dma_mask, > dev->bus_dma_limit); > - return DMA_MAPPING_ERROR; > + if (attrs & DMA_ATTR_MMIO) { > + dma_addr = phys; > + if (unlikely(dma_capable(dev, dma_addr, size, false))) "!dma_capable(dev, dma_addr, size, false)" in the above line. It took me a while to find this after noticing that this patchset breaks booting some of me test systems. 
> + goto err_overflow; > + } else { > + dma_addr = phys_to_dma(dev, phys); > + if (unlikely(!dma_capable(dev, dma_addr, size, true)) || > + dma_kmalloc_needs_bounce(dev, size, dir)) { > + if (is_swiotlb_active(dev)) > + return swiotlb_map(dev, phys, size, dir, attrs); > + > + goto err_overflow; > + } > } > > - if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) > + if (!dev_is_dma_coherent(dev) && > + !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_MMIO))) > arch_sync_dma_for_device(phys, size, dir); > return dma_addr; > + > +err_overflow: > + dev_WARN_ONCE( > + dev, 1, > + "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n", > + &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit); > + return DMA_MAPPING_ERROR; > } > > -static inline void dma_direct_unmap_page(struct device *dev, dma_addr_t addr, > +static inline void dma_direct_unmap_phys(struct device *dev, dma_addr_t addr, > size_t size, enum dma_data_direction dir, unsigned long attrs) > { > - phys_addr_t phys = dma_to_phys(dev, addr); > + phys_addr_t phys; > + > + if (attrs & DMA_ATTR_MMIO) > + /* nothing to do: uncached and no swiotlb */ > + return; > > + phys = dma_to_phys(dev, addr); > if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC)) > dma_direct_sync_single_for_cpu(dev, addr, size, dir); > > diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c > index 58482536db9b..80481a873340 100644 > --- a/kernel/dma/mapping.c > +++ b/kernel/dma/mapping.c > @@ -166,8 +166,8 @@ dma_addr_t dma_map_page_attrs(struct device *dev, struct > page *page, > return DMA_MAPPING_ERROR; > > if (dma_map_direct(dev, ops) || > - arch_dma_map_page_direct(dev, phys + size)) > - addr = dma_direct_map_page(dev, page, offset, size, dir, attrs); > + arch_dma_map_phys_direct(dev, phys + size)) > + addr = dma_direct_map_phys(dev, phys, size, dir, attrs); > else if (use_dma_iommu(dev)) > addr = iommu_dma_map_phys(dev, phys, size, dir, attrs); > else > @@ -187,8 +187,8 @@ void dma_unmap_page_attrs(struct device *dev, dma_addr_t > addr, size_t size, > > BUG_ON(!valid_dma_direction(dir)); > if (dma_map_direct(dev, ops) || > - arch_dma_unmap_page_direct(dev, addr + size)) > - dma_direct_unmap_page(dev, addr, size, dir, attrs); > + arch_dma_unmap_phys_direct(dev, addr + size)) > + dma_direct_unmap_phys(dev, addr, size, dir, attrs); > else if (use_dma_iommu(dev)) > iommu_dma_unmap_phys(dev, addr, size, dir, attrs); > else Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
Re: [PATCH v4 00/16] dma-mapping: migrate to physical address-based API
code and modern flows. A followup series >> will give the same treatment to the legacy dma_ops implementation. >> >> Thanks >> >> Leon Romanovsky (16): >>dma-mapping: introduce new DMA attribute to indicate MMIO memory >>iommu/dma: implement DMA_ATTR_MMIO for dma_iova_link(). >>dma-debug: refactor to use physical addresses for page mapping >>dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys >>iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys >>iommu/dma: extend iommu_dma_*map_phys API to handle MMIO memory >>dma-mapping: convert dma_direct_*map_page to be phys_addr_t based >>kmsan: convert kmsan_handle_dma to use physical addresses >>dma-mapping: handle MMIO flow in dma_map|unmap_page >>xen: swiotlb: Open code map_resource callback >>dma-mapping: export new dma_*map_phys() interface >>mm/hmm: migrate to physical address-based DMA mapping API >>mm/hmm: properly take MMIO path >>block-dma: migrate to dma_map_phys instead of map_page >>block-dma: properly take MMIO path >>nvme-pci: unmap MMIO pages with appropriate interface >> >> Documentation/core-api/dma-api.rst| 4 +- >> Documentation/core-api/dma-attributes.rst | 18 >> arch/powerpc/kernel/dma-iommu.c | 4 +- >> block/blk-mq-dma.c| 15 ++- >> drivers/iommu/dma-iommu.c | 61 +-- >> drivers/nvme/host/pci.c | 18 +++- >> drivers/virtio/virtio_ring.c | 4 +- >> drivers/xen/swiotlb-xen.c | 21 +++- >> include/linux/blk-mq-dma.h| 6 +- >> include/linux/blk_types.h | 2 + >> include/linux/dma-direct.h| 2 - >> include/linux/dma-map-ops.h | 8 +- >> include/linux/dma-mapping.h | 33 ++ >> include/linux/iommu-dma.h | 11 +- >> include/linux/kmsan.h | 9 +- >> include/trace/events/dma.h| 9 +- >> kernel/dma/debug.c| 71 - >> kernel/dma/debug.h| 37 ++- >> kernel/dma/direct.c | 22 +--- >> kernel/dma/direct.h | 52 ++ >> kernel/dma/mapping.c | 117 +- >> kernel/dma/ops_helpers.c | 6 +- >> mm/hmm.c | 19 ++-- >> mm/kmsan/hooks.c | 5 +- >> rust/kernel/dma.rs| 3 + >> tools/virtio/linux/kmsan.h| 2 +- >> 26 files changed, 305 insertions(+), 254 deletions(-) > Marek, > > So what are the next steps here? This series is pre-requirement for the > VFIO MMIO patches. I waited a bit with a hope to get a comment from Robin. It looks that there is no other alternative for the phys addr in the struct page removal process. I would like to give those patches a try in linux-next, but in meantime I tested it on my test farm and found a regression in dma_map_resource() handling. Namely the dma_map_resource() is no longer possible with size not aligned to kmalloc()'ed buffer, as dma_direct_map_phys() calls dma_kmalloc_needs_bounce(), which in turn calls dma_kmalloc_size_aligned(). It looks that the check for !(attrs & DMA_ATTR_MMIO) should be moved one level up in dma_direct_map_phys(). Here is the log: [ cut here ] dma-pl330 fe55.dma-controller: DMA addr 0xfe410024+4 overflow (mask , bus limit 0). WARNING: kernel/dma/direct.h:116 at dma_map_phys+0x3a4/0x3ec, CPU#1: speaker-test/405 Modules linked in: ... CPU: 1 UID: 0 PID: 405 Comm: speaker-test Not tainted 6.17.0-rc4-next-20250901+ #10958 PREEMPT Hardware name: Hardkernel ODROID-M1 (DT) pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : dma_map_phys+0x3a4/0x3ec lr : dma_map_phys+0x3a4/0x3ec ... 
Call trace:
 dma_map_phys+0x3a4/0x3ec (P)
 dma_map_resource+0x14/0x20
 pl330_prep_slave_fifo+0x78/0xd0
 pl330_prep_dma_cyclic+0x70/0x2b0
 snd_dmaengine_pcm_trigger+0xec/0x8bc [snd_pcm_dmaengine]
 dmaengine_pcm_trigger+0x18/0x24 [snd_soc_core]
 snd_soc_pcm_component_trigger+0x164/0x208 [snd_soc_core]
 soc_pcm_trigger+0xe4/0x1ec [snd_soc_core]
 snd_pcm_do_start+0x44/0x70 [snd_pcm]
 snd_pcm_action_single+0x48/0xa4 [snd_pcm]
 snd_pcm_action+0x7c/0x98 [snd_pcm]
 snd_pcm_action_lock_irq+0x48/0xb4 [snd_pcm]
 snd_pcm_common_ioctl+0xf00/0x1f1c [snd_pcm]
 snd_pcm_ioctl+0x30/0x48 [snd_pcm]
 __arm64_sys_ioctl+0xac/0x104
 invoke_syscall+0x48/0x110
 el0_svc_common.constprop.0+0x40/0xe8
 do_el0_svc+0x20/0x2c
 el0_svc+0x4c/0x160
 el0t_64_sync_handler+0xa0/0xe4
 el0t_64_sync+0x198/0x19c
irq event stamp: 6596
hardirqs last enabled at (6595): [] _raw_spin_unlock_irqrestore+0x74/0x78
hardirqs last disabled at (6596): [] _raw_spin_lock_irq+0x78/0x7c
softirqs last enabled at (6076): [] handle_softirqs+0x4c4/0x4dc
softirqs last disabled at (6071): [] __do_softirq+0x14/0x20
---[ end trace ]---
rockchip-i2s-tdm fe41.i2s: ASoC error (-12): at soc_component_trigger() on fe41.i2s

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
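To make the suggested rearrangement concrete, here is a sketch of dma_direct_map_phys() with the dma_kmalloc_needs_bounce() check confined to the non-MMIO branch. It is based on the v4 hunk quoted earlier in this thread; the opening of the DMA_ATTR_MMIO branch is reconstructed and therefore approximate, so treat it as an illustration of the idea rather than the actual fix.

static inline dma_addr_t dma_direct_map_phys(struct device *dev,
		phys_addr_t phys, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	dma_addr_t dma_addr;

	if (attrs & DMA_ATTR_MMIO) {
		/* MMIO: no swiotlb, no kmalloc-alignment bounce check */
		dma_addr = phys;
		if (unlikely(!dma_capable(dev, dma_addr, size, false)))
			goto err_overflow;
	} else {
		dma_addr = phys_to_dma(dev, phys);
		if (unlikely(!dma_capable(dev, dma_addr, size, true)) ||
		    dma_kmalloc_needs_bounce(dev, size, dir)) {
			if (is_swiotlb_active(dev))
				return swiotlb_map(dev, phys, size, dir, attrs);

			goto err_overflow;
		}
	}

	if (!dev_is_dma_coherent(dev) &&
	    !(attrs & (DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_MMIO)))
		arch_sync_dma_for_device(phys, size, dir);
	return dma_addr;

err_overflow:
	dev_WARN_ONCE(dev, 1,
		      "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
		      &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
	return DMA_MAPPING_ERROR;
}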
Re: [PATCH v4 14/16] block-dma: migrate to dma_map_phys instead of map_page
On 19.08.2025 19:36, Leon Romanovsky wrote:
> From: Leon Romanovsky
>
> After the introduction of dma_map_phys(), there is no need to convert
> from a physical address to a struct page in order to map a page. So
> let's use it directly.
>
> Signed-off-by: Leon Romanovsky
> ---
>  block/blk-mq-dma.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
> index ad283017caef..37e2142be4f7 100644
> --- a/block/blk-mq-dma.c
> +++ b/block/blk-mq-dma.c
> @@ -87,8 +87,8 @@ static bool blk_dma_map_bus(struct blk_dma_iter *iter, struct phys_vec *vec)
>  static bool blk_dma_map_direct(struct request *req, struct device *dma_dev,
>  		struct blk_dma_iter *iter, struct phys_vec *vec)
>  {
> -	iter->addr = dma_map_page(dma_dev, phys_to_page(vec->paddr),
> -			offset_in_page(vec->paddr), vec->len, rq_dma_dir(req));
> +	iter->addr = dma_map_phys(dma_dev, vec->paddr, vec->len,
> +			rq_dma_dir(req), 0);
>  	if (dma_mapping_error(dma_dev, iter->addr)) {
>  		iter->status = BLK_STS_RESOURCE;
>  		return false;

I wonder where the corresponding dma_unmap_page() call is, and where its
change to dma_unmap_phys() happens...

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
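For illustration of what the matching unmap-side conversion would look like (the real unmap call sits elsewhere in the block layer and is not part of the quoted hunk; example_unmap_segment() is a made-up helper, and the dma_unmap_phys() signature is assumed to mirror dma_unmap_page() plus an attrs argument):

/* Hypothetical helper, not actual blk-mq code. */
static void example_unmap_segment(struct device *dma_dev, dma_addr_t addr,
				  size_t len, enum dma_data_direction dir)
{
	/* Before the series this would have been dma_unmap_page(). */
	/* The attrs value must mirror what was passed to dma_map_phys(). */
	dma_unmap_phys(dma_dev, addr, len, dir, 0);
}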
Re: [PATCH v6 03/16] dma-debug: refactor to use physical addresses for page mapping
On 10.09.2025 07:26, Leon Romanovsky wrote:
> On Tue, Sep 09, 2025 at 10:37:48PM +0300, Leon Romanovsky wrote:
>> On Tue, Sep 09, 2025 at 04:27:31PM +0300, Leon Romanovsky wrote:
>>> From: Leon Romanovsky
>>
>> <...>
>>
>>>  include/linux/page-flags.h | 1 +
>>
>> <...>
>>
>>> --- a/include/linux/page-flags.h
>>> +++ b/include/linux/page-flags.h
>>> @@ -614,6 +614,7 @@ FOLIO_FLAG(dropbehind, FOLIO_HEAD_PAGE)
>>>   * available at this point.
>>>   */
>>>  #define PageHighMem(__p)	is_highmem_idx(page_zonenum(__p))
>>> +#define PhysHighMem(__p)	(PageHighMem(phys_to_page(__p)))
>>
>> Adding PhysHighMem() here was not such a great idea because of the
>> "else" branch below, which unfolds into a maze of macros and
>> automatically generated functions with a "static inline int
>> Page##uname ..." signature.
>>
>>>  #define folio_test_highmem(__f)	is_highmem_idx(folio_zonenum(__f))
>>>  #else
>>>  PAGEFLAG_FALSE(HighMem, highmem)
>
> After sleeping over it, the following hunk will help:
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index dfbc4ba86bba2..2a1f346178024 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -614,11 +614,11 @@ FOLIO_FLAG(dropbehind, FOLIO_HEAD_PAGE)
>   * available at this point.
>   */
>  #define PageHighMem(__p)	is_highmem_idx(page_zonenum(__p))
> -#define PhysHighMem(__p)	(PageHighMem(phys_to_page(__p)))
>  #define folio_test_highmem(__f)	is_highmem_idx(folio_zonenum(__f))
>  #else
>  PAGEFLAG_FALSE(HighMem, highmem)
>  #endif
> +#define PhysHighMem(__p)	(PageHighMem(phys_to_page(__p)))
>
>  /* Does kmap_local_folio() only allow access to one page of the folio? */
>  #ifdef CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP

Okay, I will add this fixup while applying the patches.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
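For context on why relocating the define below the #endif is enough, here is a rough approximation (not the literal macro expansion from page-flags.h) of what the !CONFIG_HIGHMEM branch provides, which is why PhysHighMem() compiles in both configurations once it is placed after the #endif:

/*
 * Approximation only: in a !CONFIG_HIGHMEM build, PAGEFLAG_FALSE(HighMem,
 * highmem) generates a stub test function of roughly this shape.
 */
static inline int PageHighMem(const struct page *page)
{
	return 0;
}

/* With PageHighMem() available in both branches, this works everywhere: */
#define PhysHighMem(__p)	(PageHighMem(phys_to_page(__p)))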
Re: [PATCH v6 00/16] dma-mapping: migrate to physical address-based API
On 09.09.2025 15:27, Leon Romanovsky wrote:
> From: Leon Romanovsky
>
> Changelog:
> v6:
>  * Based on the "dma-debug: don't enforce dma mapping check on
>    noncoherent allocations" patch.
>  * Removed some unused variables from the kmsan conversion.
>  * Fixed a missed ! in the dma check.
> v5: https://lore.kernel.org/all/cover.1756822782.git.l...@kernel.org
>  * Added Jason's and Keith's Reviewed-by tags
>  * Fixed the DMA_ATTR_MMIO check in dma_direct_map_phys
>  * Jason's cleanup suggestions
> v4: https://lore.kernel.org/all/cover.1755624249.git.l...@kernel.org/
>  * Fixed a kbuild error caused by a mismatch in the kmsan function
>    declaration due to a rebase error.
> v3: https://lore.kernel.org/all/cover.1755193625.git.l...@kernel.org
>  * Fixed a typo in the word "cacheable"
>  * Simplified the kmsan patch to a plain argument refactoring
> v2: https://lore.kernel.org/all/cover.1755153054.git.l...@kernel.org
>  * Used commit messages and cover letter from Jason
>  * Moved setting the IOMMU_MMIO flag to the dma_info_to_prot function
>  * Micro-optimized the code
>  * Rebased code on v6.17-rc1
> v1: https://lore.kernel.org/all/cover.1754292567.git.l...@kernel.org
>  * Added the new DMA_ATTR_MMIO attribute to indicate the
>    PCI_P2PDMA_MAP_THRU_HOST_BRIDGE path.
>  * Rewrote the dma_map_* functions to use this new attribute
> v0: https://lore.kernel.org/all/cover.1750854543.git.l...@kernel.org/
>
>
> This series refactors the DMA mapping API to use physical addresses
> as the primary interface instead of page+offset parameters. This
> change aligns the DMA API with the underlying hardware reality, where
> DMA operations work with physical addresses, not page structures.
>
> The series maintains exported-symbol backward compatibility by keeping
> the old page-based API as wrapper functions around the new physical
> address-based implementations.
>
> This series refactors the DMA mapping API to provide a phys_addr_t
> based, and struct-page free, external API that can handle all the
> mapping cases we want in modern systems:
>
>  - struct page based cacheable DRAM
>  - struct page MEMORY_DEVICE_PCI_P2PDMA PCI peer-to-peer non-cacheable
>    MMIO
>  - struct page-less PCI peer-to-peer non-cacheable MMIO
>  - struct page-less "resource" MMIO
>
> Overall this gets much closer to Matthew's long-term wish for
> struct-pageless IO to cacheable DRAM. The primary remaining work would
> be on the mm side, to allow kmap_local_pfn()/phys_to_virt() to work on
> a phys_addr_t without a struct page.
>
> The general design is to remove struct page usage entirely from the
> DMA API inner layers. Flows that need a KVA for the physical address
> can use kmap_local_pfn() or phys_to_virt(). This isolates the struct
> page requirements to MM code only. Long term, all removals of struct
> page usage support Matthew's memdesc project, which seeks to
> substantially transform how struct page works.
>
> Instead, the DMA API internals are made to work on phys_addr_t.
> Internally there are still dedicated 'page' and 'resource' flows,
> except they are now distinguished by the new DMA_ATTR_MMIO attribute
> instead of by callchain. Both flows use the same phys_addr_t.
>
> When DMA_ATTR_MMIO is specified, things work similarly to the existing
> 'resource' flow. kmap_local_pfn(), phys_to_virt(), phys_to_page(),
> pfn_valid(), etc. are never called on the phys_addr_t. This requires
> rejecting any configuration that would need swiotlb. CPU cache
> flushing is not required, and avoided, as ATTR_MMIO also indicates
> that the address has no cacheable mappings.
> This effectively removes any DMA API side requirement to have a
> struct page when DMA_ATTR_MMIO is used.
>
> In the !DMA_ATTR_MMIO mode, things work similarly to the 'page' flow,
> except that on the common path of no cache flush and no swiotlb it
> never touches a struct page. When cache flushing or swiotlb copying is
> needed, kmap_local_pfn()/phys_to_virt() are used to get a KVA for CPU
> usage. This was already the case on the unmap side; now the map side
> is symmetric.
>
> Callers are adjusted to set DMA_ATTR_MMIO. Existing 'resource' users
> must set it. The existing struct page based MEMORY_DEVICE_PCI_P2PDMA
> path must also set it. This corrects some existing bugs where iommu
> mappings for P2P MMIO were improperly marked IOMMU_CACHE.
>
> Since ATTR_MMIO is made to work with all the existing DMA map entry
> points, particularly dma_iova_link(), this finally allows a way to use
> the new DMA API to map PCI P2P MMIO without creating struct page. The
> VFIO DMABUF series demonstrate