RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues
>-Original Message- >From: Robin Murphy >Sent: Tuesday, September 1, 2020 3:54 PM >To: Ruhl, Michael J ; Marek Szyprowski >; dri-de...@lists.freedesktop.org; >iommu@lists.linux-foundation.org; linaro-mm-...@lists.linaro.org; linux- >ker...@vger.kernel.org >Cc: Bartlomiej Zolnierkiewicz ; David Airlie >; intel-...@lists.freedesktop.org; Christoph Hellwig >; linux-arm-ker...@lists.infradead.org >Subject: Re: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct >sg_table related issues > >On 2020-09-01 20:38, Ruhl, Michael J wrote: >>> -Original Message- >>> From: Intel-gfx On Behalf Of >>> Marek Szyprowski >>> Sent: Wednesday, August 26, 2020 2:33 AM >>> To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; >>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org >>> Cc: Bartlomiej Zolnierkiewicz ; David Airlie >>> ; intel-...@lists.freedesktop.org; Robin Murphy >>> ; Christoph Hellwig ; linux-arm- >>> ker...@lists.infradead.org; Marek Szyprowski >>> >>> Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table >>> related issues >>> >>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() >>> function >>> returns the number of the created entries in the DMA address space. >>> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and >>> dma_unmap_sg must be called with the original number of the entries >>> passed to the dma_map_sg(). >>> >>> struct sg_table is a common structure used for describing a non-contiguous >>> memory buffer, used commonly in the DRM and graphics subsystems. It >>> consists of a scatterlist with memory pages and DMA addresses (sgl entry), >>> as well as the number of scatterlist entries: CPU pages (orig_nents entry) >>> and DMA mapped pages (nents entry). 
>>>
>>> It turned out that it was a common mistake to misuse nents and orig_nents
>>> entries, calling DMA-mapping functions with a wrong number of entries or
>>> ignoring the number of mapped entries returned by the dma_map_sg()
>>> function.
>>>
>>> This driver creatively uses sg_table->orig_nents to store the size of the
>>> allocated scatterlist and ignores the number of the entries returned by
>>> dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
>>> free the (over)allocated scatterlist.
>>>
>>> This patch only introduces the common DMA-mapping wrappers operating
>>> directly on the struct sg_table objects to the dmabuf related functions,
>>> so the other drivers, which might share buffers with i915 could rely on
>>> the properly set nents and orig_nents values.
>>>
>>> Signed-off-by: Marek Szyprowski
>>> ---
>>>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c       | 11 +++
>>>  drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
>>>  2 files changed, 6 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index 2679380159fc..8a988592715b 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
>>> 		src = sg_next(src);
>>> 	}
>>>
>>> -	if (!dma_map_sg_attrs(attachment->dev,
>>> -			      st->sgl, st->nents, dir,
>>> -			      DMA_ATTR_SKIP_CPU_SYNC)) {
>>> -		ret = -ENOMEM;
>>
>> You have dropped this error value.
>>
>> Do you know if this is a benign loss?
>
>True, dma_map_sgtable() will return -EINVAL rather than -ENOMEM for
>failure. A quick look through other .map_dma_buf callbacks suggests
>they're returning a motley mix of error values and NULL for failure
>cases, so I'd imagine that importers shouldn't be too sensitive to the
>exact value.
I followed some of our code through to see if anyone is checking for -ENOMEM...

I have found in some test paths...

However, it is not clear to me if we can get to those paths from here.

Anyways,

Reviewed-by: Michael J. Ruhl

Mike

>Robin.
>
>>
>> M
>>
>>> +	ret = dma_map_sgtable(attachment->dev, st, dir,
>>> DMA_ATTR_SKIP_CPU_SYNC);
>>> +	if (ret)
>>> 		goto err_free_sg;
>>> -	}
>>>
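
For readers following the error-value question, here is a minimal userspace sketch of the two return conventions being compared. It is a model, not kernel code: every `sketch_` name and the `fail` knob are made up for illustration. It assumes only what the thread states, namely that the old call site mapped a zero return to -ENOMEM itself, while the dma_map_sgtable() wrapper reports -EINVAL on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sg_table; only the two counters matter here. */
struct sketch_sg_table {
	unsigned int nents;      /* entries after DMA mapping */
	unsigned int orig_nents; /* entries originally allocated */
};

/*
 * Stand-in for dma_map_sg_attrs(): returns the number of mapped entries,
 * or 0 on failure. The 'fail' flag is an illustrative knob, not a real
 * parameter of any kernel function.
 */
static int sketch_map_sg(unsigned int nents, int fail)
{
	return fail ? 0 : (int)nents;
}

/* Old call-site convention: the driver picks the errno itself. */
static int sketch_old_style_map(struct sketch_sg_table *st, int fail)
{
	assert(st != NULL);
	if (!sketch_map_sg(st->nents, fail))
		return -ENOMEM;
	return 0;
}

/* Wrapper convention: 0 on success, -EINVAL on failure, nents updated. */
static int sketch_map_sgtable(struct sketch_sg_table *st, int fail)
{
	int nents;

	assert(st != NULL);
	nents = sketch_map_sg(st->orig_nents, fail);
	if (nents <= 0)
		return -EINVAL;
	st->nents = (unsigned int)nents;
	return 0;
}
```

A caller that only tests for a nonzero return behaves the same under both conventions; only a caller that compares against -ENOMEM specifically would notice the change, which is the case Mike went looking for.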
RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues
>-Original Message- >From: Intel-gfx On Behalf Of >Marek Szyprowski >Sent: Wednesday, August 26, 2020 2:33 AM >To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; >linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org >Cc: Bartlomiej Zolnierkiewicz ; David Airlie >; intel-...@lists.freedesktop.org; Robin Murphy >; Christoph Hellwig ; linux-arm- >ker...@lists.infradead.org; Marek Szyprowski > >Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table >related issues > >The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg() >function >returns the number of the created entries in the DMA address space. >However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and >dma_unmap_sg must be called with the original number of the entries >passed to the dma_map_sg(). > >struct sg_table is a common structure used for describing a non-contiguous >memory buffer, used commonly in the DRM and graphics subsystems. It >consists of a scatterlist with memory pages and DMA addresses (sgl entry), >as well as the number of scatterlist entries: CPU pages (orig_nents entry) >and DMA mapped pages (nents entry). > >It turned out that it was a common mistake to misuse nents and orig_nents >entries, calling DMA-mapping functions with a wrong number of entries or >ignoring the number of mapped entries returned by the dma_map_sg() >function. > >This driver creatively uses sg_table->orig_nents to store the size of the >allocated scatterlist and ignores the number of the entries returned by >dma_map_sg function. The sg_table->orig_nents is (mis)used to properly >free the (over)allocated scatterlist. > >This patch only introduces the common DMA-mapping wrappers operating >directly on the struct sg_table objects to the dmabuf related functions, >so the other drivers, which might share buffers with i915 could rely on >the properly set nents and orig_nents values. 
>
>Signed-off-by: Marek Szyprowski
>---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c       | 11 +++
> drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
> 2 files changed, 6 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>index 2679380159fc..8a988592715b 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct
>dma_buf_attachment *attachme
> 		src = sg_next(src);
> 	}
>
>-	if (!dma_map_sg_attrs(attachment->dev,
>-			      st->sgl, st->nents, dir,
>-			      DMA_ATTR_SKIP_CPU_SYNC)) {
>-		ret = -ENOMEM;

You have dropped this error value.

Do you know if this is a benign loss?

M

>+	ret = dma_map_sgtable(attachment->dev, st, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>+	if (ret)
> 		goto err_free_sg;
>-	}
>
> 	return st;
>
>@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
> {
> 	struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
>
>-	dma_unmap_sg_attrs(attachment->dev,
>-			   sg->sgl, sg->nents, dir,
>-			   DMA_ATTR_SKIP_CPU_SYNC);
>+	dma_unmap_sgtable(attachment->dev, sg, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
> 	sg_free_table(sg);
> 	kfree(sg);
>
>diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>index debaf7b18ab5..be30b27e2926 100644
>--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct
>dma_buf_attachment *attachment,
> 		sg = sg_next(sg);
> 	}
>
>-	if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
>-		err = -ENOMEM;
>+	err = dma_map_sgtable(attachment->dev, st, dir, 0);
>+	if (err)
> 		goto err_st;
>-	}
>
> 	return st;
>
>@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
> 			       struct sg_table *st,
> 			       enum dma_data_direction
dir)
> {
>-	dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
>+	dma_unmap_sgtable(attachment->dev, st, dir, 0);
> 	sg_free_table(st);
> 	kfree(st);
> }
>--
>2.17.1
>
>___
>Intel-gfx mailing list
>intel-...@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
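
The nents versus orig_nents distinction described in the commit message can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: the `model_` types and functions are hypothetical, and the mapping step simply coalesces all entries into one DMA segment, roughly as an IOMMU might. It shows the intended pairing: map and unmap use orig_nents, while walking the DMA side uses nents.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of one scatterlist entry. */
struct model_sg {
	unsigned long dma_addr; /* DMA address of the segment starting here */
	unsigned int len;       /* CPU-side length of this entry */
	unsigned int dma_len;   /* DMA-side length of the segment starting here */
};

/* Hypothetical model of struct sg_table. */
struct model_sg_table {
	struct model_sg *sgl;
	unsigned int nents;      /* DMA-mapped entries (set by the map step) */
	unsigned int orig_nents; /* CPU entries originally in the list */
};

/*
 * Model mapping step: coalesce all n CPU entries into a single DMA segment
 * at 'iova' and return the number of DMA entries created (0 on failure).
 */
static unsigned int model_dma_map(struct model_sg *sgl, unsigned int n,
				  unsigned long iova)
{
	unsigned int i, total = 0;

	if (n == 0)
		return 0;
	for (i = 0; i < n; i++)
		total += sgl[i].len;
	sgl[0].dma_addr = iova;
	sgl[0].dma_len = total;
	return 1;
}

/* Map with orig_nents, remember the returned count in nents. */
static int model_map_sgtable(struct model_sg_table *t, unsigned long iova)
{
	unsigned int nents;

	assert(t && t->sgl);
	nents = model_dma_map(t->sgl, t->orig_nents, iova);
	if (nents == 0)
		return -EINVAL;
	t->nents = nents;
	return 0;
}

/* Unmap and sync must be given the count originally passed to the map step. */
static unsigned int model_unmap_count(const struct model_sg_table *t)
{
	return t->orig_nents;
}

/* Walking the DMA side uses nents, the count the map step returned. */
static unsigned long model_dma_bytes(const struct model_sg_table *t)
{
	unsigned long bytes = 0;
	unsigned int i;

	for (i = 0; i < t->nents; i++)
		bytes += t->sgl[i].dma_len;
	return bytes;
}
```

In this model, three 4096-byte CPU entries end up as one 12288-byte DMA segment, so nents becomes 1 while orig_nents stays 3. Passing nents to the unmap step would be the mistake the commit message describes.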
RE: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers
>-Original Message- >From: Marek Szyprowski >Sent: Tuesday, May 12, 2020 4:34 PM >To: Ruhl, Michael J ; dri- >de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; linaro-mm- >s...@lists.linaro.org; linux-ker...@vger.kernel.org >Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz >; David Airlie ; linux- >me...@vger.kernel.org; Hans Verkuil ; Mauro >Carvalho Chehab ; Robin Murphy >; Christoph Hellwig ; linux-arm- >ker...@lists.infradead.org >Subject: Re: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist >wrappers > >Hi Michael, > >On 12.05.2020 19:52, Ruhl, Michael J wrote: >>> -Original Message- >>> From: dri-devel On Behalf Of >>> Marek Szyprowski >>> Sent: Tuesday, May 12, 2020 5:01 AM >>> To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; >>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org >>> Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz >>> ; David Airlie ; linux- >>> me...@vger.kernel.org; Hans Verkuil ; Mauro >>> Carvalho Chehab ; Robin Murphy >>> ; Christoph Hellwig ; linux-arm- >>> ker...@lists.infradead.org; Marek Szyprowski >>> >>> Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist >wrappers >>> >>> Use recently introduced common wrappers operating directly on the struct >>> sg_table objects and scatterlist page iterators to make the code a bit >>> more compact, robust, easier to follow and copy/paste safe. >>> >>> No functional change, because the code already properly did all the >>> scaterlist related calls. >>> >>> Signed-off-by: Marek Szyprowski >>> --- >>> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents >>> vs. 
orig_nents misuse' thread:
>>> https://lore.kernel.org/dri-devel/20200512085710.14688-1-
>>> m.szyprow...@samsung.com/T/
>>> ---
>>>  .../media/common/videobuf2/videobuf2-dma-contig.c  | 41 ++-
>---
>>>
>>>  drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 32 +++-
>---
>>> --
>>>  drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++
>>>  3 files changed, 34 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> index d3a3ee5..bf31a9d 100644
>>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> @@ -48,16 +48,15 @@ struct vb2_dc_buf {
>>>
>>>  static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>>>  {
>>> -	struct scatterlist *s;
>>>  	dma_addr_t expected = sg_dma_address(sgt->sgl);
>>> -	unsigned int i;
>>> +	struct sg_dma_page_iter dma_iter;
>>>  	unsigned long size = 0;
>>>
>>> -	for_each_sg(sgt->sgl, s, sgt->nents, i) {
>>> -		if (sg_dma_address(s) != expected)
>>> +	for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>>> +		if (sg_page_iter_dma_address(&dma_iter) != expected)
>>>  			break;
>>> -		expected = sg_dma_address(s) + sg_dma_len(s);
>>> -		size += sg_dma_len(s);
>>> +		expected += PAGE_SIZE;
>>> +		size += PAGE_SIZE;
>> This code is in drm_prime_get_contiguous_size and here. I seem to remember seeing
>> the same pattern in other drivers.
>>
>> Would it be worthwhile to make this a helper as well?
>I think I've identified such patterns in all DRM drivers and replaced
>with a common helper. So far I have no idea where to put such helper to
>make it available for media/videobuf2, so those few lines are indeed
>duplicated here.

I was thinking of drivers outside of DRM/media. Specifically RDMA.

However, looking at that code, I see that my memory was a little off. It is working with continuous pages, but not finding the size.

>> Also, isn't the sg_dma_len() the actual length of the chunk we are looking at?
>>
>> If it is not PAGE_SIZE (i.e. the DMA chunk is 4 * PAGE_SIZE), does your
>> loop/calculation still work?
>
>scatterlist page iterators (for_each_sg_page/for_each_sg_dma_page and
>their sgtable variants) always operate on PAGE_SIZE units. They
>correctly handle larger sg_dma_len().

Ahh, ok, I see.

Thank you!

Mike

>
>Best regards
>--
>Marek Szyprowski, PhD
>Samsung R&D Institute Poland
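
Marek's answer can be checked with a small userspace model that computes the contiguous size both ways: once per DMA segment (like the removed for_each_sg() loop) and once in page-sized steps (like the new for_each_sgtable_dma_page() loop). This is only a sketch: `MODEL_PAGE_SIZE`, `struct model_dma_seg` and both functions are hypothetical names, and it assumes every segment length is a multiple of the page size.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL /* assumed page size for this model */

/* Hypothetical model of one DMA-mapped segment. */
struct model_dma_seg {
	unsigned long dma_addr;
	unsigned long dma_len;
};

/* Per-segment walk, shaped like the removed for_each_sg() loop. */
static unsigned long contig_size_by_entry(const struct model_dma_seg *segs,
					  size_t n)
{
	unsigned long expected, size = 0;
	size_t i;

	if (n == 0)
		return 0;
	expected = segs[0].dma_addr;
	for (i = 0; i < n; i++) {
		if (segs[i].dma_addr != expected)
			break;
		expected = segs[i].dma_addr + segs[i].dma_len;
		size += segs[i].dma_len;
	}
	return size;
}

/*
 * Per-page walk, modelling a DMA page iterator: each segment is visited in
 * MODEL_PAGE_SIZE steps, so a segment of 4 pages contributes 4 iterations.
 */
static unsigned long contig_size_by_page(const struct model_dma_seg *segs,
					 size_t n)
{
	unsigned long expected, off, size = 0;
	size_t i;

	if (n == 0)
		return 0;
	expected = segs[0].dma_addr;
	for (i = 0; i < n; i++) {
		/* Model assumption: segment lengths are page multiples. */
		assert(segs[i].dma_len % MODEL_PAGE_SIZE == 0);
		for (off = 0; off < segs[i].dma_len; off += MODEL_PAGE_SIZE) {
			if (segs[i].dma_addr + off != expected)
				return size;
			expected += MODEL_PAGE_SIZE;
			size += MODEL_PAGE_SIZE;
		}
	}
	return size;
}
```

With a 4-page segment at 0x10000, a 1-page segment at 0x14000 and a 1-page segment at 0x20000, both functions stop at the gap before 0x20000 and report 5 pages. The per-page loop visits the 4-page segment four times, which is why adding PAGE_SIZE per iteration gives the same total.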
RE: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers
>-Original Message- >From: dri-devel On Behalf Of >Marek Szyprowski >Sent: Tuesday, May 12, 2020 5:01 AM >To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; >linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org >Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz >; David Airlie ; linux- >me...@vger.kernel.org; Hans Verkuil ; Mauro >Carvalho Chehab ; Robin Murphy >; Christoph Hellwig ; linux-arm- >ker...@lists.infradead.org; Marek Szyprowski > >Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers > >Use recently introduced common wrappers operating directly on the struct >sg_table objects and scatterlist page iterators to make the code a bit >more compact, robust, easier to follow and copy/paste safe. > >No functional change, because the code already properly did all the >scaterlist related calls. > >Signed-off-by: Marek Szyprowski >--- >For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents >vs. orig_nents misuse' thread: >https://lore.kernel.org/dri-devel/20200512085710.14688-1- >m.szyprow...@samsung.com/T/ >--- > .../media/common/videobuf2/videobuf2-dma-contig.c | 41 ++ > > drivers/media/common/videobuf2/videobuf2-dma-sg.c | 32 +++ >-- > drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++ > 3 files changed, 34 insertions(+), 51 deletions(-) > >diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c >b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >index d3a3ee5..bf31a9d 100644 >--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c >+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c >@@ -48,16 +48,15 @@ struct vb2_dc_buf { > > static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt) > { >- struct scatterlist *s; > dma_addr_t expected = sg_dma_address(sgt->sgl); >- unsigned int i; >+ struct sg_dma_page_iter dma_iter; > unsigned long size = 0; > >- for_each_sg(sgt->sgl, s, sgt->nents, i) { >- if (sg_dma_address(s) != expected) >+ 
for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>+		if (sg_page_iter_dma_address(&dma_iter) != expected)
> 			break;
>-		expected = sg_dma_address(s) + sg_dma_len(s);
>-		size += sg_dma_len(s);
>+		expected += PAGE_SIZE;
>+		size += PAGE_SIZE;

This code is in drm_prime_get_contiguous_size and here. I seem to remember seeing
the same pattern in other drivers.

Would it be worthwhile to make this a helper as well?

Also, isn't the sg_dma_len() the actual length of the chunk we are looking at?

If it is not PAGE_SIZE (i.e. the DMA chunk is 4 * PAGE_SIZE), does your
loop/calculation still work?

Thanks,

Mike

> 	}
> 	return size;
> }
>@@ -99,8 +98,7 @@ static void vb2_dc_prepare(void *buf_priv)
> 	if (!sgt || buf->db_attach)
> 		return;
>
>-	dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
>-			       buf->dma_dir);
>+	dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> }
>
> static void vb2_dc_finish(void *buf_priv)
>@@ -112,7 +110,7 @@ static void vb2_dc_finish(void *buf_priv)
> 	if (!sgt || buf->db_attach)
> 		return;
>
>-	dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf->dma_dir);
>+	dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> }
>
> /*/
>@@ -273,8 +271,8 @@ static void vb2_dc_dmabuf_ops_detach(struct
>dma_buf *dbuf,
> 	 * memory locations do not require any explicit cache
> 	 * maintenance prior or after being used by the device.
>*/ >- dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt- >>orig_nents, >- attach->dma_dir, >DMA_ATTR_SKIP_CPU_SYNC); >+ dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir, >+DMA_ATTR_SKIP_CPU_SYNC); > sg_free_table(sgt); > kfree(attach); > db_attach->priv = NULL; >@@ -299,8 +297,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( > > /* release any previous cache */ > if (attach->dma_dir != DMA_NONE) { >- dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt- >>orig_nents, >- attach->dma_dir, >DMA_ATTR_SKIP_CPU_SYNC); >+ dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir, >+DMA_ATTR_SKIP_CPU_SYNC); > attach->dma_dir = DMA_NONE; > } > >@@ -308,9 +306,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( >* mapping to the client with new direction, no cache sync >* required see comment in vb2_dc_dmabuf_ops_detach() >*/ >- sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt- >>orig_nents, >-dma_dir, DMA_ATTR_SKIP_CPU_SYNC); >- if (!sgt->nents) { >+ if (dm