RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Ruhl, Michael J
>-Original Message-
>From: Robin Murphy 
>Sent: Tuesday, September 1, 2020 3:54 PM
>To: Ruhl, Michael J ; Marek Szyprowski
>; dri-de...@lists.freedesktop.org;
>iommu@lists.linux-foundation.org; linaro-mm-...@lists.linaro.org; linux-
>ker...@vger.kernel.org
>Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>; intel-...@lists.freedesktop.org; Christoph Hellwig
>; linux-arm-ker...@lists.infradead.org
>Subject: Re: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct
>sg_table related issues
>
>On 2020-09-01 20:38, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: Intel-gfx  On Behalf Of
>>> Marek Szyprowski
>>> Sent: Wednesday, August 26, 2020 2:33 AM
>>> To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org;
>>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>>> Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>>> ; intel-...@lists.freedesktop.org; Robin Murphy
>>> ; Christoph Hellwig ; linux-arm-
>>> ker...@lists.infradead.org; Marek Szyprowski
>>> 
>>> Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table
>>> related issues
>>>
>>> The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
>>> function
>>> returns the number of the created entries in the DMA address space.
>>> However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>>> dma_unmap_sg must be called with the original number of the entries
>>> passed to the dma_map_sg().
>>>
>>> struct sg_table is a common structure used for describing a non-contiguous
>>> memory buffer, used commonly in the DRM and graphics subsystems. It
>>> consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>>> as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>>> and DMA mapped pages (nents entry).
>>>
>>> It turned out that it was a common mistake to misuse nents and orig_nents
>>> entries, calling DMA-mapping functions with a wrong number of entries or
>>> ignoring the number of mapped entries returned by the dma_map_sg()
>>> function.
>>>
>>> This driver creatively uses sg_table->orig_nents to store the size of the
>>> allocated scatterlist and ignores the number of the entries returned by
>>> dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
>>> free the (over)allocated scatterlist.
>>>
>>> This patch only introduces the common DMA-mapping wrappers operating
>>> directly on the struct sg_table objects to the dmabuf related functions,
>>> so the other drivers, which might share buffers with i915 could rely on
>>> the properly set nents and orig_nents values.
>>>
>>> Signed-off-by: Marek Szyprowski 
>>> ---
>>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
>>> drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
>>> 2 files changed, 6 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> index 2679380159fc..8a988592715b 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>>> @@ -48,12 +48,9 @@ static struct sg_table
>*i915_gem_map_dma_buf(struct
>>> dma_buf_attachment *attachme
>>> src = sg_next(src);
>>> }
>>>
>>> -   if (!dma_map_sg_attrs(attachment->dev,
>>> - st->sgl, st->nents, dir,
>>> - DMA_ATTR_SKIP_CPU_SYNC)) {
>>> -   ret = -ENOMEM;
>>
>> You have dropped this error value.
>>
>> Do you know if this is a benign loss?
>
>True, dma_map_sgtable() will return -EINVAL rather than -ENOMEM for
>failure. A quick look through other .map_dma_buf callbacks suggests
>they're returning a motley mix of error values and NULL for failure
>cases, so I'd imagine that importers shouldn't be too sensitive to the
>exact value.

I followed some of our code through to see if anyone is checking for -ENOMEM...

I have found some checks in test paths. However, it is not clear to me whether
we can reach those paths from here.

Anyways,

Reviewed-by: Michael J. Ruhl 

Mike
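
For readers following the error-value question, the difference between the
removed check and the new wrapper can be modelled in a few lines of userspace
C. This is only a sketch: every toy_* name is a hypothetical stand-in, nothing
here is kernel code, and the -EINVAL behaviour is taken from Robin's
description above.

```c
#include <assert.h>
#include <errno.h>

/* Userspace stand-in for struct sg_table; only the two counters matter. */
struct toy_sg_table {
	unsigned int nents;      /* entries after DMA mapping */
	unsigned int orig_nents; /* entries originally allocated */
};

/* Stand-in for dma_map_sg_attrs(): mapped entry count, or 0 on failure. */
static int toy_map_sg(unsigned int count, int fail)
{
	return fail ? 0 : (int)count;
}

/* Old open-coded pattern from the removed lines: failure becomes -ENOMEM. */
static int toy_map_old(struct toy_sg_table *sgt, int fail)
{
	assert(sgt->orig_nents > 0);
	if (!toy_map_sg(sgt->orig_nents, fail))
		return -ENOMEM;
	return 0;
}

/* Wrapper-style pattern: 0 on success, -EINVAL on failure, nents recorded. */
static int toy_map_sgtable(struct toy_sg_table *sgt, int fail)
{
	int nents;

	assert(sgt->orig_nents > 0);
	nents = toy_map_sg(sgt->orig_nents, fail);
	if (nents <= 0)
		return -EINVAL;
	sgt->nents = (unsigned int)nents;
	return 0;
}
```

A caller that only tests for a non-zero return behaves the same with either
variant; only a caller that compares against -ENOMEM specifically would notice
the change.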

>Robin.
>
>>
>> M
>>
>>> +   ret = dma_map_sgtable(attachment->dev, st, dir,
>>> DMA_ATTR_SKIP_CPU_SYNC);
>>> +   if (ret)
>>> goto err_free_sg;
>>> -   }

RE: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table related issues

2020-09-01 Thread Ruhl, Michael J
>-Original Message-
>From: Intel-gfx  On Behalf Of
>Marek Szyprowski
>Sent: Wednesday, August 26, 2020 2:33 AM
>To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org;
>linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>Cc: Bartlomiej Zolnierkiewicz ; David Airlie
>; intel-...@lists.freedesktop.org; Robin Murphy
>; Christoph Hellwig ; linux-arm-
>ker...@lists.infradead.org; Marek Szyprowski
>
>Subject: [Intel-gfx] [PATCH v9 08/32] drm: i915: fix common struct sg_table
>related issues
>
>The Documentation/DMA-API-HOWTO.txt states that the dma_map_sg()
>function
>returns the number of the created entries in the DMA address space.
>However the subsequent calls to the dma_sync_sg_for_{device,cpu}() and
>dma_unmap_sg must be called with the original number of the entries
>passed to the dma_map_sg().
>
>struct sg_table is a common structure used for describing a non-contiguous
>memory buffer, used commonly in the DRM and graphics subsystems. It
>consists of a scatterlist with memory pages and DMA addresses (sgl entry),
>as well as the number of scatterlist entries: CPU pages (orig_nents entry)
>and DMA mapped pages (nents entry).
>
>It turned out that it was a common mistake to misuse nents and orig_nents
>entries, calling DMA-mapping functions with a wrong number of entries or
>ignoring the number of mapped entries returned by the dma_map_sg()
>function.
>
>This driver creatively uses sg_table->orig_nents to store the size of the
>allocated scatterlist and ignores the number of the entries returned by
>dma_map_sg function. The sg_table->orig_nents is (mis)used to properly
>free the (over)allocated scatterlist.
>
>This patch only introduces the common DMA-mapping wrappers operating
>directly on the struct sg_table objects to the dmabuf related functions,
>so the other drivers, which might share buffers with i915 could rely on
>the properly set nents and orig_nents values.
>
>Signed-off-by: Marek Szyprowski 
>---
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 11 +++
> drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c |  7 +++
> 2 files changed, 6 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>index 2679380159fc..8a988592715b 100644
>--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
>@@ -48,12 +48,9 @@ static struct sg_table *i915_gem_map_dma_buf(struct
>dma_buf_attachment *attachme
>   src = sg_next(src);
>   }
>
>-  if (!dma_map_sg_attrs(attachment->dev,
>-st->sgl, st->nents, dir,
>-DMA_ATTR_SKIP_CPU_SYNC)) {
>-  ret = -ENOMEM;

You have dropped this error value.

Do you know if this is a benign loss?

M
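
The counting rule from the commit message (map, sync and unmap with the
original entry count; walk DMA addresses with the count returned by the
mapping call) can be illustrated with a small userspace model. This is a
sketch under stated assumptions: the model_* names and the fixed merge factor
are hypothetical, and real IOMMU merging behaviour depends on the platform.

```c
#include <assert.h>

/* Userspace stand-in for struct sg_table; only the two counters matter. */
struct model_sg_table {
	unsigned int nents;      /* DMA-mapped entries */
	unsigned int orig_nents; /* CPU-side entries */
};

/* Pretend IOMMU: merges every `merge` CPU entries into one DMA entry. */
static unsigned int model_dma_map_sg(unsigned int nents_in, unsigned int merge)
{
	assert(merge > 0);
	return (nents_in + merge - 1) / merge;
}

/* Map using the CPU-side count and remember the DMA-side count. */
static void model_map(struct model_sg_table *sgt, unsigned int merge)
{
	sgt->nents = model_dma_map_sg(sgt->orig_nents, merge);
}

/* Count that the sync and unmap calls must be given. */
static unsigned int model_unmap_count(const struct model_sg_table *sgt)
{
	return sgt->orig_nents;
}

/* Count to use when walking the DMA addresses. */
static unsigned int model_dma_walk_count(const struct model_sg_table *sgt)
{
	return sgt->nents;
}
```

With eight CPU entries merged four at a time, the model walks two DMA entries
but still unmaps with eight, which is the distinction the sg_table wrappers
encode so that callers cannot mix the two counts up.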

>+  ret = dma_map_sgtable(attachment->dev, st, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>+  if (ret)
>   goto err_free_sg;
>-  }
>
>   return st;
>
>@@ -73,9 +70,7 @@ static void i915_gem_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
> {
>   struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment-
>>dmabuf);
>
>-  dma_unmap_sg_attrs(attachment->dev,
>- sg->sgl, sg->nents, dir,
>- DMA_ATTR_SKIP_CPU_SYNC);
>+  dma_unmap_sgtable(attachment->dev, sg, dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>   sg_free_table(sg);
>   kfree(sg);
>
>diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>index debaf7b18ab5..be30b27e2926 100644
>--- a/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>+++ b/drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c
>@@ -28,10 +28,9 @@ static struct sg_table *mock_map_dma_buf(struct
>dma_buf_attachment *attachment,
>   sg = sg_next(sg);
>   }
>
>-  if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
>-  err = -ENOMEM;
>+  err = dma_map_sgtable(attachment->dev, st, dir, 0);
>+  if (err)
>   goto err_st;
>-  }
>
>   return st;
>
>@@ -46,7 +45,7 @@ static void mock_unmap_dma_buf(struct
>dma_buf_attachment *attachment,
>  struct sg_table *st,
>  enum dma_data_direction dir)
> {
>-  dma_unmap_sg(attachment->dev, st->sgl, st->nents, dir);
>+  dma_unmap_sgtable(attachment->dev, st, dir, 0);
>   sg_free_table(st);
>   kfree(st);
> }
>--
>2.17.1
>
>___
>Intel-gfx mailing list
>intel-...@lists.freedesktop.org
>https://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers

2020-05-13 Thread Ruhl, Michael J
>-Original Message-
>From: Marek Szyprowski 
>Sent: Tuesday, May 12, 2020 4:34 PM
>To: Ruhl, Michael J ; dri-
>de...@lists.freedesktop.org; iommu@lists.linux-foundation.org; linaro-mm-
>s...@lists.linaro.org; linux-ker...@vger.kernel.org
>Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz
>; David Airlie ; linux-
>me...@vger.kernel.org; Hans Verkuil ; Mauro
>Carvalho Chehab ; Robin Murphy
>; Christoph Hellwig ; linux-arm-
>ker...@lists.infradead.org
>Subject: Re: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist
>wrappers
>
>Hi Michael,
>
>On 12.05.2020 19:52, Ruhl, Michael J wrote:
>>> -Original Message-
>>> From: dri-devel  On Behalf Of
>>> Marek Szyprowski
>>> Sent: Tuesday, May 12, 2020 5:01 AM
>>> To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org;
>>> linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>>> Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz
>>> ; David Airlie ; linux-
>>> me...@vger.kernel.org; Hans Verkuil ; Mauro
>>> Carvalho Chehab ; Robin Murphy
>>> ; Christoph Hellwig ; linux-arm-
>>> ker...@lists.infradead.org; Marek Szyprowski
>>> 
>>> Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist
>wrappers
>>>
>>> Use recently introduced common wrappers operating directly on the struct
>>> sg_table objects and scatterlist page iterators to make the code a bit
>>> more compact, robust, easier to follow and copy/paste safe.
>>>
>>> No functional change, because the code already properly did all the
>>> scatterlist related calls.
>>>
>>> Signed-off-by: Marek Szyprowski 
>>> ---
>>> For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
>>> vs. orig_nents misuse' thread:
>>> https://lore.kernel.org/dri-devel/20200512085710.14688-1-
>>> m.szyprow...@samsung.com/T/
>>> ---
>>> .../media/common/videobuf2/videobuf2-dma-contig.c  | 41 ++-
>>> drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 32 +++-
>>> drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++
>>> 3 files changed, 34 insertions(+), 51 deletions(-)
>>>
>>> diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> index d3a3ee5..bf31a9d 100644
>>> --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>>> @@ -48,16 +48,15 @@ struct vb2_dc_buf {
>>>
>>> static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>>> {
>>> -   struct scatterlist *s;
>>> dma_addr_t expected = sg_dma_address(sgt->sgl);
>>> -   unsigned int i;
>>> +   struct sg_dma_page_iter dma_iter;
>>> unsigned long size = 0;
>>>
>>> -   for_each_sg(sgt->sgl, s, sgt->nents, i) {
>>> -   if (sg_dma_address(s) != expected)
>>> +   for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>>> +   if (sg_page_iter_dma_address(&dma_iter) != expected)
>>> break;
>>> -   expected = sg_dma_address(s) + sg_dma_len(s);
>>> -   size += sg_dma_len(s);
>>> +   expected += PAGE_SIZE;
>>> +   size += PAGE_SIZE;
>> This code is in drm_prime_get_contiguous_size and here. I seem to remember
>> seeing the same pattern in other drivers.
>>
>> Would it be worthwhile to make this a helper as well?
>I think I've identified such patterns in all DRM drivers and replaced
>them with a common helper. So far I have no idea where to put such a helper
>to make it available for media/videobuf2, so those few lines are indeed
>duplicated here.

I was thinking of drivers outside of DRM/media.  Specifically RDMA.

However, looking at that code, I see that my memory was a little off.
It is working with contiguous pages, but not finding the size.

>> Also, isn't sg_dma_len() the actual length of the chunk we are looking at?
>>
>> If it is not PAGE_SIZE (i.e. the DMA chunk is 4 * PAGE_SIZE), does your
>> loop/calculation still work?
>
>Scatterlist page iterators (for_each_sg_page/for_each_sg_dma_page and
>their sgtable variants) always operate on PAGE_SIZE units. They
>correctly handle larger sg_dma_len() values.

Ahh, ok, I see. 

Thank you!

Mike

>
>Best regards
>--
>Marek Szyprowski, PhD
>Samsung R&D Institute Poland
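
Marek's point about PAGE_SIZE units can be checked with a userspace model
that computes the contiguous size both ways: once per entry (as in the removed
loop) and once per page (as in the new loop). This is only a sketch. The
model_dma_chunk type, MODEL_PAGE_SIZE and both function names are hypothetical
stand-ins, and chunk lengths are assumed to be page-size multiples.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL

/* One DMA-mapped chunk: bus address and length (a page-size multiple). */
struct model_dma_chunk {
	unsigned long addr;
	unsigned long len;
};

/* Per-entry walk, in the style of the removed for_each_sg() loop. */
static unsigned long contig_size_by_entry(const struct model_dma_chunk *c,
					  size_t n)
{
	unsigned long expected = n ? c[0].addr : 0;
	unsigned long size = 0;

	for (size_t i = 0; i < n; i++) {
		if (c[i].addr != expected)
			break;
		expected = c[i].addr + c[i].len;
		size += c[i].len;
	}
	return size;
}

/* Per-page walk: a chunk of N pages is visited N times, one page each. */
static unsigned long contig_size_by_page(const struct model_dma_chunk *c,
					 size_t n)
{
	unsigned long expected = n ? c[0].addr : 0;
	unsigned long size = 0;

	for (size_t i = 0; i < n; i++) {
		assert(c[i].len % MODEL_PAGE_SIZE == 0);
		for (unsigned long off = 0; off < c[i].len;
		     off += MODEL_PAGE_SIZE) {
			if (c[i].addr + off != expected)
				return size;
			expected += MODEL_PAGE_SIZE;
			size += MODEL_PAGE_SIZE;
		}
	}
	return size;
}
```

For a four-page chunk followed by an adjacent one-page chunk and then a gap,
both walks stop at the gap and report five pages, which is why advancing by
PAGE_SIZE per iteration still works for chunks larger than a page.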



RE: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers

2020-05-12 Thread Ruhl, Michael J



>-Original Message-
>From: dri-devel  On Behalf Of
>Marek Szyprowski
>Sent: Tuesday, May 12, 2020 5:01 AM
>To: dri-de...@lists.freedesktop.org; iommu@lists.linux-foundation.org;
>linaro-mm-...@lists.linaro.org; linux-ker...@vger.kernel.org
>Cc: Pawel Osciak ; Bartlomiej Zolnierkiewicz
>; David Airlie ; linux-
>me...@vger.kernel.org; Hans Verkuil ; Mauro
>Carvalho Chehab ; Robin Murphy
>; Christoph Hellwig ; linux-arm-
>ker...@lists.infradead.org; Marek Szyprowski
>
>Subject: [PATCH v4 38/38] videobuf2: use sgtable-based scatterlist wrappers
>
>Use recently introduced common wrappers operating directly on the struct
>sg_table objects and scatterlist page iterators to make the code a bit
>more compact, robust, easier to follow and copy/paste safe.
>
>No functional change, because the code already properly did all the
>scatterlist related calls.
>
>Signed-off-by: Marek Szyprowski 
>---
>For more information, see '[PATCH v4 00/38] DRM: fix struct sg_table nents
>vs. orig_nents misuse' thread:
>https://lore.kernel.org/dri-devel/20200512085710.14688-1-
>m.szyprow...@samsung.com/T/
>---
> .../media/common/videobuf2/videobuf2-dma-contig.c  | 41 ++
> drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 32 +++
> drivers/media/common/videobuf2/videobuf2-vmalloc.c | 12 +++
> 3 files changed, 34 insertions(+), 51 deletions(-)
>
>diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>index d3a3ee5..bf31a9d 100644
>--- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>+++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c
>@@ -48,16 +48,15 @@ struct vb2_dc_buf {
>
> static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
> {
>-  struct scatterlist *s;
>   dma_addr_t expected = sg_dma_address(sgt->sgl);
>-  unsigned int i;
>+  struct sg_dma_page_iter dma_iter;
>   unsigned long size = 0;
>
>-  for_each_sg(sgt->sgl, s, sgt->nents, i) {
>-  if (sg_dma_address(s) != expected)
>+  for_each_sgtable_dma_page(sgt, &dma_iter, 0) {
>+  if (sg_page_iter_dma_address(&dma_iter) != expected)
>   break;
>-  expected = sg_dma_address(s) + sg_dma_len(s);
>-  size += sg_dma_len(s);
>+  expected += PAGE_SIZE;
>+  size += PAGE_SIZE;

This code is in drm_prime_get_contiguous_size and here. I seem to remember
seeing the same pattern in other drivers.

Would it be worthwhile to make this a helper as well?

Also, isn't sg_dma_len() the actual length of the chunk we are looking at?

If it is not PAGE_SIZE (i.e. the DMA chunk is 4 * PAGE_SIZE), does your
loop/calculation still work?

Thanks,

Mike

>   }
>   return size;
> }
>@@ -99,8 +98,7 @@ static void vb2_dc_prepare(void *buf_priv)
>   if (!sgt || buf->db_attach)
>   return;
>
>-  dma_sync_sg_for_device(buf->dev, sgt->sgl, sgt->orig_nents,
>- buf->dma_dir);
>+  dma_sync_sgtable_for_device(buf->dev, sgt, buf->dma_dir);
> }
>
> static void vb2_dc_finish(void *buf_priv)
>@@ -112,7 +110,7 @@ static void vb2_dc_finish(void *buf_priv)
>   if (!sgt || buf->db_attach)
>   return;
>
>-  dma_sync_sg_for_cpu(buf->dev, sgt->sgl, sgt->orig_nents, buf-
>>dma_dir);
>+  dma_sync_sgtable_for_cpu(buf->dev, sgt, buf->dma_dir);
> }
>
> /*/
>@@ -273,8 +271,8 @@ static void vb2_dc_dmabuf_ops_detach(struct
>dma_buf *dbuf,
>* memory locations do not require any explicit cache
>* maintenance prior or after being used by the device.
>*/
>-  dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt-
>>orig_nents,
>- attach->dma_dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>+  dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
>+DMA_ATTR_SKIP_CPU_SYNC);
>   sg_free_table(sgt);
>   kfree(attach);
>   db_attach->priv = NULL;
>@@ -299,8 +297,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>
>   /* release any previous cache */
>   if (attach->dma_dir != DMA_NONE) {
>-  dma_unmap_sg_attrs(db_attach->dev, sgt->sgl, sgt-
>>orig_nents,
>- attach->dma_dir,
>DMA_ATTR_SKIP_CPU_SYNC);
>+  dma_unmap_sgtable(db_attach->dev, sgt, attach->dma_dir,
>+DMA_ATTR_SKIP_CPU_SYNC);
>   attach->dma_dir = DMA_NONE;
>   }
>
>@@ -308,9 +306,8 @@ static struct sg_table *vb2_dc_dmabuf_ops_map(
>* mapping to the client with new direction, no cache sync
>* required see comment in vb2_dc_dmabuf_ops_detach()
>*/
>-  sgt->nents = dma_map_sg_attrs(db_attach->dev, sgt->sgl, sgt-
>>orig_nents,
>-dma_dir, DMA_ATTR_SKIP_CPU_SYNC);
>-  if (!sgt->nents) {
>+  if (dm