Re: [PATCH] media: venus: use contig vb2 ops

2021-03-01 Thread Tomasz Figa
On Mon, Mar 1, 2021 at 7:22 PM Stanimir Varbanov
 wrote:
>
>
>
> On 3/1/21 11:23 AM, Tomasz Figa wrote:
> > Hi Alex, Stanimir,
> >
> > On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa  wrote:
> >>
> >> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  
> >> wrote:
> >>>
> >>> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
>  Hi Tomasz,
> 
>  On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> >  wrote:
> >>
> >> Hi,
> >>
> >> Cc: Robin
> >>
> >> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> >>> This driver uses the SG vb2 ops, but effectively only ever accesses 
> >>> the
> >>> first entry of the SG table, indicating that it expects a flat layout.
> >>> Switch it to use the contiguous ops to make sure this expected 
> >>> invariant
> >>
> >> Under what circumstances the sg table will has nents > 1? I came down 
> >> to
> >> [1] but not sure I got it right.
> >>
> >> I'm afraid that for systems with low amount of system memory and when
> >> the memory become fragmented, the driver will not work. That's why I
> >> started with sg allocator.
> >
> > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > in terms of the DMA (aka IOVA) address space. In other words, it
> > guarantees that having one DMA address and length fully describes the
> 
>  Ahh, I missed that part. Looks like I misunderstood videobu2 contig
>  allocator.
> >>>
> >>> I'm learning everyday too, but I'm surprised I don't see a call to
> >>> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have 
> >>> missed
> >>> a patch when overlooking this thread) ?
> >>>
> >>> The reason I'm asking, doc says it should be called by driver supporting 
> >>> IOMMU,
> >>> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, 
> >>> mtk-
> >>> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst 
> >>> case it's
> >>> all covered and we are good, otherwise perhaps a downstream patch didn't 
> >>> make it
> >>> ?
> >>>
> >>> /**
> >>>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >>>  * @dev:device for configuring DMA parameters
> >>>  * @size:   size of DMA max segment size to set
> >>>  *
> >>>  * To allow mapping the scatter-list into a single chunk in the DMA
> >>>  * address space, the device is required to have the DMA max segment
> >>>  * size parameter set to a value larger than the buffer size. Otherwise,
> >>>  * the DMA-mapping subsystem will split the mapping into max segment
> >>>  * size chunks. This function sets the DMA max segment size
> >>>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >>>  * address space.
> >>>  * This code assumes that the DMA-mapping subsystem will merge all
> >>>  * scatterlist segments if this is really possible (for example when
> >>>  * an IOMMU is available and enabled).
> >>>  * Ideally, this parameter should be set by the generic bus code, but it
> >>>  * is left with the default 64KiB value due to historical litmiations in
> >>>  * other subsystems (like limited USB host drivers) and there no good
> >>>  * place to set it to the proper value.
> >>>  * This function should be called from the drivers, which are known to
> >>>  * operate on platforms with IOMMU and provide access to shared buffers
> >>>  * (either USERPTR or DMABUF). This should be done before initializing
> >>>  * videobuf2 queue.
> >>>  */
> >>
> >> It does call dma_set_max_seg_size() directly:
> >> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
> >>
> >> Actually, why do we even need a vb2 helper for this?
> >>
> >
> > What's the plan for this patch?
>
> It will be part of v5.12.

Great, thanks!


Re: [PATCH] media: venus: use contig vb2 ops

2021-03-01 Thread Stanimir Varbanov



On 3/1/21 11:23 AM, Tomasz Figa wrote:
> Hi Alex, Stanimir,
> 
> On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa  wrote:
>>
>> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  
>> wrote:
>>>
>>> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
 Hi Tomasz,

 On 12/15/20 1:47 PM, Tomasz Figa wrote:
> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
>  wrote:
>>
>> Hi,
>>
>> Cc: Robin
>>
>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>>> first entry of the SG table, indicating that it expects a flat layout.
>>> Switch it to use the contiguous ops to make sure this expected invariant
>>
>> Under what circumstances the sg table will has nents > 1? I came down to
>> [1] but not sure I got it right.
>>
>> I'm afraid that for systems with low amount of system memory and when
>> the memory become fragmented, the driver will not work. That's why I
>> started with sg allocator.
>
> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> in terms of the DMA (aka IOVA) address space. In other words, it
> guarantees that having one DMA address and length fully describes the

 Ahh, I missed that part. Looks like I misunderstood videobu2 contig
 allocator.
>>>
>>> I'm learning everyday too, but I'm surprised I don't see a call to
>>> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have 
>>> missed
>>> a patch when overlooking this thread) ?
>>>
>>> The reason I'm asking, doc says it should be called by driver supporting 
>>> IOMMU,
>>> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, 
>>> mtk-
>>> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case 
>>> it's
>>> all covered and we are good, otherwise perhaps a downstream patch didn't 
>>> make it
>>> ?
>>>
>>> /**
>>>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
>>>  * @dev:device for configuring DMA parameters
>>>  * @size:   size of DMA max segment size to set
>>>  *
>>>  * To allow mapping the scatter-list into a single chunk in the DMA
>>>  * address space, the device is required to have the DMA max segment
>>>  * size parameter set to a value larger than the buffer size. Otherwise,
>>>  * the DMA-mapping subsystem will split the mapping into max segment
>>>  * size chunks. This function sets the DMA max segment size
>>>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
>>>  * address space.
>>>  * This code assumes that the DMA-mapping subsystem will merge all
>>>  * scatterlist segments if this is really possible (for example when
>>>  * an IOMMU is available and enabled).
>>>  * Ideally, this parameter should be set by the generic bus code, but it
>>>  * is left with the default 64KiB value due to historical litmiations in
>>>  * other subsystems (like limited USB host drivers) and there no good
>>>  * place to set it to the proper value.
>>>  * This function should be called from the drivers, which are known to
>>>  * operate on platforms with IOMMU and provide access to shared buffers
>>>  * (either USERPTR or DMABUF). This should be done before initializing
>>>  * videobuf2 queue.
>>>  */
>>
>> It does call dma_set_max_seg_size() directly:
>> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
>>
>> Actually, why do we even need a vb2 helper for this?
>>
> 
> What's the plan for this patch?

It will be part of v5.12.

-- 
regards,
Stan


Re: [PATCH] media: venus: use contig vb2 ops

2021-03-01 Thread Tomasz Figa
Hi Alex, Stanimir,

On Wed, Dec 16, 2020 at 12:15 PM Tomasz Figa  wrote:
>
> On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  wrote:
> >
> > Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > > Hi Tomasz,
> > >
> > > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Cc: Robin
> > > > >
> > > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > > This driver uses the SG vb2 ops, but effectively only ever accesses 
> > > > > > the
> > > > > > first entry of the SG table, indicating that it expects a flat 
> > > > > > layout.
> > > > > > Switch it to use the contiguous ops to make sure this expected 
> > > > > > invariant
> > > > >
> > > > > Under what circumstances the sg table will has nents > 1? I came down 
> > > > > to
> > > > > [1] but not sure I got it right.
> > > > >
> > > > > I'm afraid that for systems with low amount of system memory and when
> > > > > the memory become fragmented, the driver will not work. That's why I
> > > > > started with sg allocator.
> > > >
> > > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > > guarantees that having one DMA address and length fully describes the
> > >
> > > Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> > > allocator.
> >
> > I'm learning everyday too, but I'm surprised I don't see a call to
> > vb2_dma_contig_set_max_seg_size() in this driver (I could also just have 
> > missed
> > a patch when overlooking this thread) ?
> >
> > The reason I'm asking, doc says it should be called by driver supporting 
> > IOMMU,
> > which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, 
> > mtk-
> > mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case 
> > it's
> > all covered and we are good, otherwise perhaps a downstream patch didn't 
> > make it
> > ?
> >
> > /**
> >  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
> >  * @dev:device for configuring DMA parameters
> >  * @size:   size of DMA max segment size to set
> >  *
> >  * To allow mapping the scatter-list into a single chunk in the DMA
> >  * address space, the device is required to have the DMA max segment
> >  * size parameter set to a value larger than the buffer size. Otherwise,
> >  * the DMA-mapping subsystem will split the mapping into max segment
> >  * size chunks. This function sets the DMA max segment size
> >  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
> >  * address space.
> >  * This code assumes that the DMA-mapping subsystem will merge all
> >  * scatterlist segments if this is really possible (for example when
> >  * an IOMMU is available and enabled).
> >  * Ideally, this parameter should be set by the generic bus code, but it
> >  * is left with the default 64KiB value due to historical litmiations in
> >  * other subsystems (like limited USB host drivers) and there no good
> >  * place to set it to the proper value.
> >  * This function should be called from the drivers, which are known to
> >  * operate on platforms with IOMMU and provide access to shared buffers
> >  * (either USERPTR or DMABUF). This should be done before initializing
> >  * videobuf2 queue.
> >  */
>
> It does call dma_set_max_seg_size() directly:
> https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230
>
> Actually, why do we even need a vb2 helper for this?
>

What's the plan for this patch?

Best regards,
Tomasz

> >
> > regards,
> > Nicolas
> >
> > >
> > > > buffer. This seems to be the requirement of the hardware/firmware
> > > > handled by the venus driver. If the device is behind an IOMMU, which
> > > > is the case for the SoCs in question, the underlying DMA ops will
> > > > actually allocate a discontiguous set of pages, so it has nothing to
> > > > do to system memory amount or fragmentation. If for some reason the
> > > > IOMMU can't be used, there is no way around, the memory needs to be
> > > > contiguous because of the hardware/firmware/driver expectation.
> > > >
> > > > On the other hand, the vb2-dma-sg allocator doesn't have any
> > > > continuity guarantees for the DMA, or any other, address space. The
> > > > current code works fine, because it calls dma_map_sg() on the whole
> > > > set of pages and that ends up mapping it contiguously in the IOVA
> > > > space, but that's just an implementation detail, not an API guarantee.
> > >
> > > It was good to know. Thanks for the explanation.
> > >
> > > >
> > > > Best regards,
> > > > Tomasz
> > > >
> > > > >
> > > > > [1]
> > > > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > > >
> > > > > > is always enforced. Since the device is supposed to be behind an 
> > > > > > IOMMU
> > > > > > this should have little to none 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Wed, Dec 16, 2020 at 4:21 AM Nicolas Dufresne  wrote:
>
> Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> > Hi Tomasz,
> >
> > On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > Cc: Robin
> > > >
> > > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > > This driver uses the SG vb2 ops, but effectively only ever accesses 
> > > > > the
> > > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > > Switch it to use the contiguous ops to make sure this expected 
> > > > > invariant
> > > >
> > > > Under what circumstances the sg table will has nents > 1? I came down to
> > > > [1] but not sure I got it right.
> > > >
> > > > I'm afraid that for systems with low amount of system memory and when
> > > > the memory become fragmented, the driver will not work. That's why I
> > > > started with sg allocator.
> > >
> > > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > > in terms of the DMA (aka IOVA) address space. In other words, it
> > > guarantees that having one DMA address and length fully describes the
> >
> > Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> > allocator.
>
> I'm learning everyday too, but I'm surprised I don't see a call to
> vb2_dma_contig_set_max_seg_size() in this driver (I could also just have 
> missed
> a patch when overlooking this thread) ?
>
> The reason I'm asking, doc says it should be called by driver supporting 
> IOMMU,
> which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
> mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case 
> it's
> all covered and we are good, otherwise perhaps a downstream patch didn't make 
> it
> ?
>
> /**
>  * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
>  * @dev:device for configuring DMA parameters
>  * @size:   size of DMA max segment size to set
>  *
>  * To allow mapping the scatter-list into a single chunk in the DMA
>  * address space, the device is required to have the DMA max segment
>  * size parameter set to a value larger than the buffer size. Otherwise,
>  * the DMA-mapping subsystem will split the mapping into max segment
>  * size chunks. This function sets the DMA max segment size
>  * parameter to let DMA-mapping map a buffer as a single chunk in DMA
>  * address space.
>  * This code assumes that the DMA-mapping subsystem will merge all
>  * scatterlist segments if this is really possible (for example when
>  * an IOMMU is available and enabled).
>  * Ideally, this parameter should be set by the generic bus code, but it
>  * is left with the default 64KiB value due to historical litmiations in
>  * other subsystems (like limited USB host drivers) and there no good
>  * place to set it to the proper value.
>  * This function should be called from the drivers, which are known to
>  * operate on platforms with IOMMU and provide access to shared buffers
>  * (either USERPTR or DMABUF). This should be done before initializing
>  * videobuf2 queue.
>  */

It does call dma_set_max_seg_size() directly:
https://elixir.bootlin.com/linux/latest/source/drivers/media/platform/qcom/venus/core.c#L230

Actually, why do we even need a vb2 helper for this?

>
> regards,
> Nicolas
>
> >
> > > buffer. This seems to be the requirement of the hardware/firmware
> > > handled by the venus driver. If the device is behind an IOMMU, which
> > > is the case for the SoCs in question, the underlying DMA ops will
> > > actually allocate a discontiguous set of pages, so it has nothing to
> > > do to system memory amount or fragmentation. If for some reason the
> > > IOMMU can't be used, there is no way around, the memory needs to be
> > > contiguous because of the hardware/firmware/driver expectation.
> > >
> > > On the other hand, the vb2-dma-sg allocator doesn't have any
> > > continuity guarantees for the DMA, or any other, address space. The
> > > current code works fine, because it calls dma_map_sg() on the whole
> > > set of pages and that ends up mapping it contiguously in the IOVA
> > > space, but that's just an implementation detail, not an API guarantee.
> >
> > It was good to know. Thanks for the explanation.
> >
> > >
> > > Best regards,
> > > Tomasz
> > >
> > > >
> > > > [1]
> > > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > >
> > > > > is always enforced. Since the device is supposed to be behind an IOMMU
> > > > > this should have little to none practical consequences beyond making 
> > > > > the
> > > > > driver not rely on a particular behavior of the SG implementation.
> > > > >
> > > > > Reported-by: Tomasz Figa 
> > > > > Signed-off-by: Alexandre Courbot 
> > > > > ---
> > > > > Hi everyone,
> > > > >
> > > > > It probably doesn't hurt to fix this issue before some actual issue
> > > > > happens.
> > > > > I have tested this 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Nicolas Dufresne
Le mardi 15 décembre 2020 à 15:54 +0200, Stanimir Varbanov a écrit :
> Hi Tomasz,
> 
> On 12/15/20 1:47 PM, Tomasz Figa wrote:
> > On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
> >  wrote:
> > > 
> > > Hi,
> > > 
> > > Cc: Robin
> > > 
> > > On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > > > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > > > first entry of the SG table, indicating that it expects a flat layout.
> > > > Switch it to use the contiguous ops to make sure this expected invariant
> > > 
> > > Under what circumstances the sg table will has nents > 1? I came down to
> > > [1] but not sure I got it right.
> > > 
> > > I'm afraid that for systems with low amount of system memory and when
> > > the memory become fragmented, the driver will not work. That's why I
> > > started with sg allocator.
> > 
> > It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> > in terms of the DMA (aka IOVA) address space. In other words, it
> > guarantees that having one DMA address and length fully describes the
> 
> Ahh, I missed that part. Looks like I misunderstood videobu2 contig
> allocator.

I'm learning everyday too, but I'm surprised I don't see a call to
vb2_dma_contig_set_max_seg_size() in this driver (I could also just have missed
a patch when overlooking this thread) ?

The reason I'm asking, doc says it should be called by driver supporting IOMMU,
which seems to be the case for such drivers (MFC, exynos4-is, exynos-gsc, mtk-
mdp, s5p-g2d, hantro, rkvdec, zoran, ti-vpe, ..). I posting it, worst case it's
all covered and we are good, otherwise perhaps a downstream patch didn't make it
?

/**
 * vb2_dma_contig_set_max_seg_size() - configure DMA max segment size
 * @dev:device for configuring DMA parameters
 * @size:   size of DMA max segment size to set
 *
 * To allow mapping the scatter-list into a single chunk in the DMA
 * address space, the device is required to have the DMA max segment
 * size parameter set to a value larger than the buffer size. Otherwise,
 * the DMA-mapping subsystem will split the mapping into max segment
 * size chunks. This function sets the DMA max segment size
 * parameter to let DMA-mapping map a buffer as a single chunk in DMA
 * address space.
 * This code assumes that the DMA-mapping subsystem will merge all
 * scatterlist segments if this is really possible (for example when
 * an IOMMU is available and enabled).
 * Ideally, this parameter should be set by the generic bus code, but it
 * is left with the default 64KiB value due to historical litmiations in
 * other subsystems (like limited USB host drivers) and there no good
 * place to set it to the proper value.
 * This function should be called from the drivers, which are known to
 * operate on platforms with IOMMU and provide access to shared buffers
 * (either USERPTR or DMABUF). This should be done before initializing
 * videobuf2 queue.
 */

regards,
Nicolas

> 
> > buffer. This seems to be the requirement of the hardware/firmware
> > handled by the venus driver. If the device is behind an IOMMU, which
> > is the case for the SoCs in question, the underlying DMA ops will
> > actually allocate a discontiguous set of pages, so it has nothing to
> > do to system memory amount or fragmentation. If for some reason the
> > IOMMU can't be used, there is no way around, the memory needs to be
> > contiguous because of the hardware/firmware/driver expectation.
> > 
> > On the other hand, the vb2-dma-sg allocator doesn't have any
> > continuity guarantees for the DMA, or any other, address space. The
> > current code works fine, because it calls dma_map_sg() on the whole
> > set of pages and that ends up mapping it contiguously in the IOVA
> > space, but that's just an implementation detail, not an API guarantee.
> 
> It was good to know. Thanks for the explanation.
> 
> > 
> > Best regards,
> > Tomasz
> > 
> > > 
> > > [1]
> > > https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
> > > 
> > > > is always enforced. Since the device is supposed to be behind an IOMMU
> > > > this should have little to none practical consequences beyond making the
> > > > driver not rely on a particular behavior of the SG implementation.
> > > > 
> > > > Reported-by: Tomasz Figa 
> > > > Signed-off-by: Alexandre Courbot 
> > > > ---
> > > > Hi everyone,
> > > > 
> > > > It probably doesn't hurt to fix this issue before some actual issue
> > > > happens.
> > > > I have tested this patch on Chrome OS and playback was just as fine as
> > > > with
> > > > the SG ops.
> > > > 
> > > >  drivers/media/platform/Kconfig  | 2 +-
> > > >  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
> > > >  drivers/media/platform/qcom/venus/vdec.c    | 6 +++---
> > > >  drivers/media/platform/qcom/venus/venc.c    | 6 +++---
> > > >  4 files changed, 9 insertions(+), 14 deletions(-)
> > > > 
> > > > diff --git a/drivers/media/platform/Kconfig
> > > > 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Stanimir Varbanov
Hi Tomasz,

On 12/15/20 1:47 PM, Tomasz Figa wrote:
> On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
>  wrote:
>>
>> Hi,
>>
>> Cc: Robin
>>
>> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
>>> This driver uses the SG vb2 ops, but effectively only ever accesses the
>>> first entry of the SG table, indicating that it expects a flat layout.
>>> Switch it to use the contiguous ops to make sure this expected invariant
>>
>> Under what circumstances the sg table will has nents > 1? I came down to
>> [1] but not sure I got it right.
>>
>> I'm afraid that for systems with low amount of system memory and when
>> the memory become fragmented, the driver will not work. That's why I
>> started with sg allocator.
> 
> It is exactly the opposite. The vb2-dma-contig allocator is "contig"
> in terms of the DMA (aka IOVA) address space. In other words, it
> guarantees that having one DMA address and length fully describes the

Ahh, I missed that part. Looks like I misunderstood videobu2 contig
allocator.

> buffer. This seems to be the requirement of the hardware/firmware
> handled by the venus driver. If the device is behind an IOMMU, which
> is the case for the SoCs in question, the underlying DMA ops will
> actually allocate a discontiguous set of pages, so it has nothing to
> do to system memory amount or fragmentation. If for some reason the
> IOMMU can't be used, there is no way around, the memory needs to be
> contiguous because of the hardware/firmware/driver expectation.
> 
> On the other hand, the vb2-dma-sg allocator doesn't have any
> continuity guarantees for the DMA, or any other, address space. The
> current code works fine, because it calls dma_map_sg() on the whole
> set of pages and that ends up mapping it contiguously in the IOVA
> space, but that's just an implementation detail, not an API guarantee.

It was good to know. Thanks for the explanation.

> 
> Best regards,
> Tomasz
> 
>>
>> [1]
>> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>>
>>> is always enforced. Since the device is supposed to be behind an IOMMU
>>> this should have little to none practical consequences beyond making the
>>> driver not rely on a particular behavior of the SG implementation.
>>>
>>> Reported-by: Tomasz Figa 
>>> Signed-off-by: Alexandre Courbot 
>>> ---
>>> Hi everyone,
>>>
>>> It probably doesn't hurt to fix this issue before some actual issue happens.
>>> I have tested this patch on Chrome OS and playback was just as fine as with
>>> the SG ops.
>>>
>>>  drivers/media/platform/Kconfig  | 2 +-
>>>  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
>>>  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
>>>  drivers/media/platform/qcom/venus/venc.c| 6 +++---
>>>  4 files changed, 9 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
>>> index 35a18d388f3f..d9d7954111f2 100644
>>> --- a/drivers/media/platform/Kconfig
>>> +++ b/drivers/media/platform/Kconfig
>>> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>>>   depends on INTERCONNECT || !INTERCONNECT
>>>   select QCOM_MDT_LOADER if ARCH_QCOM
>>>   select QCOM_SCM if ARCH_QCOM
>>> - select VIDEOBUF2_DMA_SG
>>> + select VIDEOBUF2_DMA_CONTIG
>>>   select V4L2_MEM2MEM_DEV
>>>   help
>>> This is a V4L2 driver for Qualcomm Venus video accelerator
>>> diff --git a/drivers/media/platform/qcom/venus/helpers.c 
>>> b/drivers/media/platform/qcom/venus/helpers.c
>>> index 50439eb1ffea..859d260f002b 100644
>>> --- a/drivers/media/platform/qcom/venus/helpers.c
>>> +++ b/drivers/media/platform/qcom/venus/helpers.c
>>> @@ -7,7 +7,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> -#include 
>>> +#include 
>>>  #include 
>>>  #include 
>>>
>>> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>>>   struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>>>   struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>>>   struct venus_buffer *buf = to_venus_buffer(vbuf);
>>> - struct sg_table *sgt;
>>> -
>>> - sgt = vb2_dma_sg_plane_desc(vb, 0);
>>> - if (!sgt)
>>> - return -EFAULT;
>>>
>>>   buf->size = vb2_plane_size(vb, 0);
>>> - buf->dma_addr = sg_dma_address(sgt->sgl);
>>
>> Can we do it:
>>
>> if (WARN_ON(sgt->nents > 1))
>> return -EFAULT;
>>
>> I understand that logically using dma-sg when the flat layout is
>> expected by the hardware is wrong, but I haven't seen issues until now.
>>
>>> + buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>>>
>>>   if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>>>   list_add_tail(>reg_list, >registeredbufs);
>>> diff --git a/drivers/media/platform/qcom/venus/vdec.c 
>>> b/drivers/media/platform/qcom/venus/vdec.c
>>> index 8488411204c3..3fb277c81aca 100644
>>> --- a/drivers/media/platform/qcom/venus/vdec.c
>>> +++ b/drivers/media/platform/qcom/venus/vdec.c

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Robin Murphy

On 2020-12-15 11:47, Tomasz Figa wrote:

On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
 wrote:


Hi,

Cc: Robin

On 12/14/20 2:57 PM, Alexandre Courbot wrote:

This driver uses the SG vb2 ops, but effectively only ever accesses the
first entry of the SG table, indicating that it expects a flat layout.
Switch it to use the contiguous ops to make sure this expected invariant


Under what circumstances the sg table will has nents > 1? I came down to
[1] but not sure I got it right.

I'm afraid that for systems with low amount of system memory and when
the memory become fragmented, the driver will not work. That's why I
started with sg allocator.


It is exactly the opposite. The vb2-dma-contig allocator is "contig"
in terms of the DMA (aka IOVA) address space. In other words, it
guarantees that having one DMA address and length fully describes the
buffer. This seems to be the requirement of the hardware/firmware
handled by the venus driver. If the device is behind an IOMMU, which
is the case for the SoCs in question, the underlying DMA ops will
actually allocate a discontiguous set of pages, so it has nothing to
do to system memory amount or fragmentation. If for some reason the
IOMMU can't be used, there is no way around, the memory needs to be
contiguous because of the hardware/firmware/driver expectation.

On the other hand, the vb2-dma-sg allocator doesn't have any
continuity guarantees for the DMA, or any other, address space.


Yes, intuitively one would assume that the sg code was for devices with 
native scatter-gather capability that can deal with an actual set of 
buffer descriptors, rather than just a single pointer (which is the 
original purpose of scatterlists, after all). I've always been slightly 
puzzled why the two seem to be quite so similar.



The
current code works fine, because it calls dma_map_sg() on the whole
set of pages and that ends up mapping it contiguously in the IOVA
space, but that's just an implementation detail, not an API guarantee.


Oh, the fun we've had over that implementation detail! :P

Robin.


Best regards,
Tomasz



[1]
https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782


is always enforced. Since the device is supposed to be behind an IOMMU
this should have little to none practical consequences beyond making the
driver not rely on a particular behavior of the SG implementation.

Reported-by: Tomasz Figa 
Signed-off-by: Alexandre Courbot 
---
Hi everyone,

It probably doesn't hurt to fix this issue before some actual issue happens.
I have tested this patch on Chrome OS and playback was just as fine as with
the SG ops.

  drivers/media/platform/Kconfig  | 2 +-
  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
  drivers/media/platform/qcom/venus/venc.c| 6 +++---
  4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index 35a18d388f3f..d9d7954111f2 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
   depends on INTERCONNECT || !INTERCONNECT
   select QCOM_MDT_LOADER if ARCH_QCOM
   select QCOM_SCM if ARCH_QCOM
- select VIDEOBUF2_DMA_SG
+ select VIDEOBUF2_DMA_CONTIG
   select V4L2_MEM2MEM_DEV
   help
 This is a V4L2 driver for Qualcomm Venus video accelerator
diff --git a/drivers/media/platform/qcom/venus/helpers.c 
b/drivers/media/platform/qcom/venus/helpers.c
index 50439eb1ffea..859d260f002b 100644
--- a/drivers/media/platform/qcom/venus/helpers.c
+++ b/drivers/media/platform/qcom/venus/helpers.c
@@ -7,7 +7,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 

@@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
   struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
   struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
   struct venus_buffer *buf = to_venus_buffer(vbuf);
- struct sg_table *sgt;
-
- sgt = vb2_dma_sg_plane_desc(vb, 0);
- if (!sgt)
- return -EFAULT;

   buf->size = vb2_plane_size(vb, 0);
- buf->dma_addr = sg_dma_address(sgt->sgl);


Can we do it:

 if (WARN_ON(sgt->nents > 1))
 return -EFAULT;

I understand that logically using dma-sg when the flat layout is
expected by the hardware is wrong, but I haven't seen issues until now.


+ buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);

   if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
   list_add_tail(>reg_list, >registeredbufs);
diff --git a/drivers/media/platform/qcom/venus/vdec.c 
b/drivers/media/platform/qcom/venus/vdec.c
index 8488411204c3..3fb277c81aca 100644
--- a/drivers/media/platform/qcom/venus/vdec.c
+++ b/drivers/media/platform/qcom/venus/vdec.c
@@ -13,7 +13,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 

  #include 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Tue, Dec 15, 2020 at 8:16 PM Stanimir Varbanov
 wrote:
>
> Hi,
>
> Cc: Robin
>
> On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> > This driver uses the SG vb2 ops, but effectively only ever accesses the
> > first entry of the SG table, indicating that it expects a flat layout.
> > Switch it to use the contiguous ops to make sure this expected invariant
>
> Under what circumstances the sg table will has nents > 1? I came down to
> [1] but not sure I got it right.
>
> I'm afraid that for systems with low amount of system memory and when
> the memory become fragmented, the driver will not work. That's why I
> started with sg allocator.

It is exactly the opposite. The vb2-dma-contig allocator is "contig"
in terms of the DMA (aka IOVA) address space. In other words, it
guarantees that having one DMA address and length fully describes the
buffer. This seems to be the requirement of the hardware/firmware
handled by the venus driver. If the device is behind an IOMMU, which
is the case for the SoCs in question, the underlying DMA ops will
actually allocate a discontiguous set of pages, so it has nothing to
do to system memory amount or fragmentation. If for some reason the
IOMMU can't be used, there is no way around, the memory needs to be
contiguous because of the hardware/firmware/driver expectation.

On the other hand, the vb2-dma-sg allocator doesn't have any
continuity guarantees for the DMA, or any other, address space. The
current code works fine, because it calls dma_map_sg() on the whole
set of pages and that ends up mapping it contiguously in the IOVA
space, but that's just an implementation detail, not an API guarantee.

Best regards,
Tomasz

>
> [1]
> https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782
>
> > is always enforced. Since the device is supposed to be behind an IOMMU
> > this should have little to none practical consequences beyond making the
> > driver not rely on a particular behavior of the SG implementation.
> >
> > Reported-by: Tomasz Figa 
> > Signed-off-by: Alexandre Courbot 
> > ---
> > Hi everyone,
> >
> > It probably doesn't hurt to fix this issue before some actual issue happens.
> > I have tested this patch on Chrome OS and playback was just as fine as with
> > the SG ops.
> >
> >  drivers/media/platform/Kconfig  | 2 +-
> >  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
> >  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
> >  drivers/media/platform/qcom/venus/venc.c| 6 +++---
> >  4 files changed, 9 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> > index 35a18d388f3f..d9d7954111f2 100644
> > --- a/drivers/media/platform/Kconfig
> > +++ b/drivers/media/platform/Kconfig
> > @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> >   depends on INTERCONNECT || !INTERCONNECT
> >   select QCOM_MDT_LOADER if ARCH_QCOM
> >   select QCOM_SCM if ARCH_QCOM
> > - select VIDEOBUF2_DMA_SG
> > + select VIDEOBUF2_DMA_CONTIG
> >   select V4L2_MEM2MEM_DEV
> >   help
> > This is a V4L2 driver for Qualcomm Venus video accelerator
> > diff --git a/drivers/media/platform/qcom/venus/helpers.c 
> > b/drivers/media/platform/qcom/venus/helpers.c
> > index 50439eb1ffea..859d260f002b 100644
> > --- a/drivers/media/platform/qcom/venus/helpers.c
> > +++ b/drivers/media/platform/qcom/venus/helpers.c
> > @@ -7,7 +7,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +#include 
> >  #include 
> >  #include 
> >
> > @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
> >   struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> >   struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> >   struct venus_buffer *buf = to_venus_buffer(vbuf);
> > - struct sg_table *sgt;
> > -
> > - sgt = vb2_dma_sg_plane_desc(vb, 0);
> > - if (!sgt)
> > - return -EFAULT;
> >
> >   buf->size = vb2_plane_size(vb, 0);
> > - buf->dma_addr = sg_dma_address(sgt->sgl);
>
> Can we do it:
>
> if (WARN_ON(sgt->nents > 1))
> return -EFAULT;
>
> I understand that logically using dma-sg when the flat layout is
> expected by the hardware is wrong, but I haven't seen issues until now.
>
> > + buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
> >
> >   if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> >   list_add_tail(>reg_list, >registeredbufs);
> > diff --git a/drivers/media/platform/qcom/venus/vdec.c 
> > b/drivers/media/platform/qcom/venus/vdec.c
> > index 8488411204c3..3fb277c81aca 100644
> > --- a/drivers/media/platform/qcom/venus/vdec.c
> > +++ b/drivers/media/platform/qcom/venus/vdec.c
> > @@ -13,7 +13,7 @@
> >  #include 
> >  #include 
> >  #include 
> > -#include 
> > +#include 
> >
> >  #include "hfi_venus_io.h"
> >  #include "hfi_parser.h"
> > @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct 
> > vb2_queue *src_vq,
> 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Robin Murphy

On 2020-12-15 11:16, Stanimir Varbanov wrote:

Hi,

Cc: Robin

On 12/14/20 2:57 PM, Alexandre Courbot wrote:

This driver uses the SG vb2 ops, but effectively only ever accesses the
first entry of the SG table, indicating that it expects a flat layout.
Switch it to use the contiguous ops to make sure this expected invariant


Under what circumstances the sg table will has nents > 1? I came down to
[1] but not sure I got it right.

I'm afraid that for systems with low amount of system memory and when
the memory become fragmented, the driver will not work. That's why I
started with sg allocator.


Despite what videobuf-dma-contig seems to assume, dma_alloc_coherent() 
makes no guarantee of providing physically contiguous memory. What it 
provides is *DMA contiguous* memory, i.e. from the point of view of the 
device. When an IOMMU is present and managed by the DMA API, such 
buffers may be assembled out of physically-scattered pages (particularly 
under memory pressure/fragmentation).


Robin.



[1]
https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782


is always enforced. Since the device is supposed to be behind an IOMMU
this should have little to none practical consequences beyond making the
driver not rely on a particular behavior of the SG implementation.

Reported-by: Tomasz Figa 
Signed-off-by: Alexandre Courbot 
---
Hi everyone,

It probably doesn't hurt to fix this issue before some actual issue happens.
I have tested this patch on Chrome OS and playback was just as fine as with
the SG ops.

  drivers/media/platform/Kconfig  | 2 +-
  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
  drivers/media/platform/qcom/venus/venc.c| 6 +++---
  4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index 35a18d388f3f..d9d7954111f2 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
depends on INTERCONNECT || !INTERCONNECT
select QCOM_MDT_LOADER if ARCH_QCOM
select QCOM_SCM if ARCH_QCOM
-   select VIDEOBUF2_DMA_SG
+   select VIDEOBUF2_DMA_CONTIG
select V4L2_MEM2MEM_DEV
help
  This is a V4L2 driver for Qualcomm Venus video accelerator
diff --git a/drivers/media/platform/qcom/venus/helpers.c 
b/drivers/media/platform/qcom/venus/helpers.c
index 50439eb1ffea..859d260f002b 100644
--- a/drivers/media/platform/qcom/venus/helpers.c
+++ b/drivers/media/platform/qcom/venus/helpers.c
@@ -7,7 +7,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  #include 
  #include 
  
@@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)

struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
struct venus_buffer *buf = to_venus_buffer(vbuf);
-   struct sg_table *sgt;
-
-   sgt = vb2_dma_sg_plane_desc(vb, 0);
-   if (!sgt)
-   return -EFAULT;
  
  	buf->size = vb2_plane_size(vb, 0);

-   buf->dma_addr = sg_dma_address(sgt->sgl);


Can we do it:

if (WARN_ON(sgt->nents > 1))
return -EFAULT;

I understand that logically using dma-sg when the flat layout is
expected by the hardware is wrong, but I haven't seen issues until now.


+   buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
  
  	if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)

list_add_tail(>reg_list, >registeredbufs);
diff --git a/drivers/media/platform/qcom/venus/vdec.c 
b/drivers/media/platform/qcom/venus/vdec.c
index 8488411204c3..3fb277c81aca 100644
--- a/drivers/media/platform/qcom/venus/vdec.c
+++ b/drivers/media/platform/qcom/venus/vdec.c
@@ -13,7 +13,7 @@
  #include 
  #include 
  #include 
-#include 
+#include 
  
  #include "hfi_venus_io.h"

  #include "hfi_parser.h"
@@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
src_vq->ops = _vb2_ops;
-   src_vq->mem_ops = _dma_sg_memops;
+   src_vq->mem_ops = _dma_contig_memops;
src_vq->drv_priv = inst;
src_vq->buf_struct_size = sizeof(struct venus_buffer);
src_vq->allow_zero_bytesused = 1;
@@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
dst_vq->ops = _vb2_ops;
-   dst_vq->mem_ops = _dma_sg_memops;
+   dst_vq->mem_ops = _dma_contig_memops;
dst_vq->drv_priv = inst;
dst_vq->buf_struct_size = sizeof(struct venus_buffer);
dst_vq->allow_zero_bytesused = 1;
diff --git a/drivers/media/platform/qcom/venus/venc.c 
b/drivers/media/platform/qcom/venus/venc.c

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Stanimir Varbanov
Hi,

Cc: Robin

On 12/14/20 2:57 PM, Alexandre Courbot wrote:
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant

Under what circumstances the sg table will has nents > 1? I came down to
[1] but not sure I got it right.

I'm afraid that for systems with low amount of system memory and when
the memory become fragmented, the driver will not work. That's why I
started with sg allocator.

[1]
https://elixir.bootlin.com/linux/v5.10.1/source/drivers/iommu/dma-iommu.c#L782

> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to none practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
> 
> Reported-by: Tomasz Figa 
> Signed-off-by: Alexandre Courbot 
> ---
> Hi everyone,
> 
> It probably doesn't hurt to fix this issue before some actual issue happens.
> I have tested this patch on Chrome OS and playback was just as fine as with
> the SG ops.
> 
>  drivers/media/platform/Kconfig  | 2 +-
>  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
>  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
>  drivers/media/platform/qcom/venus/venc.c| 6 +++---
>  4 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index 35a18d388f3f..d9d7954111f2 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
>   depends on INTERCONNECT || !INTERCONNECT
>   select QCOM_MDT_LOADER if ARCH_QCOM
>   select QCOM_SCM if ARCH_QCOM
> - select VIDEOBUF2_DMA_SG
> + select VIDEOBUF2_DMA_CONTIG
>   select V4L2_MEM2MEM_DEV
>   help
> This is a V4L2 driver for Qualcomm Venus video accelerator
> diff --git a/drivers/media/platform/qcom/venus/helpers.c 
> b/drivers/media/platform/qcom/venus/helpers.c
> index 50439eb1ffea..859d260f002b 100644
> --- a/drivers/media/platform/qcom/venus/helpers.c
> +++ b/drivers/media/platform/qcom/venus/helpers.c
> @@ -7,7 +7,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  
> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
>   struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
>   struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
>   struct venus_buffer *buf = to_venus_buffer(vbuf);
> - struct sg_table *sgt;
> -
> - sgt = vb2_dma_sg_plane_desc(vb, 0);
> - if (!sgt)
> - return -EFAULT;
>  
>   buf->size = vb2_plane_size(vb, 0);
> - buf->dma_addr = sg_dma_address(sgt->sgl);

Can we do it:

if (WARN_ON(sgt->nents > 1))
return -EFAULT;

I understand that logically using dma-sg when the flat layout is
expected by the hardware is wrong, but I haven't seen issues until now.

> + buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>  
>   if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
>   list_add_tail(>reg_list, >registeredbufs);
> diff --git a/drivers/media/platform/qcom/venus/vdec.c 
> b/drivers/media/platform/qcom/venus/vdec.c
> index 8488411204c3..3fb277c81aca 100644
> --- a/drivers/media/platform/qcom/venus/vdec.c
> +++ b/drivers/media/platform/qcom/venus/vdec.c
> @@ -13,7 +13,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  
>  #include "hfi_venus_io.h"
>  #include "hfi_parser.h"
> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
>   src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>   src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>   src_vq->ops = _vb2_ops;
> - src_vq->mem_ops = _dma_sg_memops;
> + src_vq->mem_ops = _dma_contig_memops;
>   src_vq->drv_priv = inst;
>   src_vq->buf_struct_size = sizeof(struct venus_buffer);
>   src_vq->allow_zero_bytesused = 1;
> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
>   dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
>   dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
>   dst_vq->ops = _vb2_ops;
> - dst_vq->mem_ops = _dma_sg_memops;
> + dst_vq->mem_ops = _dma_contig_memops;
>   dst_vq->drv_priv = inst;
>   dst_vq->buf_struct_size = sizeof(struct venus_buffer);
>   dst_vq->allow_zero_bytesused = 1;
> diff --git a/drivers/media/platform/qcom/venus/venc.c 
> b/drivers/media/platform/qcom/venus/venc.c
> index 1c61602c5de1..a09550cd1dba 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -10,7 +10,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> 

Re: [PATCH] media: venus: use contig vb2 ops

2020-12-15 Thread Tomasz Figa
On Mon, Dec 14, 2020 at 9:57 PM Alexandre Courbot  wrote:
>
> This driver uses the SG vb2 ops, but effectively only ever accesses the
> first entry of the SG table, indicating that it expects a flat layout.
> Switch it to use the contiguous ops to make sure this expected invariant
> is always enforced. Since the device is supposed to be behind an IOMMU
> this should have little to none practical consequences beyond making the
> driver not rely on a particular behavior of the SG implementation.
>
> Reported-by: Tomasz Figa 
> Signed-off-by: Alexandre Courbot 
> ---
> Hi everyone,
>
> It probably doesn't hurt to fix this issue before some actual issue happens.
> I have tested this patch on Chrome OS and playback was just as fine as with
> the SG ops.
>
>  drivers/media/platform/Kconfig  | 2 +-
>  drivers/media/platform/qcom/venus/helpers.c | 9 ++---
>  drivers/media/platform/qcom/venus/vdec.c| 6 +++---
>  drivers/media/platform/qcom/venus/venc.c| 6 +++---
>  4 files changed, 9 insertions(+), 14 deletions(-)
>

Reviewed-by: Tomasz Figa 

Best regards,
Tomasz

> diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
> index 35a18d388f3f..d9d7954111f2 100644
> --- a/drivers/media/platform/Kconfig
> +++ b/drivers/media/platform/Kconfig
> @@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
> depends on INTERCONNECT || !INTERCONNECT
> select QCOM_MDT_LOADER if ARCH_QCOM
> select QCOM_SCM if ARCH_QCOM
> -   select VIDEOBUF2_DMA_SG
> +   select VIDEOBUF2_DMA_CONTIG
> select V4L2_MEM2MEM_DEV
> help
>   This is a V4L2 driver for Qualcomm Venus video accelerator
> diff --git a/drivers/media/platform/qcom/venus/helpers.c 
> b/drivers/media/platform/qcom/venus/helpers.c
> index 50439eb1ffea..859d260f002b 100644
> --- a/drivers/media/platform/qcom/venus/helpers.c
> +++ b/drivers/media/platform/qcom/venus/helpers.c
> @@ -7,7 +7,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>
> @@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
> struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
> struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
> struct venus_buffer *buf = to_venus_buffer(vbuf);
> -   struct sg_table *sgt;
> -
> -   sgt = vb2_dma_sg_plane_desc(vb, 0);
> -   if (!sgt)
> -   return -EFAULT;
>
> buf->size = vb2_plane_size(vb, 0);
> -   buf->dma_addr = sg_dma_address(sgt->sgl);
> +   buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
>
> if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
> list_add_tail(>reg_list, >registeredbufs);
> diff --git a/drivers/media/platform/qcom/venus/vdec.c 
> b/drivers/media/platform/qcom/venus/vdec.c
> index 8488411204c3..3fb277c81aca 100644
> --- a/drivers/media/platform/qcom/venus/vdec.c
> +++ b/drivers/media/platform/qcom/venus/vdec.c
> @@ -13,7 +13,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>
>  #include "hfi_venus_io.h"
>  #include "hfi_parser.h"
> @@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> src_vq->ops = _vb2_ops;
> -   src_vq->mem_ops = _dma_sg_memops;
> +   src_vq->mem_ops = _dma_contig_memops;
> src_vq->drv_priv = inst;
> src_vq->buf_struct_size = sizeof(struct venus_buffer);
> src_vq->allow_zero_bytesused = 1;
> @@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
> dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> dst_vq->ops = _vb2_ops;
> -   dst_vq->mem_ops = _dma_sg_memops;
> +   dst_vq->mem_ops = _dma_contig_memops;
> dst_vq->drv_priv = inst;
> dst_vq->buf_struct_size = sizeof(struct venus_buffer);
> dst_vq->allow_zero_bytesused = 1;
> diff --git a/drivers/media/platform/qcom/venus/venc.c 
> b/drivers/media/platform/qcom/venus/venc.c
> index 1c61602c5de1..a09550cd1dba 100644
> --- a/drivers/media/platform/qcom/venus/venc.c
> +++ b/drivers/media/platform/qcom/venus/venc.c
> @@ -10,7 +10,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
> *src_vq,
> src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
> src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
> src_vq->ops = _vb2_ops;
> -   src_vq->mem_ops = _dma_sg_memops;
> +   src_vq->mem_ops = _dma_contig_memops;
> src_vq->drv_priv = inst;
> src_vq->buf_struct_size = sizeof(struct venus_buffer);
> src_vq->allow_zero_bytesused = 1;
> @@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, 

[PATCH] media: venus: use contig vb2 ops

2020-12-14 Thread Alexandre Courbot
This driver uses the SG vb2 ops, but effectively only ever accesses the
first entry of the SG table, indicating that it expects a flat layout.
Switch it to use the contiguous ops to make sure this expected invariant
is always enforced. Since the device is supposed to be behind an IOMMU
this should have little to none practical consequences beyond making the
driver not rely on a particular behavior of the SG implementation.

Reported-by: Tomasz Figa 
Signed-off-by: Alexandre Courbot 
---
Hi everyone,

It probably doesn't hurt to fix this issue before some actual issue happens.
I have tested this patch on Chrome OS and playback was just as fine as with
the SG ops.

 drivers/media/platform/Kconfig  | 2 +-
 drivers/media/platform/qcom/venus/helpers.c | 9 ++---
 drivers/media/platform/qcom/venus/vdec.c| 6 +++---
 drivers/media/platform/qcom/venus/venc.c| 6 +++---
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index 35a18d388f3f..d9d7954111f2 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -533,7 +533,7 @@ config VIDEO_QCOM_VENUS
depends on INTERCONNECT || !INTERCONNECT
select QCOM_MDT_LOADER if ARCH_QCOM
select QCOM_SCM if ARCH_QCOM
-   select VIDEOBUF2_DMA_SG
+   select VIDEOBUF2_DMA_CONTIG
select V4L2_MEM2MEM_DEV
help
  This is a V4L2 driver for Qualcomm Venus video accelerator
diff --git a/drivers/media/platform/qcom/venus/helpers.c 
b/drivers/media/platform/qcom/venus/helpers.c
index 50439eb1ffea..859d260f002b 100644
--- a/drivers/media/platform/qcom/venus/helpers.c
+++ b/drivers/media/platform/qcom/venus/helpers.c
@@ -7,7 +7,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
@@ -1284,14 +1284,9 @@ int venus_helper_vb2_buf_init(struct vb2_buffer *vb)
struct venus_inst *inst = vb2_get_drv_priv(vb->vb2_queue);
struct vb2_v4l2_buffer *vbuf = to_vb2_v4l2_buffer(vb);
struct venus_buffer *buf = to_venus_buffer(vbuf);
-   struct sg_table *sgt;
-
-   sgt = vb2_dma_sg_plane_desc(vb, 0);
-   if (!sgt)
-   return -EFAULT;
 
buf->size = vb2_plane_size(vb, 0);
-   buf->dma_addr = sg_dma_address(sgt->sgl);
+   buf->dma_addr = vb2_dma_contig_plane_dma_addr(vb, 0);
 
if (vb->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE)
list_add_tail(>reg_list, >registeredbufs);
diff --git a/drivers/media/platform/qcom/venus/vdec.c 
b/drivers/media/platform/qcom/venus/vdec.c
index 8488411204c3..3fb277c81aca 100644
--- a/drivers/media/platform/qcom/venus/vdec.c
+++ b/drivers/media/platform/qcom/venus/vdec.c
@@ -13,7 +13,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "hfi_venus_io.h"
 #include "hfi_parser.h"
@@ -1461,7 +1461,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
src_vq->io_modes = VB2_MMAP | VB2_DMABUF;
src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
src_vq->ops = _vb2_ops;
-   src_vq->mem_ops = _dma_sg_memops;
+   src_vq->mem_ops = _dma_contig_memops;
src_vq->drv_priv = inst;
src_vq->buf_struct_size = sizeof(struct venus_buffer);
src_vq->allow_zero_bytesused = 1;
@@ -1475,7 +1475,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
dst_vq->io_modes = VB2_MMAP | VB2_DMABUF;
dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
dst_vq->ops = _vb2_ops;
-   dst_vq->mem_ops = _dma_sg_memops;
+   dst_vq->mem_ops = _dma_contig_memops;
dst_vq->drv_priv = inst;
dst_vq->buf_struct_size = sizeof(struct venus_buffer);
dst_vq->allow_zero_bytesused = 1;
diff --git a/drivers/media/platform/qcom/venus/venc.c 
b/drivers/media/platform/qcom/venus/venc.c
index 1c61602c5de1..a09550cd1dba 100644
--- a/drivers/media/platform/qcom/venus/venc.c
+++ b/drivers/media/platform/qcom/venus/venc.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -1001,7 +1001,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
src_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
src_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
src_vq->ops = _vb2_ops;
-   src_vq->mem_ops = _dma_sg_memops;
+   src_vq->mem_ops = _dma_contig_memops;
src_vq->drv_priv = inst;
src_vq->buf_struct_size = sizeof(struct venus_buffer);
src_vq->allow_zero_bytesused = 1;
@@ -1017,7 +1017,7 @@ static int m2m_queue_init(void *priv, struct vb2_queue 
*src_vq,
dst_vq->io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF;
dst_vq->timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_COPY;
dst_vq->ops = _vb2_ops;
-   dst_vq->mem_ops = _dma_sg_memops;
+   dst_vq->mem_ops = _dma_contig_memops;
dst_vq->drv_priv = inst;
dst_vq->buf_struct_size =