On Thu, Jul 6, 2017 at 10:31 PM, Tomasz Figa <tf...@chromium.org> wrote:
> On Thu, Jul 6, 2017 at 9:23 PM, Arnd Bergmann <a...@arndb.de> wrote:
>> On Thu, Jul 6, 2017 at 10:36 AM, Tomasz Figa <tf...@chromium.org> wrote:
>>> On Thu, Jul 6, 2017 at 5:34 PM, Tomasz Figa <tf...@chromium.org> wrote:
>>>> On Thu, Jul 6, 2017 at 5:26 PM, Arnd Bergmann <a...@arndb.de> wrote:
>>>>> On Thu, Jul 6, 2017 at 3:44 AM, Tomasz Figa <tf...@chromium.org> wrote:
>>
>>>>
>>>> I'd say that this is something that has been consistently tried to be
>>>> avoided by V4L2 and that's why it's so tightly integrated with DMA
>>>> mapping. IMHO re-implementing the code that's already there in
>>>> videobuf2 again in the driver, only because, for no good reason
>>>> mentioned as for now, having a loadable module providing DMA ops was
>>>> disliked.
>>>
>>> Sorry, I intended to mean:
>>>
>>> IMHO re-implementing the code that's already there in videobuf2 again
>>> in the driver, only because, for no good reason mentioned as for now,
>>> having a loadable module providing DMA ops was disliked, would make no
>>> sense.
>>
>> Why would we need to duplicate that code? I would expect that the videobuf2
>> core can simply call the regular dma_mapping interfaces, and you handle the
>> IOPTE generation at the point when the buffer is handed off from the core
>> code to the device driver. Am I missing something?
>
> Well, for example, the iommu-dma helpers already implement all the
> IOVA management, SG iterations, IOMMU API calls, sanity checks and so
> on. There is a significant amount of common code.
>
> On the other hand, if it's strictly about base/dma-mapping, we might
> not need it indeed. The driver could call iommu-dma helpers directly,
> without the need to provide its own DMA ops. One caveat, though, we
> are not able to obtain coherent (i.e. uncached) memory with this
> approach, which might have some performance effects and complicates
> the code, that would now need to flush caches even for some small
> internal buffers.

I think I should add a bit of explanation here:
 1) the device is non-coherent with CPU caches, even on x86,
 2) it looks like x86 does not have non-coherent DMA ops (though that
might be something that could be fixed),
 3) technically, one could still use __get_vm_area() and map_vm_area(),
which _are_ exported, to create an uncached mapping. I'll leave it to
you to judge whether that would be better than using the already
available generic helpers.

Best regards,
Tomasz
