Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Geert Uytterhoeven
Hi Robin,

On Fri, Jan 13, 2017 at 1:17 PM, Robin Murphy  wrote:
> On 13/01/17 11:59, Geert Uytterhoeven wrote:
>> On Fri, Jan 13, 2017 at 12:32 PM, Robin Murphy  wrote:
>>> On 13/01/17 11:07, Geert Uytterhoeven wrote:
 Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
 This allows to allocate physically contiguous DMA buffers on arm64
 systems with an IOMMU.
>>>
>>> Can anyone explain what this attribute is actually used for? I've never
>>> quite figured it out.
>>
>> My understanding is that DMA_ATTR_FORCE_CONTIGUOUS is needed when using
>> an IOMMU but wanting the buffers to be both contiguous in IOVA space and
>> physically contiguous to allow passing to devices without IOMMU.
>>
>> Main users are graphic and remote processors.
>
> Sure, I assumed it must be to do with buffer sharing, but the systems
> I'm aware of which have IOMMUs in their media subsystems tend to have
> them in front of every IP block involved, so I was curious as to what
> bit of non-IOMMU hardware wanted to play too. The lone in-tree use in
> the Exynos DRM driver was never very revealing, and the new one I see in
> the Qualcomm PIL driver frankly looks redundant to me.

I'll let the GPU-literate people comment on that...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Robin Murphy
On 13/01/17 11:59, Geert Uytterhoeven wrote:
> Hi Robin,
> 
> On Fri, Jan 13, 2017 at 12:32 PM, Robin Murphy  wrote:
>> On 13/01/17 11:07, Geert Uytterhoeven wrote:
>>> Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
>>> This allows to allocate physically contiguous DMA buffers on arm64
>>> systems with an IOMMU.
>>
>> Can anyone explain what this attribute is actually used for? I've never
>> quite figured it out.
> 
> My understanding is that DMA_ATTR_FORCE_CONTIGUOUS is needed when using
> an IOMMU but wanting the buffers to be both contiguous in IOVA space and
> physically contiguous to allow passing to devices without IOMMU.
> 
> Main users are graphic and remote processors.

Sure, I assumed it must be to do with buffer sharing, but the systems
I'm aware of which have IOMMUs in their media subsystems tend to have
them in front of every IP block involved, so I was curious as to what
bit of non-IOMMU hardware wanted to play too. The lone in-tree use in
the Exynos DRM driver was never very revealing, and the new one I see in
the Qualcomm PIL driver frankly looks redundant to me.

Robin.

>>> --- a/drivers/iommu/dma-iommu.c
>>> +++ b/drivers/iommu/dma-iommu.c
> 
>>> @@ -265,6 +272,20 @@ static struct page **__iommu_dma_alloc_pages(unsigned 
>>> int count,
>>>   /* IOMMU can map any pages, so himem can also be used here */
>>>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>>>
>>> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
>>> + int order = get_order(count << PAGE_SHIFT);
>>> + struct page *page;
>>> +
>>> + page = dma_alloc_from_contiguous(dev, count, order);
>>> + if (!page)
>>> + return NULL;
>>> +
>>> + while (count--)
>>> + pages[i++] = page++;
>>> +
>>> + return pages;
>>> + }
>>> +
>>
>> This is really yuck. Plus it's entirely pointless to go through the
>> whole page array/scatterlist dance when we know the buffer is going to
>> be physically contiguous - it should just be allocate, map, done. I'd
>> much rather see standalone iommu_dma_{alloc,free}_contiguous()
>> functions, and let the arch code handle dispatching appropriately.
> 
> Fair enough.
> 
> Gr{oetje,eeting}s,
> 
> Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Geert Uytterhoeven
Hi Robin,

On Fri, Jan 13, 2017 at 12:32 PM, Robin Murphy  wrote:
> On 13/01/17 11:07, Geert Uytterhoeven wrote:
>> Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
>> This allows to allocate physically contiguous DMA buffers on arm64
>> systems with an IOMMU.
>
> Can anyone explain what this attribute is actually used for? I've never
> quite figured it out.

My understanding is that DMA_ATTR_FORCE_CONTIGUOUS is needed when using
an IOMMU but wanting the buffers to be both contiguous in IOVA space and
physically contiguous to allow passing to devices without IOMMU.

Main users are graphic and remote processors.

>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c

>> @@ -265,6 +272,20 @@ static struct page **__iommu_dma_alloc_pages(unsigned 
>> int count,
>>   /* IOMMU can map any pages, so himem can also be used here */
>>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>>
>> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
>> + int order = get_order(count << PAGE_SHIFT);
>> + struct page *page;
>> +
>> + page = dma_alloc_from_contiguous(dev, count, order);
>> + if (!page)
>> + return NULL;
>> +
>> + while (count--)
>> + pages[i++] = page++;
>> +
>> + return pages;
>> + }
>> +
>
> This is really yuck. Plus it's entirely pointless to go through the
> whole page array/scatterlist dance when we know the buffer is going to
> be physically contiguous - it should just be allocate, map, done. I'd
> much rather see standalone iommu_dma_{alloc,free}_contiguous()
> functions, and let the arch code handle dispatching appropriately.

Fair enough.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dma: Add support for DMA_ATTR_FORCE_CONTIGUOUS

2017-01-13 Thread Robin Murphy
On 13/01/17 11:07, Geert Uytterhoeven wrote:
> Add support for DMA_ATTR_FORCE_CONTIGUOUS to the generic IOMMU DMA code.
> This allows to allocate physically contiguous DMA buffers on arm64
> systems with an IOMMU.

Can anyone explain what this attribute is actually used for? I've never
quite figured it out.

> Note that as this uses the CMA allocator, setting this attribute has a
> runtime-dependency on CONFIG_DMA_CMA, just like on arm32.
> 
> For arm64 systems using swiotlb, no changes are needed to support the
> allocation of physically contiguous DMA buffers:
>   - swiotlb always uses physically contiguous buffers (up to
> IO_TLB_SEGSIZE = 128 pages),
>   - arm64's __dma_alloc_coherent() already calls
> dma_alloc_from_contiguous() when CMA is available.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  arch/arm64/mm/dma-mapping.c |  4 ++--
>  drivers/iommu/dma-iommu.c   | 44 ++--
>  include/linux/dma-iommu.h   |  2 +-
>  3 files changed, 37 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 1d7d5d2881db7c19..5fba14a21163e41f 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -589,7 +589,7 @@ static void *__iommu_alloc_attrs(struct device *dev, 
> size_t size,
>   addr = dma_common_pages_remap(pages, size, VM_USERMAP, prot,
> __builtin_return_address(0));
>   if (!addr)
> - iommu_dma_free(dev, pages, iosize, handle);
> + iommu_dma_free(dev, pages, iosize, handle, attrs);
>   } else {
>   struct page *page;
>   /*
> @@ -642,7 +642,7 @@ static void __iommu_free_attrs(struct device *dev, size_t 
> size, void *cpu_addr,
>  
>   if (WARN_ON(!area || !area->pages))
>   return;
> - iommu_dma_free(dev, area->pages, iosize, );
> + iommu_dma_free(dev, area->pages, iosize, , attrs);
>   dma_common_free_remap(cpu_addr, size, VM_USERMAP);
>   } else {
>   iommu_dma_unmap_page(dev, handle, iosize, 0, 0);
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 2db0d641cf4505b5..b991e8dc35c5 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  struct iommu_dma_msi_page {
>   struct list_headlist;
> @@ -238,15 +239,21 @@ static void __iommu_dma_unmap(struct iommu_domain 
> *domain, dma_addr_t dma_addr)
>   __free_iova(iovad, iova);
>  }
>  
> -static void __iommu_dma_free_pages(struct page **pages, int count)
> +static void __iommu_dma_free_pages(struct device *dev, struct page **pages,
> +int count, unsigned long attrs)
>  {
> - while (count--)
> - __free_page(pages[count]);
> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + dma_release_from_contiguous(dev, pages[0], count);
> + } else {
> + while (count--)
> + __free_page(pages[count]);
> + }
>   kvfree(pages);
>  }
>  
> -static struct page **__iommu_dma_alloc_pages(unsigned int count,
> - unsigned long order_mask, gfp_t gfp)
> +static struct page **__iommu_dma_alloc_pages(struct device *dev,
> + unsigned int count, unsigned long order_mask, gfp_t gfp,
> + unsigned long attrs)
>  {
>   struct page **pages;
>   unsigned int i = 0, array_size = count * sizeof(*pages);
> @@ -265,6 +272,20 @@ static struct page **__iommu_dma_alloc_pages(unsigned 
> int count,
>   /* IOMMU can map any pages, so himem can also be used here */
>   gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>  
> + if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> + int order = get_order(count << PAGE_SHIFT);
> + struct page *page;
> +
> + page = dma_alloc_from_contiguous(dev, count, order);
> + if (!page)
> + return NULL;
> +
> + while (count--)
> + pages[i++] = page++;
> +
> + return pages;
> + }
> +

This is really yuck. Plus it's entirely pointless to go through the
whole page array/scatterlist dance when we know the buffer is going to
be physically contiguous - it should just be allocate, map, done. I'd
much rather see standalone iommu_dma_{alloc,free}_contiguous()
functions, and let the arch code handle dispatching appropriately.

Robin.

>   while (count) {
>   struct page *page = NULL;
>   unsigned int order_size;
> @@ -294,7 +315,7 @@ static struct page **__iommu_dma_alloc_pages(unsigned int 
> count,
>   __free_pages(page, order);
>   }
>   if (!page) {
> - __iommu_dma_free_pages(pages, i);
> +