Re: [PATCH v2 01/31] af9005: don't do DMA on stack

2016-10-12 Thread Milton Miller II
On Tue, 11 Oct 2016 07:09:16 -0300, Mauro wrote:
>  struct af9005_device_state {
>   u8 sequence;
>   int led_state;
> + unsigned char data[256];
> + struct mutex data_mutex;
>  };


This will not work on DMA-incoherent architectures.  The buffer and
the mutex end up sharing a cache line, so when the data cache is
invalidated at the end of the DMA, any stores to the mutex that have
not yet been written back will be thrown away, and the system will
not be happy as changes to the mutex are lost.

The data buffer needs to be a separate kmalloc allocation, as is
somewhat obscurely documented around line 226 of DMA-API.txt[1].

A separate kmalloc will be aligned so that it sits in its own cache
lines.
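
To make the cache-line argument concrete, here is a rough userspace
sketch, not driver code.  It assumes a 64-byte cache line (the real
value varies by CPU), uses pthread_mutex_t as a stand-in for struct
mutex, and uses aligned_alloc() as a stand-in for a separate kmalloc.
The struct and function names are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64	/* assumption: typical size, not universal */

/* Layout from the patch: buffer embedded next to other fields. */
struct embedded_state {
	uint8_t sequence;
	int led_state;
	unsigned char data[256];
	pthread_mutex_t data_mutex;	/* stand-in for struct mutex */
};

/* Layout suggested in this reply: buffer allocated separately. */
struct split_state {
	uint8_t sequence;
	int led_state;
	unsigned char *data;
	pthread_mutex_t data_mutex;
};

/* Return 1 if [a, a+alen) and [b, b+blen) touch a common cache line. */
int shares_cache_line(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
	uintptr_t a_first = a / CACHE_LINE, a_last = (a + alen - 1) / CACHE_LINE;
	uintptr_t b_first = b / CACHE_LINE, b_last = (b + blen - 1) / CACHE_LINE;

	return a_first <= b_last && b_first <= a_last;
}

/* Check the embedded layout, assuming the struct starts on a line boundary. */
int embedded_buffer_shares_line(void)
{
	return shares_cache_line(offsetof(struct embedded_state, data), 256,
				 offsetof(struct embedded_state, data_mutex),
				 sizeof(pthread_mutex_t));
}

/* Check the split layout: the buffer covers whole cache lines on its own. */
int split_buffer_is_isolated(void)
{
	struct split_state st;
	int isolated;

	st.data = aligned_alloc(CACHE_LINE, 256);
	if (!st.data)
		return 0;
	isolated = !shares_cache_line((uintptr_t)st.data, 256,
				      (uintptr_t)&st.data_mutex,
				      sizeof(st.data_mutex));
	free(st.data);
	return isolated;
}
```

With the embedded layout, the last bytes of data[] and the first
bytes of the mutex fall into the same 64-byte line, which is the
case that breaks on invalidate.  With the split layout, the buffer
starts on a line boundary and spans exactly four lines, so nothing
else can share them.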

Excerpt from DMA-API.txt:
 Warnings:  Memory coherency operates at a granularity called the cache line 
width.  In order for memory mapped by this API to operate correctly, the mapped 
region must begin exactly on a cache line boundary and end exactly on one (to 
prevent two separately mapped regions from sharing a single cache line).  Since 
the cache line size may not be known at compile time, the API will not enforce 
this requirement.  Therefore, it is recommended that driver writers who don't 
take special care to determine the cache line size at run time only map virtual 
regions that begin and end on page boundaries (which are guaranteed also to be 
cache line boundaries). 

[1]https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/DMA-API.txt#n226

milton

PS: I normally read lkml from various web archives, sorry for
loss of threading and cc.


