subject:"Re\: \[PATCH v2\] dma\: Add Xilinx AXI Video Direct Memory Access Engine driver support"

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-10 Thread Srikanth Thokala

On Thu, Feb 6, 2014 at 9:23 PM, Lars-Peter Clausen  wrote:
> On 02/06/2014 02:34 PM, Srikanth Thokala wrote:
>>
>> On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen 
>> wrote:
>>>
>>> On 02/05/2014 05:25 PM, Srikanth Thokala wrote:


 On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala 
 wrote:
>
>
> Hi Vinod,
>
> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul 
> wrote:
>>
>>
>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>>
>>>
>>> Hi Lars/Vinod,
>
>
> The question here i think would be waht this device supports? Is
> the
> hardware
> capable of doing interleaved transfers, then would make sense.



 The hardware does 2D transfers. The parameters for a transfer are
 height,
 width and stride. That's only a subset of what interleaved transfers
 can be
 (xt->num_frames must be one for 2d transfers). But if I remember
 correctly
 there has been some discussion on this in the past and the result of
 that
 discussion was that using interleaved transfers for 2D transfers is
 preferred over adding a custom API for 2D transfers.
>>>
>>>
>>>
>>> I went through the prep_interleaved_dma API and I see only one
>>> descriptor
>>> is prepared per API call (i.e. per frame).  As our IP supports upto
>>> 16
>>> frame
>>> buffers (can be more in future), isn't it less efficient compared to
>>> the
>>> prep_slave_sg where we get a single sg list and can prepare all the
>>> descriptors
>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and
>>> let
>>> me
>>> know your opinions.
>>
>>
>> Well the descriptor maybe one, but that can represent multiple frames,
>> for
>> example 16 as in your case. Can you read up the documentation of how
>> multiple
>> frames are passed. Pls see include/linux/dmaengine.h
>>
>> /**
>>* Interleaved Transfer Request
>>* 
>>* A chunk is collection of contiguous bytes to be transfered.
>>* The gap(in bytes) between two chunks is called
>> inter-chunk-gap(ICG).
>>* ICGs may or maynot change between chunks.
>>* A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>*  that when repeated an integral number of times, specifies the
>> transfer.
>>* A transfer template is specification of a Frame, the number of
>> times
>>*  it is to be repeated and other per-transfer attributes.
>>*
>>* Practically, a client driver would have ready a template for each
>>*  type of transfer it is going to need during its lifetime and
>>*  set only 'src_start' and 'dst_start' before submitting the
>> requests.
>>*
>>*
>>*  |  Frame-1|   Frame-2   | ~ |
>> Frame-'numf'  |
>>*  |==.===...=...|==.===...=...| ~
>> |==.===...=...|
>>*
>>*==  Chunk size
>>*... ICG
>>*/
>
>
>
> Yes, it can handle multiple frames specified by 'numf' each of size
> 'frame_size * sgl[0].size'.
> But, I see it only works if all the frames' memory is contiguous and
> in this case we
> can just increment 'src_start' by the total frame size 'numf' number
> of times to fill in
> for each HW descriptor (each frame is one HW descriptor).  So, there
> is no issue when the
> memory is contiguous.  If the frames are non contiguous, we have to
> call this API for each
> frame (hence for each descriptor), as the src_start for each frame is
> different.  Is it correct?
>
> FYI: This hardware has an inbuilt Scatter-Gather engine.
>

 Ping?
>>>
>>>
>>>
>>> If you want to submit multiple frames at once I think you should look at
>>> how
>>> the current dmaengine API can be extended to allow that. And also provide
>>> an
>>> explanation on how this is superior over submitting them one by one.
>>
>>
>>
>> Sure.  I would start with explaning the current implementation of this
>> driver.
>>
>> Using prep_slave_sg(), we can define multiple segments in a
>> async_tx_descriptor where each frame is defined by a segment (a sg
>> list entry).  So, the slave device could DMA the data (of multiple
>> frames) with a descriptor by calling tx_submit in a transaction i.e.,
>>
>> prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)
>>
>> Using interleaved_dma(), we could not divide into segments when we
>> have scattered memory (for the reasons mentioned in above thread).
>> This implies we are restricting the slave device to process frame by
>> frame i.e.,
>>
>> interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
>> -> tx_submit (2) -> interrupt ->

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-10 Thread Srikanth Thokala

On Thu, Feb 6, 2014 at 9:23 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 02/06/2014 02:34 PM, Srikanth Thokala wrote:

 On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de
 wrote:

 On 02/05/2014 05:25 PM, Srikanth Thokala wrote:


 On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com
 wrote:


 Hi Vinod,

 On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com
 wrote:


 On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:


 Hi Lars/Vinod,


 The question here i think would be waht this device supports? Is
 the
 hardware
 capable of doing interleaved transfers, then would make sense.



 The hardware does 2D transfers. The parameters for a transfer are
 height,
 width and stride. That's only a subset of what interleaved transfers
 can be
 (xt-num_frames must be one for 2d transfers). But if I remember
 correctly
 there has been some discussion on this in the past and the result of
 that
 discussion was that using interleaved transfers for 2D transfers is
 preferred over adding a custom API for 2D transfers.



 I went through the prep_interleaved_dma API and I see only one
 descriptor
 is prepared per API call (i.e. per frame).  As our IP supports upto
 16
 frame
 buffers (can be more in future), isn't it less efficient compared to
 the
 prep_slave_sg where we get a single sg list and can prepare all the
 descriptors
 (of non-contiguous buffers) in one go?  Correct me, if am wrong and
 let
 me
 know your opinions.


 Well the descriptor maybe one, but that can represent multiple frames,
 for
 example 16 as in your case. Can you read up the documentation of how
 multiple
 frames are passed. Pls see include/linux/dmaengine.h

 /**
* Interleaved Transfer Request
* 
* A chunk is collection of contiguous bytes to be transfered.
* The gap(in bytes) between two chunks is called
 inter-chunk-gap(ICG).
* ICGs may or maynot change between chunks.
* A FRAME is the smallest series of contiguous {chunk,icg} pairs,
*  that when repeated an integral number of times, specifies the
 transfer.
* A transfer template is specification of a Frame, the number of
 times
*  it is to be repeated and other per-transfer attributes.
*
* Practically, a client driver would have ready a template for each
*  type of transfer it is going to need during its lifetime and
*  set only 'src_start' and 'dst_start' before submitting the
 requests.
*
*
*  |  Frame-1|   Frame-2   | ~ |
 Frame-'numf'  |
*  |==.===...=...|==.===...=...| ~
 |==.===...=...|
*
*==  Chunk size
*... ICG
*/



 Yes, it can handle multiple frames specified by 'numf' each of size
 'frame_size * sgl[0].size'.
 But, I see it only works if all the frames' memory is contiguous and
 in this case we
 can just increment 'src_start' by the total frame size 'numf' number
 of times to fill in
 for each HW descriptor (each frame is one HW descriptor).  So, there
 is no issue when the
 memory is contiguous.  If the frames are non contiguous, we have to
 call this API for each
 frame (hence for each descriptor), as the src_start for each frame is
 different.  Is it correct?

 FYI: This hardware has an inbuilt Scatter-Gather engine.


 Ping?



 If you want to submit multiple frames at once I think you should look at
 how
 the current dmaengine API can be extended to allow that. And also provide
 an
 explanation on how this is superior over submitting them one by one.



 Sure.  I would start with explaning the current implementation of this
 driver.

 Using prep_slave_sg(), we can define multiple segments in a
 async_tx_descriptor where each frame is defined by a segment (a sg
 list entry).  So, the slave device could DMA the data (of multiple
 frames) with a descriptor by calling tx_submit in a transaction i.e.,

 prep_slave_sg(16)  - tx_submit(1) - interrupt  (16 frames)

 Using interleaved_dma(), we could not divide into segments when we
 have scattered memory (for the reasons mentioned in above thread).
 This implies we are restricting the slave device to process frame by
 frame i.e.,

 interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2)
 - tx_submit (2) - interrupt -  tx_submit(16) - interrupt


 The API allows you to create and submit multiple interleaved descriptors
 before you have to issue them.

 interleaved_dma(1) - tx_submit(1) - interleaved_dma(2) - tx_submit(2) -
 ... - issue_pending() - interrupt


 This implementation makes the hardware to wait until the next frame is
 submitted.

 To overcome this, I feel it would be a good option if we could extend
 interleaved_dma template to modify src_start/dest_start to be a
 pointer to an array of addresses.  Here, number of addresses will be
 defined by numf. The other option would be to include scatterlist in
 the interleaved template. This way we can handle scattered memory
 using

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-06 Thread Lars-Peter Clausen


On 02/06/2014 02:34 PM, Srikanth Thokala wrote:

On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen  wrote:

On 02/05/2014 05:25 PM, Srikanth Thokala wrote:


On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala 
wrote:


Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul  wrote:


On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:


Hi Lars/Vinod,


The question here i think would be waht this device supports? Is the
hardware
capable of doing interleaved transfers, then would make sense.



The hardware does 2D transfers. The parameters for a transfer are
height,
width and stride. That's only a subset of what interleaved transfers
can be
(xt->num_frames must be one for 2d transfers). But if I remember
correctly
there has been some discussion on this in the past and the result of
that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.



I went through the prep_interleaved_dma API and I see only one
descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16
frame
buffers (can be more in future), isn't it less efficient compared to
the
prep_slave_sg where we get a single sg list and can prepare all the
descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let
me
know your opinions.


Well the descriptor maybe one, but that can represent multiple frames,
for
example 16 as in your case. Can you read up the documentation of how
multiple
frames are passed. Pls see include/linux/dmaengine.h

/**
   * Interleaved Transfer Request
   * 
   * A chunk is collection of contiguous bytes to be transfered.
   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
   * ICGs may or maynot change between chunks.
   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
   *  that when repeated an integral number of times, specifies the
transfer.
   * A transfer template is specification of a Frame, the number of times
   *  it is to be repeated and other per-transfer attributes.
   *
   * Practically, a client driver would have ready a template for each
   *  type of transfer it is going to need during its lifetime and
   *  set only 'src_start' and 'dst_start' before submitting the
requests.
   *
   *
   *  |  Frame-1|   Frame-2   | ~ |
Frame-'numf'  |
   *  |==.===...=...|==.===...=...| ~
|==.===...=...|
   *
   *==  Chunk size
   *... ICG
   */



Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.



Ping?



If you want to submit multiple frames at once I think you should look at how
the current dmaengine API can be extended to allow that. And also provide an
explanation on how this is superior over submitting them one by one.



Sure.  I would start with explaning the current implementation of this driver.

Using prep_slave_sg(), we can define multiple segments in a
async_tx_descriptor where each frame is defined by a segment (a sg
list entry).  So, the slave device could DMA the data (of multiple
frames) with a descriptor by calling tx_submit in a transaction i.e.,

prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)

Using interleaved_dma(), we could not divide into segments when we
have scattered memory (for the reasons mentioned in above thread).
This implies we are restricting the slave device to process frame by
frame i.e.,

interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
-> tx_submit (2) -> interrupt ->  tx_submit(16) -> interrupt



The API allows you to create and submit multiple interleaved descriptors 
before you have to issue them.


interleaved_dma(1) -> tx_submit(1) -> interleaved_dma(2) -> tx_submit(2) -> 
... -> issue_pending() -> interrupt



This implementation makes the hardware to wait until the next frame is
submitted.

To overcome this, I feel it would be a good option if we could extend
interleaved_dma template to modify src_start/dest_start to be a
pointer to an array of addresses.  Here, number of addresses will be
defined by numf. The other option would be to include scatterlist in
the interleaved template. This way we can handle scattered memory
using this API.


Each "frame" in a interleaved transfer describes a single line in your video 
frame (size = width, icg = stride). numf is the number of lines per video 
frame. So the suggested change does not make that much

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-06 Thread Srikanth Thokala

On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen  wrote:
> On 02/05/2014 05:25 PM, Srikanth Thokala wrote:
>>
>> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala 
>> wrote:
>>>
>>> Hi Vinod,
>>>
>>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul  wrote:

 On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>
> Hi Lars/Vinod,
>>>
>>> The question here i think would be waht this device supports? Is the
>>> hardware
>>> capable of doing interleaved transfers, then would make sense.
>>
>>
>> The hardware does 2D transfers. The parameters for a transfer are
>> height,
>> width and stride. That's only a subset of what interleaved transfers
>> can be
>> (xt->num_frames must be one for 2d transfers). But if I remember
>> correctly
>> there has been some discussion on this in the past and the result of
>> that
>> discussion was that using interleaved transfers for 2D transfers is
>> preferred over adding a custom API for 2D transfers.
>
>
> I went through the prep_interleaved_dma API and I see only one
> descriptor
> is prepared per API call (i.e. per frame).  As our IP supports upto 16
> frame
> buffers (can be more in future), isn't it less efficient compared to
> the
> prep_slave_sg where we get a single sg list and can prepare all the
> descriptors
> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let
> me
> know your opinions.

 Well the descriptor maybe one, but that can represent multiple frames,
 for
 example 16 as in your case. Can you read up the documentation of how
 multiple
 frames are passed. Pls see include/linux/dmaengine.h

 /**
   * Interleaved Transfer Request
   * 
   * A chunk is collection of contiguous bytes to be transfered.
   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
   * ICGs may or maynot change between chunks.
   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
   *  that when repeated an integral number of times, specifies the
 transfer.
   * A transfer template is specification of a Frame, the number of times
   *  it is to be repeated and other per-transfer attributes.
   *
   * Practically, a client driver would have ready a template for each
   *  type of transfer it is going to need during its lifetime and
   *  set only 'src_start' and 'dst_start' before submitting the
 requests.
   *
   *
   *  |  Frame-1|   Frame-2   | ~ |
 Frame-'numf'  |
   *  |==.===...=...|==.===...=...| ~
 |==.===...=...|
   *
   *==  Chunk size
   *... ICG
   */
>>>
>>>
>>> Yes, it can handle multiple frames specified by 'numf' each of size
>>> 'frame_size * sgl[0].size'.
>>> But, I see it only works if all the frames' memory is contiguous and
>>> in this case we
>>> can just increment 'src_start' by the total frame size 'numf' number
>>> of times to fill in
>>> for each HW descriptor (each frame is one HW descriptor).  So, there
>>> is no issue when the
>>> memory is contiguous.  If the frames are non contiguous, we have to
>>> call this API for each
>>> frame (hence for each descriptor), as the src_start for each frame is
>>> different.  Is it correct?
>>>
>>> FYI: This hardware has an inbuilt Scatter-Gather engine.
>>>
>>
>> Ping?
>
>
> If you want to submit multiple frames at once I think you should look at how
> the current dmaengine API can be extended to allow that. And also provide an
> explanation on how this is superior over submitting them one by one.


Sure.  I would start with explaning the current implementation of this driver.

Using prep_slave_sg(), we can define multiple segments in a
async_tx_descriptor where each frame is defined by a segment (a sg
list entry).  So, the slave device could DMA the data (of multiple
frames) with a descriptor by calling tx_submit in a transaction i.e.,

prep_slave_sg(16)  -> tx_submit(1) -> interrupt  (16 frames)

Using interleaved_dma(), we could not divide into segments when we
have scattered memory (for the reasons mentioned in above thread).
This implies we are restricting the slave device to process frame by
frame i.e.,

interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2)
-> tx_submit (2) -> interrupt ->  tx_submit(16) -> interrupt

This implementation makes the hardware to wait until the next frame is
submitted.

To overcome this, I feel it would be a good option if we could extend
interleaved_dma template to modify src_start/dest_start to be a
pointer to an array of addresses.  Here, number of addresses will be
defined by numf. The other option would be to include scatterlist in
the interleaved template. This way we can handle scattered memory
using this API.

Srikanth

>
> - Lars
>
>
> --
> To

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-06 Thread Srikanth Thokala

On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 02/05/2014 05:25 PM, Srikanth Thokala wrote:

 On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com
 wrote:

 Hi Vinod,

 On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote:

 On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:

 Hi Lars/Vinod,

 The question here i think would be waht this device supports? Is the
 hardware
 capable of doing interleaved transfers, then would make sense.


 The hardware does 2D transfers. The parameters for a transfer are
 height,
 width and stride. That's only a subset of what interleaved transfers
 can be
 (xt-num_frames must be one for 2d transfers). But if I remember
 correctly
 there has been some discussion on this in the past and the result of
 that
 discussion was that using interleaved transfers for 2D transfers is
 preferred over adding a custom API for 2D transfers.


 I went through the prep_interleaved_dma API and I see only one
 descriptor
 is prepared per API call (i.e. per frame).  As our IP supports upto 16
 frame
 buffers (can be more in future), isn't it less efficient compared to
 the
 prep_slave_sg where we get a single sg list and can prepare all the
 descriptors
 (of non-contiguous buffers) in one go?  Correct me, if am wrong and let
 me
 know your opinions.

 Well the descriptor maybe one, but that can represent multiple frames,
 for
 example 16 as in your case. Can you read up the documentation of how
 multiple
 frames are passed. Pls see include/linux/dmaengine.h

 /**
   * Interleaved Transfer Request
   * 
   * A chunk is collection of contiguous bytes to be transfered.
   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
   * ICGs may or maynot change between chunks.
   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
   *  that when repeated an integral number of times, specifies the
 transfer.
   * A transfer template is specification of a Frame, the number of times
   *  it is to be repeated and other per-transfer attributes.
   *
   * Practically, a client driver would have ready a template for each
   *  type of transfer it is going to need during its lifetime and
   *  set only 'src_start' and 'dst_start' before submitting the
 requests.
   *
   *
   *  |  Frame-1|   Frame-2   | ~ |
 Frame-'numf'  |
   *  |==.===...=...|==.===...=...| ~
 |==.===...=...|
   *
   *==  Chunk size
   *... ICG
   */


 Yes, it can handle multiple frames specified by 'numf' each of size
 'frame_size * sgl[0].size'.
 But, I see it only works if all the frames' memory is contiguous and
 in this case we
 can just increment 'src_start' by the total frame size 'numf' number
 of times to fill in
 for each HW descriptor (each frame is one HW descriptor).  So, there
 is no issue when the
 memory is contiguous.  If the frames are non contiguous, we have to
 call this API for each
 frame (hence for each descriptor), as the src_start for each frame is
 different.  Is it correct?

 FYI: This hardware has an inbuilt Scatter-Gather engine.


 Ping?


 If you want to submit multiple frames at once I think you should look at how
 the current dmaengine API can be extended to allow that. And also provide an
 explanation on how this is superior over submitting them one by one.


Sure.  I would start with explaning the current implementation of this driver.

Using prep_slave_sg(), we can define multiple segments in a
async_tx_descriptor where each frame is defined by a segment (a sg
list entry).  So, the slave device could DMA the data (of multiple
frames) with a descriptor by calling tx_submit in a transaction i.e.,

prep_slave_sg(16)  - tx_submit(1) - interrupt  (16 frames)

Using interleaved_dma(), we could not divide into segments when we
have scattered memory (for the reasons mentioned in above thread).
This implies we are restricting the slave device to process frame by
frame i.e.,

interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2)
- tx_submit (2) - interrupt -  tx_submit(16) - interrupt

This implementation makes the hardware to wait until the next frame is
submitted.

To overcome this, I feel it would be a good option if we could extend
interleaved_dma template to modify src_start/dest_start to be a
pointer to an array of addresses.  Here, number of addresses will be
defined by numf. The other option would be to include scatterlist in
the interleaved template. This way we can handle scattered memory
using this API.

Srikanth


 - Lars


 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-06 Thread Lars-Peter Clausen


On 02/06/2014 02:34 PM, Srikanth Thokala wrote:

On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de wrote:

On 02/05/2014 05:25 PM, Srikanth Thokala wrote:


On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com
wrote:


Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote:


On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:


Hi Lars/Vinod,


The question here i think would be waht this device supports? Is the
hardware
capable of doing interleaved transfers, then would make sense.



The hardware does 2D transfers. The parameters for a transfer are
height,
width and stride. That's only a subset of what interleaved transfers
can be
(xt-num_frames must be one for 2d transfers). But if I remember
correctly
there has been some discussion on this in the past and the result of
that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.



I went through the prep_interleaved_dma API and I see only one
descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16
frame
buffers (can be more in future), isn't it less efficient compared to
the
prep_slave_sg where we get a single sg list and can prepare all the
descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let
me
know your opinions.


Well the descriptor maybe one, but that can represent multiple frames,
for
example 16 as in your case. Can you read up the documentation of how
multiple
frames are passed. Pls see include/linux/dmaengine.h

/**
   * Interleaved Transfer Request
   * 
   * A chunk is collection of contiguous bytes to be transfered.
   * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
   * ICGs may or maynot change between chunks.
   * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
   *  that when repeated an integral number of times, specifies the
transfer.
   * A transfer template is specification of a Frame, the number of times
   *  it is to be repeated and other per-transfer attributes.
   *
   * Practically, a client driver would have ready a template for each
   *  type of transfer it is going to need during its lifetime and
   *  set only 'src_start' and 'dst_start' before submitting the
requests.
   *
   *
   *  |  Frame-1|   Frame-2   | ~ |
Frame-'numf'  |
   *  |==.===...=...|==.===...=...| ~
|==.===...=...|
   *
   *==  Chunk size
   *... ICG
   */



Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.



Ping?



If you want to submit multiple frames at once I think you should look at how
the current dmaengine API can be extended to allow that. And also provide an
explanation on how this is superior over submitting them one by one.



Sure.  I would start with explaning the current implementation of this driver.

Using prep_slave_sg(), we can define multiple segments in a
async_tx_descriptor where each frame is defined by a segment (a sg
list entry).  So, the slave device could DMA the data (of multiple
frames) with a descriptor by calling tx_submit in a transaction i.e.,

prep_slave_sg(16)  - tx_submit(1) - interrupt  (16 frames)

Using interleaved_dma(), we could not divide into segments when we
have scattered memory (for the reasons mentioned in above thread).
This implies we are restricting the slave device to process frame by
frame i.e.,

interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2)
- tx_submit (2) - interrupt -  tx_submit(16) - interrupt



The API allows you to create and submit multiple interleaved descriptors 
before you have to issue them.


interleaved_dma(1) - tx_submit(1) - interleaved_dma(2) - tx_submit(2) - 
... - issue_pending() - interrupt



This implementation makes the hardware to wait until the next frame is
submitted.

To overcome this, I feel it would be a good option if we could extend
interleaved_dma template to modify src_start/dest_start to be a
pointer to an array of addresses.  Here, number of addresses will be
defined by numf. The other option would be to include scatterlist in
the interleaved template. This way we can handle scattered memory
using this API.


Each frame in a interleaved transfer describes a single line in your video 
frame (size = width, icg = stride). numf is the number of lines per video 
frame. So the

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-05 Thread Lars-Peter Clausen


On 02/05/2014 05:25 PM, Srikanth Thokala wrote:

On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala  wrote:

Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul  wrote:

On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:

Hi Lars/Vinod,

The question here i think would be waht this device supports? Is the hardware
capable of doing interleaved transfers, then would make sense.


The hardware does 2D transfers. The parameters for a transfer are height,
width and stride. That's only a subset of what interleaved transfers can be
(xt->num_frames must be one for 2d transfers). But if I remember correctly
there has been some discussion on this in the past and the result of that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.


I went through the prep_interleaved_dma API and I see only one descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
buffers (can be more in future), isn't it less efficient compared to the
prep_slave_sg where we get a single sg list and can prepare all the descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
know your opinions.

Well the descriptor maybe one, but that can represent multiple frames, for
example 16 as in your case. Can you read up the documentation of how multiple
frames are passed. Pls see include/linux/dmaengine.h

/**
  * Interleaved Transfer Request
  * 
  * A chunk is collection of contiguous bytes to be transfered.
  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
  * ICGs may or maynot change between chunks.
  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
  *  that when repeated an integral number of times, specifies the transfer.
  * A transfer template is specification of a Frame, the number of times
  *  it is to be repeated and other per-transfer attributes.
  *
  * Practically, a client driver would have ready a template for each
  *  type of transfer it is going to need during its lifetime and
  *  set only 'src_start' and 'dst_start' before submitting the requests.
  *
  *
  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
  *
  *==  Chunk size
  *... ICG
  */


Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.



Ping?


If you want to submit multiple frames at once I think you should look at how 
the current dmaengine API can be extended to allow that. And also provide an 
explanation on how this is superior over submitting them one by one.


- Lars

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-05 Thread Srikanth Thokala

On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala  wrote:
> Hi Vinod,
>
> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul  wrote:
>> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>>> Hi Lars/Vinod,
>>> >> The question here i think would be waht this device supports? Is the 
>>> >> hardware
>>> >> capable of doing interleaved transfers, then would make sense.
>>> >
>>> > The hardware does 2D transfers. The parameters for a transfer are height,
>>> > width and stride. That's only a subset of what interleaved transfers can 
>>> > be
>>> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
>>> > there has been some discussion on this in the past and the result of that
>>> > discussion was that using interleaved transfers for 2D transfers is
>>> > preferred over adding a custom API for 2D transfers.
>>>
>>> I went through the prep_interleaved_dma API and I see only one descriptor
>>> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
>>> buffers (can be more in future), isn't it less efficient compared to the
>>> prep_slave_sg where we get a single sg list and can prepare all the 
>>> descriptors
>>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
>>> know your opinions.
>> Well the descriptor maybe one, but that can represent multiple frames, for
>> example 16 as in your case. Can you read up the documentation of how multiple
>> frames are passed. Pls see include/linux/dmaengine.h
>>
>> /**
>>  * Interleaved Transfer Request
>>  * 
>>  * A chunk is collection of contiguous bytes to be transfered.
>>  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>>  * ICGs may or maynot change between chunks.
>>  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>>  *  that when repeated an integral number of times, specifies the transfer.
>>  * A transfer template is specification of a Frame, the number of times
>>  *  it is to be repeated and other per-transfer attributes.
>>  *
>>  * Practically, a client driver would have ready a template for each
>>  *  type of transfer it is going to need during its lifetime and
>>  *  set only 'src_start' and 'dst_start' before submitting the requests.
>>  *
>>  *
>>  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
>>  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
>>  *
>>  *==  Chunk size
>>  *... ICG
>>  */
>
> Yes, it can handle multiple frames specified by 'numf' each of size
> 'frame_size * sgl[0].size'.
> But, I see it only works if all the frames' memory is contiguous and
> in this case we
> can just increment 'src_start' by the total frame size 'numf' number
> of times to fill in
> for each HW descriptor (each frame is one HW descriptor).  So, there
> is no issue when the
> memory is contiguous.  If the frames are non contiguous, we have to
> call this API for each
> frame (hence for each descriptor), as the src_start for each frame is
> different.  Is it correct?
>
> FYI: This hardware has an inbuilt Scatter-Gather engine.
>

Ping?


> Srikanth
>
>>
>> --
>> ~Vinod
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-05 Thread Srikanth Thokala

On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote:
 Hi Vinod,

 On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote:
 On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
 Hi Lars/Vinod,
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
  The hardware does 2D transfers. The parameters for a transfer are height,
  width and stride. That's only a subset of what interleaved transfers can 
  be
  (xt-num_frames must be one for 2d transfers). But if I remember correctly
  there has been some discussion on this in the past and the result of that
  discussion was that using interleaved transfers for 2D transfers is
  preferred over adding a custom API for 2D transfers.

 I went through the prep_interleaved_dma API and I see only one descriptor
 is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
 buffers (can be more in future), isn't it less efficient compared to the
 prep_slave_sg where we get a single sg list and can prepare all the 
 descriptors
 (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
 know your opinions.
 Well the descriptor maybe one, but that can represent multiple frames, for
 example 16 as in your case. Can you read up the documentation of how multiple
 frames are passed. Pls see include/linux/dmaengine.h

 /**
  * Interleaved Transfer Request
  * 
  * A chunk is collection of contiguous bytes to be transfered.
  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
  * ICGs may or maynot change between chunks.
  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
  *  that when repeated an integral number of times, specifies the transfer.
  * A transfer template is specification of a Frame, the number of times
  *  it is to be repeated and other per-transfer attributes.
  *
  * Practically, a client driver would have ready a template for each
  *  type of transfer it is going to need during its lifetime and
  *  set only 'src_start' and 'dst_start' before submitting the requests.
  *
  *
  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
  *
  *==  Chunk size
  *... ICG
  */

 Yes, it can handle multiple frames specified by 'numf' each of size
 'frame_size * sgl[0].size'.
 But, I see it only works if all the frames' memory is contiguous and
 in this case we
 can just increment 'src_start' by the total frame size 'numf' number
 of times to fill in
 for each HW descriptor (each frame is one HW descriptor).  So, there
 is no issue when the
 memory is contiguous.  If the frames are non contiguous, we have to
 call this API for each
 frame (hence for each descriptor), as the src_start for each frame is
 different.  Is it correct?

 FYI: This hardware has an inbuilt Scatter-Gather engine.


Ping?


 Srikanth


 --
 ~Vinod
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-05 Thread Lars-Peter Clausen


On 02/05/2014 05:25 PM, Srikanth Thokala wrote:

On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote:

Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote:

On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:

Hi Lars/Vinod,

The question here i think would be waht this device supports? Is the hardware
capable of doing interleaved transfers, then would make sense.


The hardware does 2D transfers. The parameters for a transfer are height,
width and stride. That's only a subset of what interleaved transfers can be
(xt-num_frames must be one for 2d transfers). But if I remember correctly
there has been some discussion on this in the past and the result of that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.


I went through the prep_interleaved_dma API and I see only one descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
buffers (can be more in future), isn't it less efficient compared to the
prep_slave_sg where we get a single sg list and can prepare all the descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
know your opinions.

Well the descriptor maybe one, but that can represent multiple frames, for
example 16 as in your case. Can you read up the documentation of how multiple
frames are passed. Pls see include/linux/dmaengine.h

/**
  * Interleaved Transfer Request
  * 
  * A chunk is collection of contiguous bytes to be transfered.
  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
  * ICGs may or maynot change between chunks.
  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
  *  that when repeated an integral number of times, specifies the transfer.
  * A transfer template is specification of a Frame, the number of times
  *  it is to be repeated and other per-transfer attributes.
  *
  * Practically, a client driver would have ready a template for each
  *  type of transfer it is going to need during its lifetime and
  *  set only 'src_start' and 'dst_start' before submitting the requests.
  *
  *
  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
  *
  *==  Chunk size
  *... ICG
  */


Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.



Ping?


If you want to submit multiple frames at once I think you should look at how 
the current dmaengine API can be extended to allow that. And also provide an 
explanation on how this is superior over submitting them one by one.


- Lars

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-04 Thread Srikanth Thokala

On Tue, Feb 4, 2014 at 10:58 AM, Vinod Koul  wrote:
> On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
>> >>> >> [...]
>> >>> >>> +/**
>> >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> >>> >>> + * @dchan: DMA Channel pointer
>> >>> >>> + * @cmd: DMA control command
>> >>> >>> + * @arg: Channel configuration
>> >>> >>> + *
>> >>> >>> + * Return: '0' on success and failure value on error
>> >>> >>> + */
>> >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> >>> >>> +   enum dma_ctrl_cmd cmd, unsigned 
>> >>> >>> long arg)
>> >>> >>> +{
>> >>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> >>> >>> +
>> >>> >>> + switch (cmd) {
>> >>> >>> + case DMA_TERMINATE_ALL:
>> >>> >>> + xilinx_vdma_terminate_all(chan);
>> >>> >>> + return 0;
>> >>> >>> + case DMA_SLAVE_CONFIG:
>> >>> >>> + return xilinx_vdma_slave_config(chan,
>> >>> >>> + (struct xilinx_vdma_config 
>> >>> >>> *)arg);
>> >>> >>
>> >>> >> You really shouldn't be overloading the generic API with your own 
>> >>> >> semantics.
>> >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>> >>> >
>> >>> > Ok.  The driver needs few additional configuration from the slave
>> >>> > device like Vertical
>> >>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that 
>> >>> > case do you
>> >>> > suggest me to define a separate dma_ctrl_cmd like the one 
>> >>> > FSLDMA_EXTERNAL_START
>> >>> > defined for Freescale drivers?
>> >>>
>> >>> In my opinion it is not a good idea to have driver implement a generic 
>> >>> API,
>> >>> but at the same time let the driver have custom semantics for those API
>> >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> >>> the purpose of a generic API, namely that you are able to write generic 
>> >>> code
>> >>> that makes use of the API without having to know about which 
>> >>> implementation
>> >>> API it is talking to. The dmaengine framework provides the
>> >>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>> >>
>> >> The question here i think would be waht this device supports? Is the 
>> >> hardware
>> >> capable of doing interleaved transfers, then would make sense.
>> >>
>> >> While we do try to get users use dma_slave_config, but there will always 
>> >> be
>> >> someone who have specfic params. If we can generalize then we might want 
>> >> to add
>> >> to the dma_slave_config as well
>> >
>> > There are many configuration parameters which are specific to IP and I
>> > would like to
>> > give an overview of some of parameteres here:
>> >
>> > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
>> > referenced by
>> > 'cfg->park_frm', so user will have control on each frame in this mode.
>> >
>> > 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
>> > threshold. This value
>> >determines the number of frame buffers to process. To use this feature,
>> >'cfg->frm_cnt_en' should be set.
>> >
>> > 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
>> > external/internal frame
>> > synchronization source. Used to synchronize one channel (MM2S/S2MM) 
>> > with
>> > another (S2MM/MM2S) channel.
>> >
>> > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate 
>> > between
>> > master and slave.  In master mode (cfg->master), frames are not 
>> > dropped and
>> > slave can drop frames to adjust to master frame rate.
>> >
>> > And in future, this Engine being a soft IP, we could expect some more 
>> > additional
>> > parameters.  Isn't a good idea to have a private member in 
>> > dma_slave_config for
>> > sharing additional configuration between slave device and dma engine? Or a 
>> > new
>> > dma_ctrl_cmd like FSLDMA_EXTERNAL_START?
>
> The idea of a generic API is that we can use it for most of the controllers. 
> Even
> if you are planning to support a family of controllers
>
> ATM, lets not discuss the possiblity of private member and try to exhanust all
> possible options. Worst case you can embed the dma_slave_config in
> xilinx_dma_slave_config and retrieve it in dmac driver

Ok.

Srikanth

>
> --
> ~Vinod
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-04 Thread Srikanth Thokala

On Tue, Feb 4, 2014 at 10:58 AM, Vinod Koul vinod.k...@intel.com wrote:
 On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
   [...]
   +/**
   + * xilinx_vdma_device_control - Configure DMA channel of the device
   + * @dchan: DMA Channel pointer
   + * @cmd: DMA control command
   + * @arg: Channel configuration
   + *
   + * Return: '0' on success and failure value on error
   + */
   +static int xilinx_vdma_device_control(struct dma_chan *dchan,
   +   enum dma_ctrl_cmd cmd, unsigned 
   long arg)
   +{
   + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
   +
   + switch (cmd) {
   + case DMA_TERMINATE_ALL:
   + xilinx_vdma_terminate_all(chan);
   + return 0;
   + case DMA_SLAVE_CONFIG:
   + return xilinx_vdma_slave_config(chan,
   + (struct xilinx_vdma_config 
   *)arg);
  
   You really shouldn't be overloading the generic API with your own 
   semantics.
   DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
  
   Ok.  The driver needs few additional configuration from the slave
   device like Vertical
   Size, Horizontal Size,  Stride etc., for the DMA transfers, in that 
   case do you
   suggest me to define a separate dma_ctrl_cmd like the one 
   FSLDMA_EXTERNAL_START
   defined for Freescale drivers?
 
  In my opinion it is not a good idea to have driver implement a generic 
  API,
  but at the same time let the driver have custom semantics for those API
  calls. It's a bit like having a gpio driver that expects 23 and 42 as the
  values passed to gpio_set_value instead of 0 and 1. It completely defeats
  the purpose of a generic API, namely that you are able to write generic 
  code
  that makes use of the API without having to know about which 
  implementation
  API it is talking to. The dmaengine framework provides the
  dmaengine_prep_interleaved_dma() function to setup two dimensional
  transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
 
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
  While we do try to get users use dma_slave_config, but there will always 
  be
  someone who have specfic params. If we can generalize then we might want 
  to add
  to the dma_slave_config as well
 
  There are many configuration parameters which are specific to IP and I
  would like to
  give an overview of some of parameteres here:
 
  1) Park Mode ('cfg-park'): In Park mode, engine will park on frame
  referenced by
  'cfg-park_frm', so user will have control on each frame in this mode.
 
  2) Interrupt Coalesce ('cfg-coalesce'):  Used for setting interrupt
  threshold. This value
 determines the number of frame buffers to process. To use this feature,
 'cfg-frm_cnt_en' should be set.
 
  3) Frame Synchronization Source ('cfg-ext_fsync'):  Can be an
  external/internal frame
  synchronization source. Used to synchronize one channel (MM2S/S2MM) 
  with
  another (S2MM/MM2S) channel.
 
  4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate 
  between
  master and slave.  In master mode (cfg-master), frames are not 
  dropped and
  slave can drop frames to adjust to master frame rate.
 
  And in future, this Engine being a soft IP, we could expect some more 
  additional
  parameters.  Isn't a good idea to have a private member in 
  dma_slave_config for
  sharing additional configuration between slave device and dma engine? Or a 
  new
  dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

 The idea of a generic API is that we can use it for most of the controllers. 
 Even
 if you are planning to support a family of controllers

 ATM, lets not discuss the possiblity of private member and try to exhanust all
 possible options. Worst case you can embed the dma_slave_config in
 xilinx_dma_slave_config and retrieve it in dmac driver

Ok.

Srikanth


 --
 ~Vinod
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-03 Thread Vinod Koul

On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
> >>> >> [...]
> >>> >>> +/**
> >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> >>> + * @dchan: DMA Channel pointer
> >>> >>> + * @cmd: DMA control command
> >>> >>> + * @arg: Channel configuration
> >>> >>> + *
> >>> >>> + * Return: '0' on success and failure value on error
> >>> >>> + */
> >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> >>> +   enum dma_ctrl_cmd cmd, unsigned 
> >>> >>> long arg)
> >>> >>> +{
> >>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> >>> +
> >>> >>> + switch (cmd) {
> >>> >>> + case DMA_TERMINATE_ALL:
> >>> >>> + xilinx_vdma_terminate_all(chan);
> >>> >>> + return 0;
> >>> >>> + case DMA_SLAVE_CONFIG:
> >>> >>> + return xilinx_vdma_slave_config(chan,
> >>> >>> + (struct xilinx_vdma_config 
> >>> >>> *)arg);
> >>> >>
> >>> >> You really shouldn't be overloading the generic API with your own 
> >>> >> semantics.
> >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> >>> >
> >>> > Ok.  The driver needs few additional configuration from the slave
> >>> > device like Vertical
> >>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that 
> >>> > case do you
> >>> > suggest me to define a separate dma_ctrl_cmd like the one 
> >>> > FSLDMA_EXTERNAL_START
> >>> > defined for Freescale drivers?
> >>>
> >>> In my opinion it is not a good idea to have driver implement a generic 
> >>> API,
> >>> but at the same time let the driver have custom semantics for those API
> >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> >>> the purpose of a generic API, namely that you are able to write generic 
> >>> code
> >>> that makes use of the API without having to know about which 
> >>> implementation
> >>> API it is talking to. The dmaengine framework provides the
> >>> dmaengine_prep_interleaved_dma() function to setup two dimensional
> >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> >>
> >> The question here i think would be waht this device supports? Is the 
> >> hardware
> >> capable of doing interleaved transfers, then would make sense.
> >>
> >> While we do try to get users use dma_slave_config, but there will always be
> >> someone who have specfic params. If we can generalize then we might want 
> >> to add
> >> to the dma_slave_config as well
> >
> > There are many configuration parameters which are specific to IP and I
> > would like to
> > give an overview of some of parameteres here:
> >
> > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
> > referenced by
> > 'cfg->park_frm', so user will have control on each frame in this mode.
> >
> > 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
> > threshold. This value
> >determines the number of frame buffers to process. To use this feature,
> >'cfg->frm_cnt_en' should be set.
> >
> > 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
> > external/internal frame
> > synchronization source. Used to synchronize one channel (MM2S/S2MM) with
> > another (S2MM/MM2S) channel.
> >
> > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate 
> > between
> > master and slave.  In master mode (cfg->master), frames are not dropped 
> > and
> > slave can drop frames to adjust to master frame rate.
> >
> > And in future, this Engine being a soft IP, we could expect some more 
> > additional
> > parameters.  Isn't a good idea to have a private member in dma_slave_config 
> > for
> > sharing additional configuration between slave device and dma engine? Or a 
> > new
> > dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

The idea of a generic API is that we can use it for most of the controllers. 
Even
if you are planning to support a family of controllers

ATM, lets not discuss the possiblity of private member and try to exhanust all
possible options. Worst case you can embed the dma_slave_config in
xilinx_dma_slave_config and retrieve it in dmac driver

-- 
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-03 Thread Vinod Koul

On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote:
   [...]
   +/**
   + * xilinx_vdma_device_control - Configure DMA channel of the device
   + * @dchan: DMA Channel pointer
   + * @cmd: DMA control command
   + * @arg: Channel configuration
   + *
   + * Return: '0' on success and failure value on error
   + */
   +static int xilinx_vdma_device_control(struct dma_chan *dchan,
   +   enum dma_ctrl_cmd cmd, unsigned 
   long arg)
   +{
   + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
   +
   + switch (cmd) {
   + case DMA_TERMINATE_ALL:
   + xilinx_vdma_terminate_all(chan);
   + return 0;
   + case DMA_SLAVE_CONFIG:
   + return xilinx_vdma_slave_config(chan,
   + (struct xilinx_vdma_config 
   *)arg);
  
   You really shouldn't be overloading the generic API with your own 
   semantics.
   DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
  
   Ok.  The driver needs few additional configuration from the slave
   device like Vertical
   Size, Horizontal Size,  Stride etc., for the DMA transfers, in that 
   case do you
   suggest me to define a separate dma_ctrl_cmd like the one 
   FSLDMA_EXTERNAL_START
   defined for Freescale drivers?
 
  In my opinion it is not a good idea to have driver implement a generic 
  API,
  but at the same time let the driver have custom semantics for those API
  calls. It's a bit like having a gpio driver that expects 23 and 42 as the
  values passed to gpio_set_value instead of 0 and 1. It completely defeats
  the purpose of a generic API, namely that you are able to write generic 
  code
  that makes use of the API without having to know about which 
  implementation
  API it is talking to. The dmaengine framework provides the
  dmaengine_prep_interleaved_dma() function to setup two dimensional
  transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
 
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
  While we do try to get users use dma_slave_config, but there will always be
  someone who have specfic params. If we can generalize then we might want 
  to add
  to the dma_slave_config as well
 
  There are many configuration parameters which are specific to IP and I
  would like to
  give an overview of some of parameteres here:
 
  1) Park Mode ('cfg-park'): In Park mode, engine will park on frame
  referenced by
  'cfg-park_frm', so user will have control on each frame in this mode.
 
  2) Interrupt Coalesce ('cfg-coalesce'):  Used for setting interrupt
  threshold. This value
 determines the number of frame buffers to process. To use this feature,
 'cfg-frm_cnt_en' should be set.
 
  3) Frame Synchronization Source ('cfg-ext_fsync'):  Can be an
  external/internal frame
  synchronization source. Used to synchronize one channel (MM2S/S2MM) with
  another (S2MM/MM2S) channel.
 
  4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate 
  between
  master and slave.  In master mode (cfg-master), frames are not dropped 
  and
  slave can drop frames to adjust to master frame rate.
 
  And in future, this Engine being a soft IP, we could expect some more 
  additional
  parameters.  Isn't a good idea to have a private member in dma_slave_config 
  for
  sharing additional configuration between slave device and dma engine? Or a 
  new
  dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

The idea of a generic API is that we can use it for most of the controllers. 
Even
if you are planning to support a family of controllers

ATM, lets not discuss the possiblity of private member and try to exhanust all
possible options. Worst case you can embed the dma_slave_config in
xilinx_dma_slave_config and retrieve it in dmac driver

-- 
~Vinod
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-01 Thread Lars-Peter Clausen


On 01/31/2014 06:44 PM, Andy Gross wrote:

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:

On 01/24/2014 12:16 PM, Srikanth Thokala wrote:

Hi Lars,

On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:

On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
[...]

+/**
+ * xilinx_vdma_device_control - Configure DMA channel of the device
+ * @dchan: DMA Channel pointer
+ * @cmd: DMA control command
+ * @arg: Channel configuration
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_device_control(struct dma_chan *dchan,
+   enum dma_ctrl_cmd cmd, unsigned long arg)
+{
+ struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+ switch (cmd) {
+ case DMA_TERMINATE_ALL:
+ xilinx_vdma_terminate_all(chan);
+ return 0;
+ case DMA_SLAVE_CONFIG:
+ return xilinx_vdma_slave_config(chan,
+ (struct xilinx_vdma_config *)arg);


You really shouldn't be overloading the generic API with your own semantics.
DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.


Ok.  The driver needs few additional configuration from the slave
device like Vertical
Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
defined for Freescale drivers?


In my opinion it is not a good idea to have driver implement a generic API,
but at the same time let the driver have custom semantics for those API
calls. It's a bit like having a gpio driver that expects 23 and 42 as the
values passed to gpio_set_value instead of 0 and 1. It completely defeats
the purpose of a generic API, namely that you are able to write generic code
that makes use of the API without having to know about which implementation
API it is talking to. The dmaengine framework provides the
dmaengine_prep_interleaved_dma() function to setup two dimensional
transfers, e.g. take a look at sirf-dma.c or imx-dma.c.



The comments in the include/linux/dmaengine.h state that if you have
non-generic, non-fixed configuration then you can just create your own
structure and embed the dma_slave_config.  Using the container_of you can get
back your structure.


We should probably revise that, since it is not going to work that well.



I agree that we should always use the generic structure if possible, but
sometimes there are some non-standard things that you have to do for your
hardware.  I am currently in a bind for adding some quirky features that are
required by peripherals who want to use the QCOM DMA devices.


Well there are two types of extensions to the API. The first type changes the 
semantics of the API so it is no longer possible to use the API without knowing 
about the extension. This is in my opinion a complete no-go since goes against 
the very idea of a common API. If you implement the common API with custom 
semantics you have a custom API. It's just better hidden since you use the same 
function names. My opinion on this is if you want/need a custom API make it a 
custom API with custom function names. This on one hand avoids confusion about 
the behavior and on the other hand reduces the maintenance burden for the 
common API (e.g. if somebody makes changes to the common API they don't have to 
bother to update your driver and don't have to try to understand the custom 
semantics). The other kind of extensions are those that add additional 
functionality on top of the common API, while keeping the normal semantics for 
the common API. Which means a user that does not know about the extensions is 
still able to function. A user that knows about the extension can make use of 
the additional features.


That said, everybody always thinks their hardware is special and requires 
special extensions. Usually this is not the case, there will always sooner or 
later somebody else who needs the same extensions. The dmaengine API is not set 
in stone, so if you think something is missing to properly to support your 
hardware it is worth investigating if it makes sense to add the missing parts 
to the common API. As I said before the whole point of the exercise of having a 
common API is that we want to abstract away (hardware) implementation specific 
details. This allows the upper layers to have platform independent common code 
to take care of setting up the DMA transfers. E.g. in ALSA subsystem we went 
from 10+ custom implementations of a PCM driver build on top of dmaengine to 1 
generic implementation that is shared between platforms. All those custom PCM 
drivers had hardcoded assumptions about the behavior and features of the 
underlying dmaengine driver. To be able to have one generic PCM driver it was 
necessarily to extend the dmaengine API to be able to expose these differences 
in features and behavior. So as I said the API is not set in stone if it is 
necessary to extend or

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-02-01 Thread Lars-Peter Clausen


On 01/31/2014 06:44 PM, Andy Gross wrote:

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:

On 01/24/2014 12:16 PM, Srikanth Thokala wrote:

Hi Lars,

On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:

On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
[...]

+/**
+ * xilinx_vdma_device_control - Configure DMA channel of the device
+ * @dchan: DMA Channel pointer
+ * @cmd: DMA control command
+ * @arg: Channel configuration
+ *
+ * Return: '0' on success and failure value on error
+ */
+static int xilinx_vdma_device_control(struct dma_chan *dchan,
+   enum dma_ctrl_cmd cmd, unsigned long arg)
+{
+ struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
+
+ switch (cmd) {
+ case DMA_TERMINATE_ALL:
+ xilinx_vdma_terminate_all(chan);
+ return 0;
+ case DMA_SLAVE_CONFIG:
+ return xilinx_vdma_slave_config(chan,
+ (struct xilinx_vdma_config *)arg);


You really shouldn't be overloading the generic API with your own semantics.
DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.


Ok.  The driver needs few additional configuration from the slave
device like Vertical
Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
defined for Freescale drivers?


In my opinion it is not a good idea to have driver implement a generic API,
but at the same time let the driver have custom semantics for those API
calls. It's a bit like having a gpio driver that expects 23 and 42 as the
values passed to gpio_set_value instead of 0 and 1. It completely defeats
the purpose of a generic API, namely that you are able to write generic code
that makes use of the API without having to know about which implementation
API it is talking to. The dmaengine framework provides the
dmaengine_prep_interleaved_dma() function to setup two dimensional
transfers, e.g. take a look at sirf-dma.c or imx-dma.c.



The comments in the include/linux/dmaengine.h state that if you have
non-generic, non-fixed configuration then you can just create your own
structure and embed the dma_slave_config.  Using the container_of you can get
back your structure.


We should probably revise that, since it is not going to work that well.



I agree that we should always use the generic structure if possible, but
sometimes there are some non-standard things that you have to do for your
hardware.  I am currently in a bind for adding some quirky features that are
required by peripherals who want to use the QCOM DMA devices.


Well there are two types of extensions to the API. The first type changes the 
semantics of the API so it is no longer possible to use the API without knowing 
about the extension. This is in my opinion a complete no-go since goes against 
the very idea of a common API. If you implement the common API with custom 
semantics you have a custom API. It's just better hidden since you use the same 
function names. My opinion on this is if you want/need a custom API make it a 
custom API with custom function names. This on one hand avoids confusion about 
the behavior and on the other hand reduces the maintenance burden for the 
common API (e.g. if somebody makes changes to the common API they don't have to 
bother to update your driver and don't have to try to understand the custom 
semantics). The other kind of extensions are those that add additional 
functionality on top of the common API, while keeping the normal semantics for 
the common API. Which means a user that does not know about the extensions is 
still able to function. A user that knows about the extension can make use of 
the additional features.


That said, everybody always thinks their hardware is special and requires 
special extensions. Usually this is not the case, there will always sooner or 
later somebody else who needs the same extensions. The dmaengine API is not set 
in stone, so if you think something is missing to properly to support your 
hardware it is worth investigating if it makes sense to add the missing parts 
to the common API. As I said before the whole point of the exercise of having a 
common API is that we want to abstract away (hardware) implementation specific 
details. This allows the upper layers to have platform independent common code 
to take care of setting up the DMA transfers. E.g. in ALSA subsystem we went 
from 10+ custom implementations of a PCM driver build on top of dmaengine to 1 
generic implementation that is shared between platforms. All those custom PCM 
drivers had hardcoded assumptions about the behavior and features of the 
underlying dmaengine driver. To be able to have one generic PCM driver it was 
necessarily to extend the dmaengine API to be able to expose these differences 
in features and behavior. So as I said the API is not set in stone if it is 
necessary

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-31 Thread Andy Gross

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> > Hi Lars,
> > 
> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >> [...]
> >>> +/**
> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> + * @dchan: DMA Channel pointer
> >>> + * @cmd: DMA control command
> >>> + * @arg: Channel configuration
> >>> + *
> >>> + * Return: '0' on success and failure value on error
> >>> + */
> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> +   enum dma_ctrl_cmd cmd, unsigned long 
> >>> arg)
> >>> +{
> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> +
> >>> + switch (cmd) {
> >>> + case DMA_TERMINATE_ALL:
> >>> + xilinx_vdma_terminate_all(chan);
> >>> + return 0;
> >>> + case DMA_SLAVE_CONFIG:
> >>> + return xilinx_vdma_slave_config(chan,
> >>> + (struct xilinx_vdma_config *)arg);
> >>
> >> You really shouldn't be overloading the generic API with your own 
> >> semantics.
> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> > 
> > Ok.  The driver needs few additional configuration from the slave
> > device like Vertical
> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
> > you
> > suggest me to define a separate dma_ctrl_cmd like the one 
> > FSLDMA_EXTERNAL_START
> > defined for Freescale drivers?
> 
> In my opinion it is not a good idea to have driver implement a generic API,
> but at the same time let the driver have custom semantics for those API
> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> the purpose of a generic API, namely that you are able to write generic code
> that makes use of the API without having to know about which implementation
> API it is talking to. The dmaengine framework provides the
> dmaengine_prep_interleaved_dma() function to setup two dimensional
> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> 

The comments in the include/linux/dmaengine.h state that if you have
non-generic, non-fixed configuration then you can just create your own
structure and embed the dma_slave_config.  Using the container_of you can get
back your structure.

I agree that we should always use the generic structure if possible, but
sometimes there are some non-standard things that you have to do for your
hardware.  I am currently in a bind for adding some quirky features that are
required by peripherals who want to use the QCOM DMA devices.

If the context field in prep_slave_sg and prep_dma_cyclic was exposed to
everyone, that would allow an easy way to pass in hardware specific
configuration without bastardizing the slave_config.  I noticed that rapidio is
the only consumer of that field and that they have their own prep function.

If we are not going to allow people to do their own slave_config when they need
to, then we need to remove the comments from the include file and expose the
context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic.

-- 
sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-31 Thread Andy Gross

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
  Hi Lars,
  
  On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  [...]
  +/**
  + * xilinx_vdma_device_control - Configure DMA channel of the device
  + * @dchan: DMA Channel pointer
  + * @cmd: DMA control command
  + * @arg: Channel configuration
  + *
  + * Return: '0' on success and failure value on error
  + */
  +static int xilinx_vdma_device_control(struct dma_chan *dchan,
  +   enum dma_ctrl_cmd cmd, unsigned long 
  arg)
  +{
  + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
  +
  + switch (cmd) {
  + case DMA_TERMINATE_ALL:
  + xilinx_vdma_terminate_all(chan);
  + return 0;
  + case DMA_SLAVE_CONFIG:
  + return xilinx_vdma_slave_config(chan,
  + (struct xilinx_vdma_config *)arg);
 
  You really shouldn't be overloading the generic API with your own 
  semantics.
  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
  
  Ok.  The driver needs few additional configuration from the slave
  device like Vertical
  Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
  you
  suggest me to define a separate dma_ctrl_cmd like the one 
  FSLDMA_EXTERNAL_START
  defined for Freescale drivers?
 
 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
 

The comments in the include/linux/dmaengine.h state that if you have
non-generic, non-fixed configuration then you can just create your own
structure and embed the dma_slave_config.  Using the container_of you can get
back your structure.

I agree that we should always use the generic structure if possible, but
sometimes there are some non-standard things that you have to do for your
hardware.  I am currently in a bind for adding some quirky features that are
required by peripherals who want to use the QCOM DMA devices.

If the context field in prep_slave_sg and prep_dma_cyclic was exposed to
everyone, that would allow an easy way to pass in hardware specific
configuration without bastardizing the slave_config.  I noticed that rapidio is
the only consumer of that field and that they have their own prep function.

If we are not going to allow people to do their own slave_config when they need
to, then we need to remove the comments from the include file and expose the
context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic.

-- 
sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-30 Thread Srikanth Thokala

Hi Vinod,

On Mon, Jan 27, 2014 at 4:36 PM, Srikanth Thokala  wrote:
> Hi Vinod,
>
> On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul  wrote:
>> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>> > Hi Lars,
>>> >
>>> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  
>>> > wrote:
>>> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>> >> [...]
>>> >>> +/**
>>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>> >>> + * @dchan: DMA Channel pointer
>>> >>> + * @cmd: DMA control command
>>> >>> + * @arg: Channel configuration
>>> >>> + *
>>> >>> + * Return: '0' on success and failure value on error
>>> >>> + */
>>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>> >>> +   enum dma_ctrl_cmd cmd, unsigned 
>>> >>> long arg)
>>> >>> +{
>>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>> >>> +
>>> >>> + switch (cmd) {
>>> >>> + case DMA_TERMINATE_ALL:
>>> >>> + xilinx_vdma_terminate_all(chan);
>>> >>> + return 0;
>>> >>> + case DMA_SLAVE_CONFIG:
>>> >>> + return xilinx_vdma_slave_config(chan,
>>> >>> + (struct xilinx_vdma_config *)arg);
>>> >>
>>> >> You really shouldn't be overloading the generic API with your own 
>>> >> semantics.
>>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>> >
>>> > Ok.  The driver needs few additional configuration from the slave
>>> > device like Vertical
>>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
>>> > do you
>>> > suggest me to define a separate dma_ctrl_cmd like the one 
>>> > FSLDMA_EXTERNAL_START
>>> > defined for Freescale drivers?
>>>
>>> In my opinion it is not a good idea to have driver implement a generic API,
>>> but at the same time let the driver have custom semantics for those API
>>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>>> the purpose of a generic API, namely that you are able to write generic code
>>> that makes use of the API without having to know about which implementation
>>> API it is talking to. The dmaengine framework provides the
>>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>>
>> The question here i think would be waht this device supports? Is the hardware
>> capable of doing interleaved transfers, then would make sense.
>>
>> While we do try to get users use dma_slave_config, but there will always be
>> someone who have specfic params. If we can generalize then we might want to 
>> add
>> to the dma_slave_config as well
>
> There are many configuration parameters which are specific to IP and I
> would like to
> give an overview of some of parameteres here:
>
> 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
> referenced by
> 'cfg->park_frm', so user will have control on each frame in this mode.
>
> 2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
> threshold. This value
>determines the number of frame buffers to process. To use this feature,
>'cfg->frm_cnt_en' should be set.
>
> 3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
> external/internal frame
> synchronization source. Used to synchronize one channel (MM2S/S2MM) with
> another (S2MM/MM2S) channel.
>
> 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate 
> between
> master and slave.  In master mode (cfg->master), frames are not dropped 
> and
> slave can drop frames to adjust to master frame rate.
>
> And in future, this Engine being a soft IP, we could expect some more 
> additional
> parameters.  Isn't a good idea to have a private member in dma_slave_config 
> for
> sharing additional configuration between slave device and dma engine? Or a new
> dma_ctrl_cmd like FSLDMA_EXTERNAL_START?


Ping?

>
> Srikanth
>
>>
>> --
>> ~Vinod
>> --/EX
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-30 Thread Srikanth Thokala

Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul  wrote:
> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
>> Hi Lars/Vinod,
>> >> The question here i think would be waht this device supports? Is the 
>> >> hardware
>> >> capable of doing interleaved transfers, then would make sense.
>> >
>> > The hardware does 2D transfers. The parameters for a transfer are height,
>> > width and stride. That's only a subset of what interleaved transfers can be
>> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
>> > there has been some discussion on this in the past and the result of that
>> > discussion was that using interleaved transfers for 2D transfers is
>> > preferred over adding a custom API for 2D transfers.
>>
>> I went through the prep_interleaved_dma API and I see only one descriptor
>> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
>> buffers (can be more in future), isn't it less efficient compared to the
>> prep_slave_sg where we get a single sg list and can prepare all the 
>> descriptors
>> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
>> know your opinions.
> Well the descriptor maybe one, but that can represent multiple frames, for
> example 16 as in your case. Can you read up the documentation of how multiple
> frames are passed. Pls see include/linux/dmaengine.h
>
> /**
>  * Interleaved Transfer Request
>  * 
>  * A chunk is collection of contiguous bytes to be transfered.
>  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
>  * ICGs may or maynot change between chunks.
>  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
>  *  that when repeated an integral number of times, specifies the transfer.
>  * A transfer template is specification of a Frame, the number of times
>  *  it is to be repeated and other per-transfer attributes.
>  *
>  * Practically, a client driver would have ready a template for each
>  *  type of transfer it is going to need during its lifetime and
>  *  set only 'src_start' and 'dst_start' before submitting the requests.
>  *
>  *
>  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
>  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
>  *
>  *==  Chunk size
>  *... ICG
>  */

Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.

Srikanth

>
> --
> ~Vinod
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-30 Thread Srikanth Thokala

Hi Vinod,

On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote:
 On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
 Hi Lars/Vinod,
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
  The hardware does 2D transfers. The parameters for a transfer are height,
  width and stride. That's only a subset of what interleaved transfers can be
  (xt-num_frames must be one for 2d transfers). But if I remember correctly
  there has been some discussion on this in the past and the result of that
  discussion was that using interleaved transfers for 2D transfers is
  preferred over adding a custom API for 2D transfers.

 I went through the prep_interleaved_dma API and I see only one descriptor
 is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
 buffers (can be more in future), isn't it less efficient compared to the
 prep_slave_sg where we get a single sg list and can prepare all the 
 descriptors
 (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
 know your opinions.
 Well the descriptor maybe one, but that can represent multiple frames, for
 example 16 as in your case. Can you read up the documentation of how multiple
 frames are passed. Pls see include/linux/dmaengine.h

 /**
  * Interleaved Transfer Request
  * 
  * A chunk is collection of contiguous bytes to be transfered.
  * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
  * ICGs may or maynot change between chunks.
  * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
  *  that when repeated an integral number of times, specifies the transfer.
  * A transfer template is specification of a Frame, the number of times
  *  it is to be repeated and other per-transfer attributes.
  *
  * Practically, a client driver would have ready a template for each
  *  type of transfer it is going to need during its lifetime and
  *  set only 'src_start' and 'dst_start' before submitting the requests.
  *
  *
  *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
  *  |==.===...=...|==.===...=...| ~ |==.===...=...|
  *
  *==  Chunk size
  *... ICG
  */

Yes, it can handle multiple frames specified by 'numf' each of size
'frame_size * sgl[0].size'.
But, I see it only works if all the frames' memory is contiguous and
in this case we
can just increment 'src_start' by the total frame size 'numf' number
of times to fill in
for each HW descriptor (each frame is one HW descriptor).  So, there
is no issue when the
memory is contiguous.  If the frames are non contiguous, we have to
call this API for each
frame (hence for each descriptor), as the src_start for each frame is
different.  Is it correct?

FYI: This hardware has an inbuilt Scatter-Gather engine.

Srikanth


 --
 ~Vinod
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-30 Thread Srikanth Thokala

Hi Vinod,

On Mon, Jan 27, 2014 at 4:36 PM, Srikanth Thokala stho...@xilinx.com wrote:
 Hi Vinod,

 On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul vinod.k...@intel.com wrote:
 On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
  Hi Lars,
 
  On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de 
  wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  [...]
  +/**
  + * xilinx_vdma_device_control - Configure DMA channel of the device
  + * @dchan: DMA Channel pointer
  + * @cmd: DMA control command
  + * @arg: Channel configuration
  + *
  + * Return: '0' on success and failure value on error
  + */
  +static int xilinx_vdma_device_control(struct dma_chan *dchan,
  +   enum dma_ctrl_cmd cmd, unsigned 
  long arg)
  +{
  + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
  +
  + switch (cmd) {
  + case DMA_TERMINATE_ALL:
  + xilinx_vdma_terminate_all(chan);
  + return 0;
  + case DMA_SLAVE_CONFIG:
  + return xilinx_vdma_slave_config(chan,
  + (struct xilinx_vdma_config *)arg);
 
  You really shouldn't be overloading the generic API with your own 
  semantics.
  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
 
  Ok.  The driver needs few additional configuration from the slave
  device like Vertical
  Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
  do you
  suggest me to define a separate dma_ctrl_cmd like the one 
  FSLDMA_EXTERNAL_START
  defined for Freescale drivers?

 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

 The question here i think would be waht this device supports? Is the hardware
 capable of doing interleaved transfers, then would make sense.

 While we do try to get users use dma_slave_config, but there will always be
 someone who have specfic params. If we can generalize then we might want to 
 add
 to the dma_slave_config as well

 There are many configuration parameters which are specific to IP and I
 would like to
 give an overview of some of parameteres here:

 1) Park Mode ('cfg-park'): In Park mode, engine will park on frame
 referenced by
 'cfg-park_frm', so user will have control on each frame in this mode.

 2) Interrupt Coalesce ('cfg-coalesce'):  Used for setting interrupt
 threshold. This value
determines the number of frame buffers to process. To use this feature,
'cfg-frm_cnt_en' should be set.

 3) Frame Synchronization Source ('cfg-ext_fsync'):  Can be an
 external/internal frame
 synchronization source. Used to synchronize one channel (MM2S/S2MM) with
 another (S2MM/MM2S) channel.

 4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate 
 between
 master and slave.  In master mode (cfg-master), frames are not dropped 
 and
 slave can drop frames to adjust to master frame rate.

 And in future, this Engine being a soft IP, we could expect some more 
 additional
 parameters.  Isn't a good idea to have a private member in dma_slave_config 
 for
 sharing additional configuration between slave device and dma engine? Or a new
 dma_ctrl_cmd like FSLDMA_EXTERNAL_START?


Ping?


 Srikanth


 --
 ~Vinod
 --/EX
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Vinod Koul

On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
> Hi Lars/Vinod,
> >> The question here i think would be waht this device supports? Is the 
> >> hardware
> >> capable of doing interleaved transfers, then would make sense.
> >
> > The hardware does 2D transfers. The parameters for a transfer are height,
> > width and stride. That's only a subset of what interleaved transfers can be
> > (xt->num_frames must be one for 2d transfers). But if I remember correctly
> > there has been some discussion on this in the past and the result of that
> > discussion was that using interleaved transfers for 2D transfers is
> > preferred over adding a custom API for 2D transfers.
> 
> I went through the prep_interleaved_dma API and I see only one descriptor
> is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
> buffers (can be more in future), isn't it less efficient compared to the
> prep_slave_sg where we get a single sg list and can prepare all the 
> descriptors
> (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
> know your opinions.
Well the descriptor maybe one, but that can represent multiple frames, for
example 16 as in your case. Can you read up the documentation of how multiple
frames are passed. Pls see include/linux/dmaengine.h 

/**
 * Interleaved Transfer Request
 * 
 * A chunk is collection of contiguous bytes to be transfered.
 * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
 * ICGs may or maynot change between chunks.
 * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
 *  that when repeated an integral number of times, specifies the transfer.
 * A transfer template is specification of a Frame, the number of times
 *  it is to be repeated and other per-transfer attributes.
 *
 * Practically, a client driver would have ready a template for each
 *  type of transfer it is going to need during its lifetime and
 *  set only 'src_start' and 'dst_start' before submitting the requests.
 *
 *
 *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
 *  |==.===...=...|==.===...=...| ~ |==.===...=...|
 *
 *==  Chunk size
 *... ICG
 */

-- 
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Vinod Koul

On Sun, Jan 26, 2014 at 06:39:21PM +0100, Lars-Peter Clausen wrote:
> On 01/26/2014 02:59 PM, Vinod Koul wrote:
> > On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> >> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> >>> Hi Lars,
> >>>
> >>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  
> >>> wrote:
>  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>  [...]
> > +/**
> > + * xilinx_vdma_device_control - Configure DMA channel of the device
> > + * @dchan: DMA Channel pointer
> > + * @cmd: DMA control command
> > + * @arg: Channel configuration
> > + *
> > + * Return: '0' on success and failure value on error
> > + */
> > +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> > +   enum dma_ctrl_cmd cmd, unsigned 
> > long arg)
> > +{
> > + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> > +
> > + switch (cmd) {
> > + case DMA_TERMINATE_ALL:
> > + xilinx_vdma_terminate_all(chan);
> > + return 0;
> > + case DMA_SLAVE_CONFIG:
> > + return xilinx_vdma_slave_config(chan,
> > + (struct xilinx_vdma_config *)arg);
> 
>  You really shouldn't be overloading the generic API with your own 
>  semantics.
>  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> >>>
> >>> Ok.  The driver needs few additional configuration from the slave
> >>> device like Vertical
> >>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
> >>> do you
> >>> suggest me to define a separate dma_ctrl_cmd like the one 
> >>> FSLDMA_EXTERNAL_START
> >>> defined for Freescale drivers?
> >>
> >> In my opinion it is not a good idea to have driver implement a generic API,
> >> but at the same time let the driver have custom semantics for those API
> >> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> >> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> >> the purpose of a generic API, namely that you are able to write generic 
> >> code
> >> that makes use of the API without having to know about which implementation
> >> API it is talking to. The dmaengine framework provides the
> >> dmaengine_prep_interleaved_dma() function to setup two dimensional
> >> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> > 
> > The question here i think would be waht this device supports? Is the 
> > hardware
> > capable of doing interleaved transfers, then would make sense.
> 
> The hardware does 2D transfers. The parameters for a transfer are height,
> width and stride. That's only a subset of what interleaved transfers can be
> (xt->num_frames must be one for 2d transfers). But if I remember correctly
> there has been some discussion on this in the past and the result of that
> discussion was that using interleaved transfers for 2D transfers is
> preferred over adding a custom API for 2D transfers.
Yup that would be my recomendation. Moving this driver to interleaved API seems
right to me

-- 
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Srikanth Thokala

Hi Lars/Vinod,

On Sun, Jan 26, 2014 at 11:09 PM, Lars-Peter Clausen  wrote:
> On 01/26/2014 02:59 PM, Vinod Koul wrote:
>> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
 Hi Lars,

 On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  
 wrote:
> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> [...]
>> +/**
>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> + * @dchan: DMA Channel pointer
>> + * @cmd: DMA control command
>> + * @arg: Channel configuration
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> +   enum dma_ctrl_cmd cmd, unsigned long 
>> arg)
>> +{
>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> + switch (cmd) {
>> + case DMA_TERMINATE_ALL:
>> + xilinx_vdma_terminate_all(chan);
>> + return 0;
>> + case DMA_SLAVE_CONFIG:
>> + return xilinx_vdma_slave_config(chan,
>> + (struct xilinx_vdma_config *)arg);
>
> You really shouldn't be overloading the generic API with your own 
> semantics.
> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

 Ok.  The driver needs few additional configuration from the slave
 device like Vertical
 Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
 do you
 suggest me to define a separate dma_ctrl_cmd like the one 
 FSLDMA_EXTERNAL_START
 defined for Freescale drivers?
>>>
>>> In my opinion it is not a good idea to have driver implement a generic API,
>>> but at the same time let the driver have custom semantics for those API
>>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>>> the purpose of a generic API, namely that you are able to write generic code
>>> that makes use of the API without having to know about which implementation
>>> API it is talking to. The dmaengine framework provides the
>>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>>
>> The question here i think would be waht this device supports? Is the hardware
>> capable of doing interleaved transfers, then would make sense.
>
> The hardware does 2D transfers. The parameters for a transfer are height,
> width and stride. That's only a subset of what interleaved transfers can be
> (xt->num_frames must be one for 2d transfers). But if I remember correctly
> there has been some discussion on this in the past and the result of that
> discussion was that using interleaved transfers for 2D transfers is
> preferred over adding a custom API for 2D transfers.

I went through the prep_interleaved_dma API and I see only one descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
buffers (can be more in future), isn't it less efficient compared to the
prep_slave_sg where we get a single sg list and can prepare all the descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
know your opinions.

Srikanth

>
> - Lars
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Srikanth Thokala

Hi Vinod,

On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul  wrote:
> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>> > Hi Lars,
>> >
>> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  
>> > wrote:
>> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>> >> [...]
>> >>> +/**
>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> >>> + * @dchan: DMA Channel pointer
>> >>> + * @cmd: DMA control command
>> >>> + * @arg: Channel configuration
>> >>> + *
>> >>> + * Return: '0' on success and failure value on error
>> >>> + */
>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> >>> +   enum dma_ctrl_cmd cmd, unsigned long 
>> >>> arg)
>> >>> +{
>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> >>> +
>> >>> + switch (cmd) {
>> >>> + case DMA_TERMINATE_ALL:
>> >>> + xilinx_vdma_terminate_all(chan);
>> >>> + return 0;
>> >>> + case DMA_SLAVE_CONFIG:
>> >>> + return xilinx_vdma_slave_config(chan,
>> >>> + (struct xilinx_vdma_config *)arg);
>> >>
>> >> You really shouldn't be overloading the generic API with your own 
>> >> semantics.
>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>> >
>> > Ok.  The driver needs few additional configuration from the slave
>> > device like Vertical
>> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
>> > do you
>> > suggest me to define a separate dma_ctrl_cmd like the one 
>> > FSLDMA_EXTERNAL_START
>> > defined for Freescale drivers?
>>
>> In my opinion it is not a good idea to have driver implement a generic API,
>> but at the same time let the driver have custom semantics for those API
>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> the purpose of a generic API, namely that you are able to write generic code
>> that makes use of the API without having to know about which implementation
>> API it is talking to. The dmaengine framework provides the
>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
>
> The question here i think would be waht this device supports? Is the hardware
> capable of doing interleaved transfers, then would make sense.
>
> While we do try to get users use dma_slave_config, but there will always be
> someone who have specfic params. If we can generalize then we might want to 
> add
> to the dma_slave_config as well

There are many configuration parameters which are specific to IP and I
would like to
give an overview of some of parameteres here:

1) Park Mode ('cfg->park'): In Park mode, engine will park on frame
referenced by
'cfg->park_frm', so user will have control on each frame in this mode.

2) Interrupt Coalesce ('cfg->coalesce'):  Used for setting interrupt
threshold. This value
   determines the number of frame buffers to process. To use this feature,
   'cfg->frm_cnt_en' should be set.

3) Frame Synchronization Source ('cfg->ext_fsync'):  Can be an
external/internal frame
synchronization source. Used to synchronize one channel (MM2S/S2MM) with
another (S2MM/MM2S) channel.

4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between
master and slave.  In master mode (cfg->master), frames are not dropped and
slave can drop frames to adjust to master frame rate.

And in future, this Engine being a soft IP, we could expect some more additional
parameters.  Isn't a good idea to have a private member in dma_slave_config for
sharing additional configuration between slave device and dma engine? Or a new
dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

Srikanth

>
> --
> ~Vinod
> --/EX
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Srikanth Thokala

Hi Vinod,

On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul vinod.k...@intel.com wrote:
 On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
  Hi Lars,
 
  On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de 
  wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  [...]
  +/**
  + * xilinx_vdma_device_control - Configure DMA channel of the device
  + * @dchan: DMA Channel pointer
  + * @cmd: DMA control command
  + * @arg: Channel configuration
  + *
  + * Return: '0' on success and failure value on error
  + */
  +static int xilinx_vdma_device_control(struct dma_chan *dchan,
  +   enum dma_ctrl_cmd cmd, unsigned long 
  arg)
  +{
  + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
  +
  + switch (cmd) {
  + case DMA_TERMINATE_ALL:
  + xilinx_vdma_terminate_all(chan);
  + return 0;
  + case DMA_SLAVE_CONFIG:
  + return xilinx_vdma_slave_config(chan,
  + (struct xilinx_vdma_config *)arg);
 
  You really shouldn't be overloading the generic API with your own 
  semantics.
  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
 
  Ok.  The driver needs few additional configuration from the slave
  device like Vertical
  Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
  do you
  suggest me to define a separate dma_ctrl_cmd like the one 
  FSLDMA_EXTERNAL_START
  defined for Freescale drivers?

 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

 The question here i think would be waht this device supports? Is the hardware
 capable of doing interleaved transfers, then would make sense.

 While we do try to get users use dma_slave_config, but there will always be
 someone who have specfic params. If we can generalize then we might want to 
 add
 to the dma_slave_config as well

There are many configuration parameters which are specific to IP and I
would like to
give an overview of some of parameteres here:

1) Park Mode ('cfg-park'): In Park mode, engine will park on frame
referenced by
'cfg-park_frm', so user will have control on each frame in this mode.

2) Interrupt Coalesce ('cfg-coalesce'):  Used for setting interrupt
threshold. This value
   determines the number of frame buffers to process. To use this feature,
   'cfg-frm_cnt_en' should be set.

3) Frame Synchronization Source ('cfg-ext_fsync'):  Can be an
external/internal frame
synchronization source. Used to synchronize one channel (MM2S/S2MM) with
another (S2MM/MM2S) channel.

4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate between
master and slave.  In master mode (cfg-master), frames are not dropped and
slave can drop frames to adjust to master frame rate.

And in future, this Engine being a soft IP, we could expect some more additional
parameters.  Isn't a good idea to have a private member in dma_slave_config for
sharing additional configuration between slave device and dma engine? Or a new
dma_ctrl_cmd like FSLDMA_EXTERNAL_START?

Srikanth


 --
 ~Vinod
 --/EX
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Srikanth Thokala

Hi Lars/Vinod,

On Sun, Jan 26, 2014 at 11:09 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 01/26/2014 02:59 PM, Vinod Koul wrote:
 On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
 Hi Lars,

 On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de 
 wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 [...]
 +/**
 + * xilinx_vdma_device_control - Configure DMA channel of the device
 + * @dchan: DMA Channel pointer
 + * @cmd: DMA control command
 + * @arg: Channel configuration
 + *
 + * Return: '0' on success and failure value on error
 + */
 +static int xilinx_vdma_device_control(struct dma_chan *dchan,
 +   enum dma_ctrl_cmd cmd, unsigned long 
 arg)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + switch (cmd) {
 + case DMA_TERMINATE_ALL:
 + xilinx_vdma_terminate_all(chan);
 + return 0;
 + case DMA_SLAVE_CONFIG:
 + return xilinx_vdma_slave_config(chan,
 + (struct xilinx_vdma_config *)arg);

 You really shouldn't be overloading the generic API with your own 
 semantics.
 DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

 Ok.  The driver needs few additional configuration from the slave
 device like Vertical
 Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
 do you
 suggest me to define a separate dma_ctrl_cmd like the one 
 FSLDMA_EXTERNAL_START
 defined for Freescale drivers?

 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

 The question here i think would be waht this device supports? Is the hardware
 capable of doing interleaved transfers, then would make sense.

 The hardware does 2D transfers. The parameters for a transfer are height,
 width and stride. That's only a subset of what interleaved transfers can be
 (xt-num_frames must be one for 2d transfers). But if I remember correctly
 there has been some discussion on this in the past and the result of that
 discussion was that using interleaved transfers for 2D transfers is
 preferred over adding a custom API for 2D transfers.

I went through the prep_interleaved_dma API and I see only one descriptor
is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
buffers (can be more in future), isn't it less efficient compared to the
prep_slave_sg where we get a single sg list and can prepare all the descriptors
(of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
know your opinions.

Srikanth


 - Lars

 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Vinod Koul

On Sun, Jan 26, 2014 at 06:39:21PM +0100, Lars-Peter Clausen wrote:
 On 01/26/2014 02:59 PM, Vinod Koul wrote:
  On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
  On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
  Hi Lars,
 
  On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de 
  wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  [...]
  +/**
  + * xilinx_vdma_device_control - Configure DMA channel of the device
  + * @dchan: DMA Channel pointer
  + * @cmd: DMA control command
  + * @arg: Channel configuration
  + *
  + * Return: '0' on success and failure value on error
  + */
  +static int xilinx_vdma_device_control(struct dma_chan *dchan,
  +   enum dma_ctrl_cmd cmd, unsigned 
  long arg)
  +{
  + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
  +
  + switch (cmd) {
  + case DMA_TERMINATE_ALL:
  + xilinx_vdma_terminate_all(chan);
  + return 0;
  + case DMA_SLAVE_CONFIG:
  + return xilinx_vdma_slave_config(chan,
  + (struct xilinx_vdma_config *)arg);
 
  You really shouldn't be overloading the generic API with your own 
  semantics.
  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
 
  Ok.  The driver needs few additional configuration from the slave
  device like Vertical
  Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case 
  do you
  suggest me to define a separate dma_ctrl_cmd like the one 
  FSLDMA_EXTERNAL_START
  defined for Freescale drivers?
 
  In my opinion it is not a good idea to have driver implement a generic API,
  but at the same time let the driver have custom semantics for those API
  calls. It's a bit like having a gpio driver that expects 23 and 42 as the
  values passed to gpio_set_value instead of 0 and 1. It completely defeats
  the purpose of a generic API, namely that you are able to write generic 
  code
  that makes use of the API without having to know about which implementation
  API it is talking to. The dmaengine framework provides the
  dmaengine_prep_interleaved_dma() function to setup two dimensional
  transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
  
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
 The hardware does 2D transfers. The parameters for a transfer are height,
 width and stride. That's only a subset of what interleaved transfers can be
 (xt-num_frames must be one for 2d transfers). But if I remember correctly
 there has been some discussion on this in the past and the result of that
 discussion was that using interleaved transfers for 2D transfers is
 preferred over adding a custom API for 2D transfers.
Yup that would be my recomendation. Moving this driver to interleaved API seems
right to me

-- 
~Vinod
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-27 Thread Vinod Koul

On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote:
 Hi Lars/Vinod,
  The question here i think would be waht this device supports? Is the 
  hardware
  capable of doing interleaved transfers, then would make sense.
 
  The hardware does 2D transfers. The parameters for a transfer are height,
  width and stride. That's only a subset of what interleaved transfers can be
  (xt-num_frames must be one for 2d transfers). But if I remember correctly
  there has been some discussion on this in the past and the result of that
  discussion was that using interleaved transfers for 2D transfers is
  preferred over adding a custom API for 2D transfers.
 
 I went through the prep_interleaved_dma API and I see only one descriptor
 is prepared per API call (i.e. per frame).  As our IP supports upto 16 frame
 buffers (can be more in future), isn't it less efficient compared to the
 prep_slave_sg where we get a single sg list and can prepare all the 
 descriptors
 (of non-contiguous buffers) in one go?  Correct me, if am wrong and let me
 know your opinions.
Well the descriptor maybe one, but that can represent multiple frames, for
example 16 as in your case. Can you read up the documentation of how multiple
frames are passed. Pls see include/linux/dmaengine.h 

/**
 * Interleaved Transfer Request
 * 
 * A chunk is collection of contiguous bytes to be transfered.
 * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG).
 * ICGs may or maynot change between chunks.
 * A FRAME is the smallest series of contiguous {chunk,icg} pairs,
 *  that when repeated an integral number of times, specifies the transfer.
 * A transfer template is specification of a Frame, the number of times
 *  it is to be repeated and other per-transfer attributes.
 *
 * Practically, a client driver would have ready a template for each
 *  type of transfer it is going to need during its lifetime and
 *  set only 'src_start' and 'dst_start' before submitting the requests.
 *
 *
 *  |  Frame-1|   Frame-2   | ~ |   Frame-'numf'  |
 *  |==.===...=...|==.===...=...| ~ |==.===...=...|
 *
 *==  Chunk size
 *... ICG
 */

-- 
~Vinod
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 03:24 PM, Vinod Koul wrote:
> On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
>> This is the driver for the AXI Video Direct Memory Access (AXI
>> VDMA) core, which is a soft Xilinx IP core that provides high-
>> bandwidth direct memory access between memory and AXI4-Stream
>> type video target peripherals. The core provides efficient two
>> dimensional DMA operations with independent asynchronous read
> ok here is tha catch, do you want to support interleaved API rather?
> 
>> +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device
>> phandle] [Channel ID]> pairs, +  where Channel ID is '0' for write/tx and
>> '1' for read/rx +channel.  +- dma-names: a list of DMA channel names, one
>> per "dmas" entry + +Example: + + +vdmatest_0: vdmatest@0 { +
>> compatible ="xlnx,axi-vdma-test-1.00.a"; +   dmas = <_vdma_0 0 +
>> _vdma_0 1>; +dma-names = "vdma0", "vdma1"; +} ;
> Need ack from DT folks. ALso would be better to split the binding to a 
> separate
> patch
> 
> 
>> +/**
>> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
>> + * @xdev: Driver specific device structure
>> + * @ctrl_offset: Control registers offset
>> + * @desc_offset: TX descriptor registers offset
>> + * @completed_cookie: Maximum cookie completed
>> + * @cookie: The current cookie
>> + * @lock: Descriptor operation lock
>> + * @pending_list: Descriptors waiting
>> + * @active_desc: Active descriptor
>> + * @done_list: Complete descriptors
>> + * @common: DMA common channel
>> + * @desc_pool: Descriptors pool
>> + * @dev: The dma device
>> + * @irq: Channel IRQ
>> + * @id: Channel ID
>> + * @direction: Transfer direction
>> + * @num_frms: Number of frames
>> + * @has_sg: Support scatter transfers
>> + * @genlock: Support genlock mode
>> + * @err: Channel has errors
>> + * @tasklet: Cleanup work after irq
>> + * @config: Device configuration info
>> + * @flush_on_fsync: Flush on Frame sync
>> + */
>> +struct xilinx_vdma_chan {
>> +struct xilinx_vdma_device *xdev;
>> +u32 ctrl_offset;
>> +u32 desc_offset;
>> +dma_cookie_t completed_cookie;
>> +dma_cookie_t cookie;
>> +spinlock_t lock;
>> +struct list_head pending_list;
>> +struct xilinx_vdma_tx_descriptor *active_desc;
>> +struct list_head done_list;
>> +struct dma_chan common;
>> +struct dma_pool *desc_pool;
>> +struct device *dev;
>> +int irq;
>> +int id;
>> +enum dma_transfer_direction direction;
> why should channel have a direction... descriptor should have direction and 
> not
> the channel!

The channel only supports transfers in one direction. Either from memory to
peripheral or from peripheral to memory, that's fixed and can't be changed
at runtime. The driver needs to know which direction the channel supports so
it can reject transfers with the wrong direction.

[...]
>> +
> 
>> + * xilinx_vdma_tx_status - Get VDMA transaction status
>> + * @dchan: DMA channel
>> + * @cookie: Transaction identifier
>> + * @txstate: Transaction state
>> + *
>> + * Return: DMA transaction status
>> + */
>> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
>> +dma_cookie_t cookie,
>> +struct dma_tx_state *txstate)
>> +{
>> +struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +dma_cookie_t last_used;
>> +dma_cookie_t last_complete;
>> +
>> +xilinx_vdma_chan_desc_cleanup(chan);
>> +
>> +last_used = dchan->cookie;
>> +last_complete = chan->completed_cookie;
>> +
>> +dma_set_tx_state(txstate, last_complete, last_used, 0);
>> +
>> +return dma_async_is_complete(cookie, last_complete, last_used);
> no residue calculation?
> 

The hardware doesn't support that.

>> +/**
>> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE 
>> transaction
>> + * @dchan: DMA channel
>> + * @sgl: scatterlist to transfer to/from
>> + * @sg_len: number of entries in @sgl
>> + * @dir: DMA direction
>> + * @flags: transfer ack flags
>> + * @context: unused
>> + *
>> + * Return: Async transaction descriptor on success and NULL on failure
>> + */
>> +static struct dma_async_tx_descriptor *
>> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
>> +  unsigned int sg_len, enum dma_transfer_direction dir,
>> +  unsigned long flags, void *context)
> okay now am worried, this is supposed to memcpy DMA so why slave-sg??

The DMA is either from memory to peripheral or from peripheral to memory
depending on the direction. So slave sg should be fine.

> 
> Looking at the driver overall, IMHO we need to do:
> - use the virt-dma to simplfy the cookie handling and perhpasn the descrptors
>   too!
> - Perhpas use interleaved API..?
> - I dont think we should use the slave API as this seems memcpy case!
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 03:03 PM, Vinod Koul wrote:
> On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
>> On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
>>> On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>
> [...]
>
>>> +   /* Request the interrupt */
>>> +   chan->irq = irq_of_parse_and_map(node, 0);
>>> +   err = devm_request_irq(xdev->dev, chan->irq, 
>>> xilinx_vdma_irq_handler,
>>> +  IRQF_SHARED, "xilinx-vdma-controller", 
>>> chan);
>>
>> This is a clasic example of where to not use devm_request_irq. 'chan' is
>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>> will be freed before the interrupt handler has been released, which means
>> there is now a race condition where the interrupt handler can access 
>> already
>> freed memory.ta
>
> Could you elaborate this case? As far as I understood managed resources
> are a kind of stack pile. In this case you have no such condition. Where
> am I wrong?

 The stacked stuff is only ran after the remove() function. Which means that
 you call dma_async_device_unregister() before the interrupt handler is
 freed. Another issue with the interrupt handler is a bit hidden. The driver
 does not call tasklet_kill in the remove function. Which it should though 
 to
 make sure that the tasklet does not race against the freeing of the memory.
 And in order to make sure that the tasklet is not rescheduled you need to
 free the irq before killing the tasklet, since the interrupt handler
 schedules the tasklet.
>>>
>>> So, you mean devm_request_irq() will race in any DMA driver?
>>
>> Most likely yes. devm_request_irq() is race condition prone for the majority
>> of device driver. You have to be really careful if you want to use it.
>>
>>>
>>> I think the proper solution is to disable all device work in
>>> the .remove() and devm will care about resources.
>>
>> As long as the interrupt handler is registered it can be called, the only
>> proper solution is to make sure that the order in which resources are torn
>> down is correct.
> Wouldn't it work if we register the irq last in the probe. That wil ensure on
> success the channel is always valid.

Yes, but only if the irq is not device managed. All device managed resources
will be freed after the remove function has been called. Which is to late in
our case since we make sure that the tasklet is not running in the remove
function.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 02:59 PM, Vinod Koul wrote:
> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
>>> Hi Lars,
>>>
>>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 [...]
> +/**
> + * xilinx_vdma_device_control - Configure DMA channel of the device
> + * @dchan: DMA Channel pointer
> + * @cmd: DMA control command
> + * @arg: Channel configuration
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> +   enum dma_ctrl_cmd cmd, unsigned long 
> arg)
> +{
> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> + switch (cmd) {
> + case DMA_TERMINATE_ALL:
> + xilinx_vdma_terminate_all(chan);
> + return 0;
> + case DMA_SLAVE_CONFIG:
> + return xilinx_vdma_slave_config(chan,
> + (struct xilinx_vdma_config *)arg);

 You really shouldn't be overloading the generic API with your own 
 semantics.
 DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
>>>
>>> Ok.  The driver needs few additional configuration from the slave
>>> device like Vertical
>>> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
>>> you
>>> suggest me to define a separate dma_ctrl_cmd like the one 
>>> FSLDMA_EXTERNAL_START
>>> defined for Freescale drivers?
>>
>> In my opinion it is not a good idea to have driver implement a generic API,
>> but at the same time let the driver have custom semantics for those API
>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
>> values passed to gpio_set_value instead of 0 and 1. It completely defeats
>> the purpose of a generic API, namely that you are able to write generic code
>> that makes use of the API without having to know about which implementation
>> API it is talking to. The dmaengine framework provides the
>> dmaengine_prep_interleaved_dma() function to setup two dimensional
>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
> 
> The question here i think would be waht this device supports? Is the hardware
> capable of doing interleaved transfers, then would make sense.

The hardware does 2D transfers. The parameters for a transfer are height,
width and stride. That's only a subset of what interleaved transfers can be
(xt->num_frames must be one for 2d transfers). But if I remember correctly
there has been some discussion on this in the past and the result of that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
ok here is tha catch, do you want to support interleaved API rather?

> +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device
> phandle] [Channel ID]> pairs, +   where Channel ID is '0' for write/tx and
> '1' for read/rx + channel.  +- dma-names: a list of DMA channel names, one
> per "dmas" entry + +Example: + + +vdmatest_0: vdmatest@0 { +
> compatible ="xlnx,axi-vdma-test-1.00.a"; +dmas = <_vdma_0 0 +
> _vdma_0 1>; + dma-names = "vdma0", "vdma1"; +} ;
Need ack from DT folks. ALso would be better to split the binding to a separate
patch


> +/**
> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
> + * @xdev: Driver specific device structure
> + * @ctrl_offset: Control registers offset
> + * @desc_offset: TX descriptor registers offset
> + * @completed_cookie: Maximum cookie completed
> + * @cookie: The current cookie
> + * @lock: Descriptor operation lock
> + * @pending_list: Descriptors waiting
> + * @active_desc: Active descriptor
> + * @done_list: Complete descriptors
> + * @common: DMA common channel
> + * @desc_pool: Descriptors pool
> + * @dev: The dma device
> + * @irq: Channel IRQ
> + * @id: Channel ID
> + * @direction: Transfer direction
> + * @num_frms: Number of frames
> + * @has_sg: Support scatter transfers
> + * @genlock: Support genlock mode
> + * @err: Channel has errors
> + * @tasklet: Cleanup work after irq
> + * @config: Device configuration info
> + * @flush_on_fsync: Flush on Frame sync
> + */
> +struct xilinx_vdma_chan {
> + struct xilinx_vdma_device *xdev;
> + u32 ctrl_offset;
> + u32 desc_offset;
> + dma_cookie_t completed_cookie;
> + dma_cookie_t cookie;
> + spinlock_t lock;
> + struct list_head pending_list;
> + struct xilinx_vdma_tx_descriptor *active_desc;
> + struct list_head done_list;
> + struct dma_chan common;
> + struct dma_pool *desc_pool;
> + struct device *dev;
> + int irq;
> + int id;
> + enum dma_transfer_direction direction;
why should channel have a direction... descriptor should have direction and not
the channel!

> +/**
> + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
> + * @chan: Driver specific VDMA channel
> + * @desc: VDMA transaction descriptor
> + */
> +static void
> +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
> +struct xilinx_vdma_tx_descriptor *desc)
> +{
> + struct xilinx_vdma_tx_segment *segment, *next;
> +
> + if (!desc)
> + return;
> +
> + list_for_each_entry_safe(segment, next, >segments, node) {
do you want to use _safe. Isee that this is called for cleanup while lock held,
and in other case within another _safe iterator!

> + list_del(>node);
> + xilinx_vdma_free_tx_segment(chan, segment);
> + }
> +
> + kfree(desc);
> +}
> +
> +/* Required functions */
> +

> + * xilinx_vdma_do_tasklet - Schedule completion tasklet
> + * @data: Pointer to the Xilinx VDMA channel structure
> + */
> +static void xilinx_vdma_do_tasklet(unsigned long data)
> +{
> + struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
> +
> + xilinx_vdma_chan_desc_cleanup(chan);
> +}
> +
> +/**
> + * xilinx_vdma_alloc_chan_resources - Allocate channel resources
> + * @dchan: DMA channel
> + *
> + * Return: '1' on success and failure value on error
naaah, we dont do that, pls use standard notation of 0 on success
Also API wants you to return descriptors allocated here!

> + */
> +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
> +{
> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> + /* Has this channel already been allocated? */
> + if (chan->desc_pool)
> + return 1;
> +
> + /*
> +  * We need the descriptor to be aligned to 64bytes
> +  * for meeting Xilinx VDMA specification requirement.
> +  */
> + chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
> + chan->dev,
> + sizeof(struct xilinx_vdma_tx_segment),
> + __alignof__(struct xilinx_vdma_tx_segment), 0);
> + if (!chan->desc_pool) {
> + dev_err(chan->dev,
> + "unable to allocate channel %d descriptor pool\n",
> + chan->id);
> + return -ENOMEM;
> + }
> +
> + tasklet_init(>tasklet, xilinx_vdma_do_tasklet,
> + (unsigned long)chan);
> +
> + chan->completed_cookie = DMA_MIN_COOKIE;
> + chan->cookie = DMA_MIN_COOKIE;
Can you

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
> On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
> > On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
> >> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> >>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >>>
> >>> [...]
> >>>
> > +   /* Request the interrupt */
> > +   chan->irq = irq_of_parse_and_map(node, 0);
> > +   err = devm_request_irq(xdev->dev, chan->irq, 
> > xilinx_vdma_irq_handler,
> > +  IRQF_SHARED, "xilinx-vdma-controller", 
> > chan);
> 
>  This is a clasic example of where to not use devm_request_irq. 'chan' is
>  accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>  will be freed before the interrupt handler has been released, which means
>  there is now a race condition where the interrupt handler can access 
>  already
>  freed memory.ta
> >>>
> >>> Could you elaborate this case? As far as I understood managed resources
> >>> are a kind of stack pile. In this case you have no such condition. Where
> >>> am I wrong?
> >>
> >> The stacked stuff is only ran after the remove() function. Which means that
> >> you call dma_async_device_unregister() before the interrupt handler is
> >> freed. Another issue with the interrupt handler is a bit hidden. The driver
> >> does not call tasklet_kill in the remove function. Which it should though 
> >> to
> >> make sure that the tasklet does not race against the freeing of the memory.
> >> And in order to make sure that the tasklet is not rescheduled you need to
> >> free the irq before killing the tasklet, since the interrupt handler
> >> schedules the tasklet.
> > 
> > So, you mean devm_request_irq() will race in any DMA driver?
> 
> Most likely yes. devm_request_irq() is race condition prone for the majority
> of device driver. You have to be really careful if you want to use it.
> 
> > 
> > I think the proper solution is to disable all device work in
> > the .remove() and devm will care about resources.
> 
> As long as the interrupt handler is registered it can be called, the only
> proper solution is to make sure that the order in which resources are torn
> down is correct.
Wouldn't it work if we register the irq last in the probe. That wil ensure on
success the channel is always valid.

Also the tasklet is required to be killed not just in your .remove but also in
drivers .suspend handler, you dont want handler to be invoked after you returned
from your suspend

--
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
> On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> > Hi Lars,
> > 
> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> >> [...]
> >>> +/**
> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device
> >>> + * @dchan: DMA Channel pointer
> >>> + * @cmd: DMA control command
> >>> + * @arg: Channel configuration
> >>> + *
> >>> + * Return: '0' on success and failure value on error
> >>> + */
> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> >>> +   enum dma_ctrl_cmd cmd, unsigned long 
> >>> arg)
> >>> +{
> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> >>> +
> >>> + switch (cmd) {
> >>> + case DMA_TERMINATE_ALL:
> >>> + xilinx_vdma_terminate_all(chan);
> >>> + return 0;
> >>> + case DMA_SLAVE_CONFIG:
> >>> + return xilinx_vdma_slave_config(chan,
> >>> + (struct xilinx_vdma_config *)arg);
> >>
> >> You really shouldn't be overloading the generic API with your own 
> >> semantics.
> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> > 
> > Ok.  The driver needs few additional configuration from the slave
> > device like Vertical
> > Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
> > you
> > suggest me to define a separate dma_ctrl_cmd like the one 
> > FSLDMA_EXTERNAL_START
> > defined for Freescale drivers?
> 
> In my opinion it is not a good idea to have driver implement a generic API,
> but at the same time let the driver have custom semantics for those API
> calls. It's a bit like having a gpio driver that expects 23 and 42 as the
> values passed to gpio_set_value instead of 0 and 1. It completely defeats
> the purpose of a generic API, namely that you are able to write generic code
> that makes use of the API without having to know about which implementation
> API it is talking to. The dmaengine framework provides the
> dmaengine_prep_interleaved_dma() function to setup two dimensional
> transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

The question here i think would be waht this device supports? Is the hardware
capable of doing interleaved transfers, then would make sense.

While we do try to get users use dma_slave_config, but there will always be
someone who have specfic params. If we can generalize then we might want to add
to the dma_slave_config as well

--
~Vinod
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
  Hi Lars,
  
  On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  [...]
  +/**
  + * xilinx_vdma_device_control - Configure DMA channel of the device
  + * @dchan: DMA Channel pointer
  + * @cmd: DMA control command
  + * @arg: Channel configuration
  + *
  + * Return: '0' on success and failure value on error
  + */
  +static int xilinx_vdma_device_control(struct dma_chan *dchan,
  +   enum dma_ctrl_cmd cmd, unsigned long 
  arg)
  +{
  + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
  +
  + switch (cmd) {
  + case DMA_TERMINATE_ALL:
  + xilinx_vdma_terminate_all(chan);
  + return 0;
  + case DMA_SLAVE_CONFIG:
  + return xilinx_vdma_slave_config(chan,
  + (struct xilinx_vdma_config *)arg);
 
  You really shouldn't be overloading the generic API with your own 
  semantics.
  DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
  
  Ok.  The driver needs few additional configuration from the slave
  device like Vertical
  Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
  you
  suggest me to define a separate dma_ctrl_cmd like the one 
  FSLDMA_EXTERNAL_START
  defined for Freescale drivers?
 
 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

The question here i think would be waht this device supports? Is the hardware
capable of doing interleaved transfers, then would make sense.

While we do try to get users use dma_slave_config, but there will always be
someone who have specfic params. If we can generalize then we might want to add
to the dma_slave_config as well

--
~Vinod
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
  On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
  On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
  On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 
  [...]
 
  +   /* Request the interrupt */
  +   chan-irq = irq_of_parse_and_map(node, 0);
  +   err = devm_request_irq(xdev-dev, chan-irq, 
  xilinx_vdma_irq_handler,
  +  IRQF_SHARED, xilinx-vdma-controller, 
  chan);
 
  This is a clasic example of where to not use devm_request_irq. 'chan' is
  accessed in the interrupt handler, but if you use devm_request_irq 'chan'
  will be freed before the interrupt handler has been released, which means
  there is now a race condition where the interrupt handler can access 
  already
  freed memory.ta
 
  Could you elaborate this case? As far as I understood managed resources
  are a kind of stack pile. In this case you have no such condition. Where
  am I wrong?
 
  The stacked stuff is only ran after the remove() function. Which means that
  you call dma_async_device_unregister() before the interrupt handler is
  freed. Another issue with the interrupt handler is a bit hidden. The driver
  does not call tasklet_kill in the remove function. Which it should though 
  to
  make sure that the tasklet does not race against the freeing of the memory.
  And in order to make sure that the tasklet is not rescheduled you need to
  free the irq before killing the tasklet, since the interrupt handler
  schedules the tasklet.
  
  So, you mean devm_request_irq() will race in any DMA driver?
 
 Most likely yes. devm_request_irq() is race condition prone for the majority
 of device driver. You have to be really careful if you want to use it.
 
  
  I think the proper solution is to disable all device work in
  the .remove() and devm will care about resources.
 
 As long as the interrupt handler is registered it can be called, the only
 proper solution is to make sure that the order in which resources are torn
 down is correct.
Wouldn't it work if we register the irq last in the probe. That wil ensure on
success the channel is always valid.

Also the tasklet is required to be killed not just in your .remove but also in
drivers .suspend handler, you dont want handler to be invoked after you returned
from your suspend

--
~Vinod
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Vinod Koul

On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
 This is the driver for the AXI Video Direct Memory Access (AXI
 VDMA) core, which is a soft Xilinx IP core that provides high-
 bandwidth direct memory access between memory and AXI4-Stream
 type video target peripherals. The core provides efficient two
 dimensional DMA operations with independent asynchronous read
ok here is tha catch, do you want to support interleaved API rather?

 +* DMA client + +Required properties: +- dmas: a list of [Video DMA device
 phandle] [Channel ID] pairs, +   where Channel ID is '0' for write/tx and
 '1' for read/rx + channel.  +- dma-names: a list of DMA channel names, one
 per dmas entry + +Example: + + +vdmatest_0: vdmatest@0 { +
 compatible =xlnx,axi-vdma-test-1.00.a; +dmas = axi_vdma_0 0 +
 axi_vdma_0 1; + dma-names = vdma0, vdma1; +} ;
Need ack from DT folks. ALso would be better to split the binding to a separate
patch


 +/**
 + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
 + * @xdev: Driver specific device structure
 + * @ctrl_offset: Control registers offset
 + * @desc_offset: TX descriptor registers offset
 + * @completed_cookie: Maximum cookie completed
 + * @cookie: The current cookie
 + * @lock: Descriptor operation lock
 + * @pending_list: Descriptors waiting
 + * @active_desc: Active descriptor
 + * @done_list: Complete descriptors
 + * @common: DMA common channel
 + * @desc_pool: Descriptors pool
 + * @dev: The dma device
 + * @irq: Channel IRQ
 + * @id: Channel ID
 + * @direction: Transfer direction
 + * @num_frms: Number of frames
 + * @has_sg: Support scatter transfers
 + * @genlock: Support genlock mode
 + * @err: Channel has errors
 + * @tasklet: Cleanup work after irq
 + * @config: Device configuration info
 + * @flush_on_fsync: Flush on Frame sync
 + */
 +struct xilinx_vdma_chan {
 + struct xilinx_vdma_device *xdev;
 + u32 ctrl_offset;
 + u32 desc_offset;
 + dma_cookie_t completed_cookie;
 + dma_cookie_t cookie;
 + spinlock_t lock;
 + struct list_head pending_list;
 + struct xilinx_vdma_tx_descriptor *active_desc;
 + struct list_head done_list;
 + struct dma_chan common;
 + struct dma_pool *desc_pool;
 + struct device *dev;
 + int irq;
 + int id;
 + enum dma_transfer_direction direction;
why should channel have a direction... descriptor should have direction and not
the channel!

 +/**
 + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor
 + * @chan: Driver specific VDMA channel
 + * @desc: VDMA transaction descriptor
 + */
 +static void
 +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan,
 +struct xilinx_vdma_tx_descriptor *desc)
 +{
 + struct xilinx_vdma_tx_segment *segment, *next;
 +
 + if (!desc)
 + return;
 +
 + list_for_each_entry_safe(segment, next, desc-segments, node) {
do you want to use _safe. Isee that this is called for cleanup while lock held,
and in other case within another _safe iterator!

 + list_del(segment-node);
 + xilinx_vdma_free_tx_segment(chan, segment);
 + }
 +
 + kfree(desc);
 +}
 +
 +/* Required functions */
 +

 + * xilinx_vdma_do_tasklet - Schedule completion tasklet
 + * @data: Pointer to the Xilinx VDMA channel structure
 + */
 +static void xilinx_vdma_do_tasklet(unsigned long data)
 +{
 + struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data;
 +
 + xilinx_vdma_chan_desc_cleanup(chan);
 +}
 +
 +/**
 + * xilinx_vdma_alloc_chan_resources - Allocate channel resources
 + * @dchan: DMA channel
 + *
 + * Return: '1' on success and failure value on error
naaah, we dont do that, pls use standard notation of 0 on success
Also API wants you to return descriptors allocated here!

 + */
 +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + /* Has this channel already been allocated? */
 + if (chan-desc_pool)
 + return 1;
 +
 + /*
 +  * We need the descriptor to be aligned to 64bytes
 +  * for meeting Xilinx VDMA specification requirement.
 +  */
 + chan-desc_pool = dma_pool_create(xilinx_vdma_desc_pool,
 + chan-dev,
 + sizeof(struct xilinx_vdma_tx_segment),
 + __alignof__(struct xilinx_vdma_tx_segment), 0);
 + if (!chan-desc_pool) {
 + dev_err(chan-dev,
 + unable to allocate channel %d descriptor pool\n,
 + chan-id);
 + return -ENOMEM;
 + }
 +
 + tasklet_init(chan-tasklet, xilinx_vdma_do_tasklet,
 + (unsigned long)chan);
 +
 + chan-completed_cookie = DMA_MIN_COOKIE;
 + chan-cookie = DMA_MIN_COOKIE;
Can you use virtual dma implementation to simplfy your implemenattion of lists,
cookies (driver/dma/virt-dma.c)

 + /* There is

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 02:59 PM, Vinod Koul wrote:
 On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote:
 On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
 Hi Lars,

 On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 [...]
 +/**
 + * xilinx_vdma_device_control - Configure DMA channel of the device
 + * @dchan: DMA Channel pointer
 + * @cmd: DMA control command
 + * @arg: Channel configuration
 + *
 + * Return: '0' on success and failure value on error
 + */
 +static int xilinx_vdma_device_control(struct dma_chan *dchan,
 +   enum dma_ctrl_cmd cmd, unsigned long 
 arg)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + switch (cmd) {
 + case DMA_TERMINATE_ALL:
 + xilinx_vdma_terminate_all(chan);
 + return 0;
 + case DMA_SLAVE_CONFIG:
 + return xilinx_vdma_slave_config(chan,
 + (struct xilinx_vdma_config *)arg);

 You really shouldn't be overloading the generic API with your own 
 semantics.
 DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

 Ok.  The driver needs few additional configuration from the slave
 device like Vertical
 Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
 you
 suggest me to define a separate dma_ctrl_cmd like the one 
 FSLDMA_EXTERNAL_START
 defined for Freescale drivers?

 In my opinion it is not a good idea to have driver implement a generic API,
 but at the same time let the driver have custom semantics for those API
 calls. It's a bit like having a gpio driver that expects 23 and 42 as the
 values passed to gpio_set_value instead of 0 and 1. It completely defeats
 the purpose of a generic API, namely that you are able to write generic code
 that makes use of the API without having to know about which implementation
 API it is talking to. The dmaengine framework provides the
 dmaengine_prep_interleaved_dma() function to setup two dimensional
 transfers, e.g. take a look at sirf-dma.c or imx-dma.c.
 
 The question here i think would be waht this device supports? Is the hardware
 capable of doing interleaved transfers, then would make sense.

The hardware does 2D transfers. The parameters for a transfer are height,
width and stride. That's only a subset of what interleaved transfers can be
(xt-num_frames must be one for 2d transfers). But if I remember correctly
there has been some discussion on this in the past and the result of that
discussion was that using interleaved transfers for 2D transfers is
preferred over adding a custom API for 2D transfers.

- Lars

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 03:03 PM, Vinod Koul wrote:
 On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
 On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
 On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:

 [...]

 +   /* Request the interrupt */
 +   chan-irq = irq_of_parse_and_map(node, 0);
 +   err = devm_request_irq(xdev-dev, chan-irq, 
 xilinx_vdma_irq_handler,
 +  IRQF_SHARED, xilinx-vdma-controller, 
 chan);

 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access 
 already
 freed memory.ta

 Could you elaborate this case? As far as I understood managed resources
 are a kind of stack pile. In this case you have no such condition. Where
 am I wrong?

 The stacked stuff is only ran after the remove() function. Which means that
 you call dma_async_device_unregister() before the interrupt handler is
 freed. Another issue with the interrupt handler is a bit hidden. The driver
 does not call tasklet_kill in the remove function. Which it should though 
 to
 make sure that the tasklet does not race against the freeing of the memory.
 And in order to make sure that the tasklet is not rescheduled you need to
 free the irq before killing the tasklet, since the interrupt handler
 schedules the tasklet.

 So, you mean devm_request_irq() will race in any DMA driver?

 Most likely yes. devm_request_irq() is race condition prone for the majority
 of device driver. You have to be really careful if you want to use it.


 I think the proper solution is to disable all device work in
 the .remove() and devm will care about resources.

 As long as the interrupt handler is registered it can be called, the only
 proper solution is to make sure that the order in which resources are torn
 down is correct.
 Wouldn't it work if we register the irq last in the probe. That wil ensure on
 success the channel is always valid.

Yes, but only if the irq is not device managed. All device managed resources
will be freed after the remove function has been called. Which is to late in
our case since we make sure that the tasklet is not running in the remove
function.

- Lars

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-26 Thread Lars-Peter Clausen

On 01/26/2014 03:24 PM, Vinod Koul wrote:
 On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote:
 This is the driver for the AXI Video Direct Memory Access (AXI
 VDMA) core, which is a soft Xilinx IP core that provides high-
 bandwidth direct memory access between memory and AXI4-Stream
 type video target peripherals. The core provides efficient two
 dimensional DMA operations with independent asynchronous read
 ok here is tha catch, do you want to support interleaved API rather?
 
 +* DMA client + +Required properties: +- dmas: a list of [Video DMA device
 phandle] [Channel ID] pairs, +  where Channel ID is '0' for write/tx and
 '1' for read/rx +channel.  +- dma-names: a list of DMA channel names, one
 per dmas entry + +Example: + + +vdmatest_0: vdmatest@0 { +
 compatible =xlnx,axi-vdma-test-1.00.a; +   dmas = axi_vdma_0 0 +
 axi_vdma_0 1; +dma-names = vdma0, vdma1; +} ;
 Need ack from DT folks. ALso would be better to split the binding to a 
 separate
 patch
 
 
 +/**
 + * struct xilinx_vdma_chan - Driver specific VDMA channel structure
 + * @xdev: Driver specific device structure
 + * @ctrl_offset: Control registers offset
 + * @desc_offset: TX descriptor registers offset
 + * @completed_cookie: Maximum cookie completed
 + * @cookie: The current cookie
 + * @lock: Descriptor operation lock
 + * @pending_list: Descriptors waiting
 + * @active_desc: Active descriptor
 + * @done_list: Complete descriptors
 + * @common: DMA common channel
 + * @desc_pool: Descriptors pool
 + * @dev: The dma device
 + * @irq: Channel IRQ
 + * @id: Channel ID
 + * @direction: Transfer direction
 + * @num_frms: Number of frames
 + * @has_sg: Support scatter transfers
 + * @genlock: Support genlock mode
 + * @err: Channel has errors
 + * @tasklet: Cleanup work after irq
 + * @config: Device configuration info
 + * @flush_on_fsync: Flush on Frame sync
 + */
 +struct xilinx_vdma_chan {
 +struct xilinx_vdma_device *xdev;
 +u32 ctrl_offset;
 +u32 desc_offset;
 +dma_cookie_t completed_cookie;
 +dma_cookie_t cookie;
 +spinlock_t lock;
 +struct list_head pending_list;
 +struct xilinx_vdma_tx_descriptor *active_desc;
 +struct list_head done_list;
 +struct dma_chan common;
 +struct dma_pool *desc_pool;
 +struct device *dev;
 +int irq;
 +int id;
 +enum dma_transfer_direction direction;
 why should channel have a direction... descriptor should have direction and 
 not
 the channel!

The channel only supports transfers in one direction. Either from memory to
peripheral or from peripheral to memory, that's fixed and can't be changed
at runtime. The driver needs to know which direction the channel supports so
it can reject transfers with the wrong direction.

[...]
 +
 
 + * xilinx_vdma_tx_status - Get VDMA transaction status
 + * @dchan: DMA channel
 + * @cookie: Transaction identifier
 + * @txstate: Transaction state
 + *
 + * Return: DMA transaction status
 + */
 +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan,
 +dma_cookie_t cookie,
 +struct dma_tx_state *txstate)
 +{
 +struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +dma_cookie_t last_used;
 +dma_cookie_t last_complete;
 +
 +xilinx_vdma_chan_desc_cleanup(chan);
 +
 +last_used = dchan-cookie;
 +last_complete = chan-completed_cookie;
 +
 +dma_set_tx_state(txstate, last_complete, last_used, 0);
 +
 +return dma_async_is_complete(cookie, last_complete, last_used);
 no residue calculation?
 

The hardware doesn't support that.

 +/**
 + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE 
 transaction
 + * @dchan: DMA channel
 + * @sgl: scatterlist to transfer to/from
 + * @sg_len: number of entries in @sgl
 + * @dir: DMA direction
 + * @flags: transfer ack flags
 + * @context: unused
 + *
 + * Return: Async transaction descriptor on success and NULL on failure
 + */
 +static struct dma_async_tx_descriptor *
 +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl,
 +  unsigned int sg_len, enum dma_transfer_direction dir,
 +  unsigned long flags, void *context)
 okay now am worried, this is supposed to memcpy DMA so why slave-sg??

The DMA is either from memory to peripheral or from peripheral to memory
depending on the direction. So slave sg should be fine.

 
 Looking at the driver overall, IMHO we need to do:
 - use the virt-dma to simplfy the cookie handling and perhpasn the descrptors
   too!
 - Perhpas use interleaved API..?
 - I dont think we should use the slave API as this seems memcpy case!
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-24 Thread Lars-Peter Clausen

On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
> Hi Lars,
> 
> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:
>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>> [...]
>>> +/**
>>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>>> + * @dchan: DMA Channel pointer
>>> + * @cmd: DMA control command
>>> + * @arg: Channel configuration
>>> + *
>>> + * Return: '0' on success and failure value on error
>>> + */
>>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>>> +   enum dma_ctrl_cmd cmd, unsigned long 
>>> arg)
>>> +{
>>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>>> +
>>> + switch (cmd) {
>>> + case DMA_TERMINATE_ALL:
>>> + xilinx_vdma_terminate_all(chan);
>>> + return 0;
>>> + case DMA_SLAVE_CONFIG:
>>> + return xilinx_vdma_slave_config(chan,
>>> + (struct xilinx_vdma_config *)arg);
>>
>> You really shouldn't be overloading the generic API with your own semantics.
>> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
> 
> Ok.  The driver needs few additional configuration from the slave
> device like Vertical
> Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
> you
> suggest me to define a separate dma_ctrl_cmd like the one 
> FSLDMA_EXTERNAL_START
> defined for Freescale drivers?

In my opinion it is not a good idea to have driver implement a generic API,
but at the same time let the driver have custom semantics for those API
calls. It's a bit like having a gpio driver that expects 23 and 42 as the
values passed to gpio_set_value instead of 0 and 1. It completely defeats
the purpose of a generic API, namely that you are able to write generic code
that makes use of the API without having to know about which implementation
API it is talking to. The dmaengine framework provides the
dmaengine_prep_interleaved_dma() function to setup two dimensional
transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-24 Thread Srikanth Thokala

Hi Lars,

On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen  wrote:
> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> [...]
>> +/**
>> + * xilinx_vdma_device_control - Configure DMA channel of the device
>> + * @dchan: DMA Channel pointer
>> + * @cmd: DMA control command
>> + * @arg: Channel configuration
>> + *
>> + * Return: '0' on success and failure value on error
>> + */
>> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
>> +   enum dma_ctrl_cmd cmd, unsigned long arg)
>> +{
>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
>> +
>> + switch (cmd) {
>> + case DMA_TERMINATE_ALL:
>> + xilinx_vdma_terminate_all(chan);
>> + return 0;
>> + case DMA_SLAVE_CONFIG:
>> + return xilinx_vdma_slave_config(chan,
>> + (struct xilinx_vdma_config *)arg);
>
> You really shouldn't be overloading the generic API with your own semantics.
> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

Ok.  The driver needs few additional configuration from the slave
device like Vertical
Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
defined for Freescale drivers?

>
>> + default:
>> + return -ENXIO;
>> + }
>> +}
>> +
> [...]
>> +
>> + /* Request the interrupt */
>> + chan->irq = irq_of_parse_and_map(node, 0);
>> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>> +IRQF_SHARED, "xilinx-vdma-controller", chan);
>
> This is a clasic example of where to not use devm_request_irq. 'chan' is
> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> will be freed before the interrupt handler has been released, which means
> there is now a race condition where the interrupt handler can access already
> freed memory.

Ok, thank you for the clarification on this thread.  I will fix it in v3.

>
>> + if (err) {
>> + dev_err(xdev->dev, "unable to request IRQ\n");
>> + return err;
>> + }
>> +
>> + /* Initialize the DMA channel and add it to the DMA engine channels
>> +  * list.
>> +  */
>> + chan->common.device = >common;
>> +
>> + list_add_tail(>common.device_node, >common.channels);
>> + xdev->chan[chan->id] = chan;
>> +
>> + /* Reset the channel */
>> + err = xilinx_vdma_chan_reset(chan);
>> + if (err < 0) {
>> + dev_err(xdev->dev, "Reset channel failed\n");
>> + return err;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +/**
>> + * struct of_dma_filter_xilinx_args - Channel filter args
>> + * @dev: DMA device structure
>> + * @chan_id: Channel id
>> + */
>> +struct of_dma_filter_xilinx_args {
>> + struct dma_device *dev;
>> + u32 chan_id;
>> +};
>> +
>> +/**
>> + * xilinx_vdma_dt_filter - VDMA channel filter function
>> + * @chan: DMA channel pointer
>> + * @param: Filter match value
>> + *
>> + * Return: true/false based on the result
>> + */
>> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
>> +{
>> + struct of_dma_filter_xilinx_args *args = param;
>> +
>> + return chan->device == args->dev && chan->chan_id == args->chan_id;
>> +}
>> +
>> +/**
>> + * of_dma_xilinx_xlate - Translation function
>> + * @dma_spec: Pointer to DMA specifier as found in the device tree
>> + * @ofdma: Pointer to DMA controller data
>> + *
>> + * Return: DMA channel pointer on success and NULL on error
>> + */
>> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args 
>> *dma_spec,
>> + struct of_dma *ofdma)
>> +{
>> + struct of_dma_filter_xilinx_args args;
>> + dma_cap_mask_t cap;
>> +
>> + args.dev = ofdma->of_dma_data;
>> + if (!args.dev)
>> + return NULL;
>> +
>> + if (dma_spec->args_count != 1)
>> + return NULL;
>> +
>> + dma_cap_zero(cap);
>> + dma_cap_set(DMA_SLAVE, cap);
>> +
>> + args.chan_id = dma_spec->args[0];
>> +
>> + return dma_request_channel(cap, xilinx_vdma_dt_filter, );
>
> There is a new helper function called dma_get_slave_channel, which makes
> this much easier. Take a look at the k3dma.c driver for an example.

Ok.  I will check and fix it in v3.

Srikanth

>> +}
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-24 Thread Srikanth Thokala

Hi Lars,

On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 [...]
 +/**
 + * xilinx_vdma_device_control - Configure DMA channel of the device
 + * @dchan: DMA Channel pointer
 + * @cmd: DMA control command
 + * @arg: Channel configuration
 + *
 + * Return: '0' on success and failure value on error
 + */
 +static int xilinx_vdma_device_control(struct dma_chan *dchan,
 +   enum dma_ctrl_cmd cmd, unsigned long arg)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + switch (cmd) {
 + case DMA_TERMINATE_ALL:
 + xilinx_vdma_terminate_all(chan);
 + return 0;
 + case DMA_SLAVE_CONFIG:
 + return xilinx_vdma_slave_config(chan,
 + (struct xilinx_vdma_config *)arg);

 You really shouldn't be overloading the generic API with your own semantics.
 DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

Ok.  The driver needs few additional configuration from the slave
device like Vertical
Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do you
suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START
defined for Freescale drivers?


 + default:
 + return -ENXIO;
 + }
 +}
 +
 [...]
 +
 + /* Request the interrupt */
 + chan-irq = irq_of_parse_and_map(node, 0);
 + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
 +IRQF_SHARED, xilinx-vdma-controller, chan);

 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access already
 freed memory.

Ok, thank you for the clarification on this thread.  I will fix it in v3.


 + if (err) {
 + dev_err(xdev-dev, unable to request IRQ\n);
 + return err;
 + }
 +
 + /* Initialize the DMA channel and add it to the DMA engine channels
 +  * list.
 +  */
 + chan-common.device = xdev-common;
 +
 + list_add_tail(chan-common.device_node, xdev-common.channels);
 + xdev-chan[chan-id] = chan;
 +
 + /* Reset the channel */
 + err = xilinx_vdma_chan_reset(chan);
 + if (err  0) {
 + dev_err(xdev-dev, Reset channel failed\n);
 + return err;
 + }
 +
 + return 0;
 +}
 +
 +/**
 + * struct of_dma_filter_xilinx_args - Channel filter args
 + * @dev: DMA device structure
 + * @chan_id: Channel id
 + */
 +struct of_dma_filter_xilinx_args {
 + struct dma_device *dev;
 + u32 chan_id;
 +};
 +
 +/**
 + * xilinx_vdma_dt_filter - VDMA channel filter function
 + * @chan: DMA channel pointer
 + * @param: Filter match value
 + *
 + * Return: true/false based on the result
 + */
 +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
 +{
 + struct of_dma_filter_xilinx_args *args = param;
 +
 + return chan-device == args-dev  chan-chan_id == args-chan_id;
 +}
 +
 +/**
 + * of_dma_xilinx_xlate - Translation function
 + * @dma_spec: Pointer to DMA specifier as found in the device tree
 + * @ofdma: Pointer to DMA controller data
 + *
 + * Return: DMA channel pointer on success and NULL on error
 + */
 +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args 
 *dma_spec,
 + struct of_dma *ofdma)
 +{
 + struct of_dma_filter_xilinx_args args;
 + dma_cap_mask_t cap;
 +
 + args.dev = ofdma-of_dma_data;
 + if (!args.dev)
 + return NULL;
 +
 + if (dma_spec-args_count != 1)
 + return NULL;
 +
 + dma_cap_zero(cap);
 + dma_cap_set(DMA_SLAVE, cap);
 +
 + args.chan_id = dma_spec-args[0];
 +
 + return dma_request_channel(cap, xilinx_vdma_dt_filter, args);

 There is a new helper function called dma_get_slave_channel, which makes
 this much easier. Take a look at the k3dma.c driver for an example.

Ok.  I will check and fix it in v3.

Srikanth

 +}

 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-24 Thread Lars-Peter Clausen

On 01/24/2014 12:16 PM, Srikanth Thokala wrote:
 Hi Lars,
 
 On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 [...]
 +/**
 + * xilinx_vdma_device_control - Configure DMA channel of the device
 + * @dchan: DMA Channel pointer
 + * @cmd: DMA control command
 + * @arg: Channel configuration
 + *
 + * Return: '0' on success and failure value on error
 + */
 +static int xilinx_vdma_device_control(struct dma_chan *dchan,
 +   enum dma_ctrl_cmd cmd, unsigned long 
 arg)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + switch (cmd) {
 + case DMA_TERMINATE_ALL:
 + xilinx_vdma_terminate_all(chan);
 + return 0;
 + case DMA_SLAVE_CONFIG:
 + return xilinx_vdma_slave_config(chan,
 + (struct xilinx_vdma_config *)arg);

 You really shouldn't be overloading the generic API with your own semantics.
 DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.
 
 Ok.  The driver needs few additional configuration from the slave
 device like Vertical
 Size, Horizontal Size,  Stride etc., for the DMA transfers, in that case do 
 you
 suggest me to define a separate dma_ctrl_cmd like the one 
 FSLDMA_EXTERNAL_START
 defined for Freescale drivers?

In my opinion it is not a good idea to have driver implement a generic API,
but at the same time let the driver have custom semantics for those API
calls. It's a bit like having a gpio driver that expects 23 and 42 as the
values passed to gpio_set_value instead of 0 and 1. It completely defeats
the purpose of a generic API, namely that you are able to write generic code
that makes use of the API without having to know about which implementation
API it is talking to. The dmaengine framework provides the
dmaengine_prep_interleaved_dma() function to setup two dimensional
transfers, e.g. take a look at sirf-dma.c or imx-dma.c.

- Lars

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Srikanth Thokala

Hi Levente,

On Thu, Jan 23, 2014 at 3:00 AM, Levente Kurusa  wrote:
> Hello,
>
> 2014/1/22 Srikanth Thokala :
>> This is the driver for the AXI Video Direct Memory Access (AXI
>> VDMA) core, which is a soft Xilinx IP core that provides high-
>> bandwidth direct memory access between memory and AXI4-Stream
>> type video target peripherals. The core provides efficient two
>> dimensional DMA operations with independent asynchronous read
>> and write channel operation.
>>
>> This module works on Zynq (ARM Based SoC) and Microblaze platforms.
>>
>> Signed-off-by: Srikanth Thokala 
>> ---
>
> Another two remarks, after you fixed them ( or not :-) )
> you can have my:
>
> Reviewed-by: Levente Kurusa 
>
> Oh, and next time please if you post a patch that fixes something I pointed 
> out,
> CC me as I had a hard time finding this patch, thanks. :-)

Sure. Thanks

>
>> NOTE:
>> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>>DMA IPs and we are also planning to upstream these drivers soon.
>> 2. Rebased on v3.13.0-rc8
>>
>> Changes in v2:
>> - Removed DMA Test client module from the patchset as suggested
>>   by Andy Shevchenko
>> - Removed device-id DT property, as suggested by Arnd Bergmann
>> - Properly documented DT bindings as suggested by Arnd Bergmann
>> - Returning with error, if registration of DMA to node fails
>> - Fixed typo errors
>> - Used BIT() macro at applicable places
>> - Added missing header file to the patchset
>> - Changed copyright year to include 2014
>> ---
>>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>>  drivers/dma/Kconfig|   14 +
>>  drivers/dma/Makefile   |1 +
>>  drivers/dma/xilinx/Makefile|1 +
>>  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
>> 
>>  include/linux/amba/xilinx_dma.h|   50 +
>>  6 files changed, 1627 insertions(+)
>>  create mode 100644 
>> Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>>  create mode 100644 drivers/dma/xilinx/Makefile
>>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>>  create mode 100644 include/linux/amba/xilinx_dma.h
>>
>> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
>> b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> new file mode 100644
>> index 000..ab8be1a
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>> @@ -0,0 +1,75 @@
>> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
>> +It can be configured to have one channel or two channels. If configured
>> +as two channels, one is to transmit to the video device and another is
>> +to receive from the video device.
>> +
>> +Required properties:
>> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
>> +- #dma-cells: Should be <1>, see "dmas" property below
>> +- reg: Should contain VDMA registers location and length.
>> +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
>> h/w.
>> +- dma-channel child node: Should have atleast one channel and can have upto
>> +   two channels per device. This node specifies the properties of each
>> +   DMA channel (see child node properties below).
>> +
>> +Optional properties:
>> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
>> +   the hardware.
>> [...]
>> +
>> +/**
>> + * xilinx_vdma_is_running - Check if VDMA channel is running
>> + * @chan: Driver specific VDMA channel
>> + *
>> + * Return: '1' if running, '0' if not.
>> + */
>> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
>> +{
>> +   return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
>> +XILINX_VDMA_DMASR_HALTED) &&
>> +   (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
>> +XILINX_VDMA_DMACR_RUNSTOP);
>> +}
>> +

[...]

>> +   /* Retrieve the channel properties from the device tree */
>> +   has_dre = of_property_read_bool(node, "xlnx,include-dre");
>> +
>> +   chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode");
>> +
>> +   err = of_property_read_u32(node, "xlnx,datawidth", );
>> +   if (!err) {
>> +   u32 width = value >> 3; /* Convert bits to bytes */
>> +
>> +   /* If data width is greater than 8 bytes, DRE is not in hw */
>> +   if (width > 8)
>> +   has_dre = false;
>> +
>> +   if (!has_dre)
>> +   xdev->common.copy_align = fls(width - 1);
>> +   } else {
>> +   dev_err(xdev->dev, "missing xlnx,datawidth property\n");
>> +   return err;
>> +   }
>
> Can you please convert this to:
> if (err) {
>  dev_err(...);
>  return err;
> }
>
> That way we can avoid the else clause.

Ok. I will fix it in v3.

>> +
>> +   if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) {
>> +

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
> On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
>> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
>>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
>>>
>>> [...]
>>>
> + /* Request the interrupt */
> + chan->irq = irq_of_parse_and_map(node, 0);
> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> +IRQF_SHARED, "xilinx-vdma-controller", chan);

 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access 
 already
 freed memory.ta
>>>
>>> Could you elaborate this case? As far as I understood managed resources
>>> are a kind of stack pile. In this case you have no such condition. Where
>>> am I wrong?
>>
>> The stacked stuff is only ran after the remove() function. Which means that
>> you call dma_async_device_unregister() before the interrupt handler is
>> freed. Another issue with the interrupt handler is a bit hidden. The driver
>> does not call tasklet_kill in the remove function. Which it should though to
>> make sure that the tasklet does not race against the freeing of the memory.
>> And in order to make sure that the tasklet is not rescheduled you need to
>> free the irq before killing the tasklet, since the interrupt handler
>> schedules the tasklet.
> 
> So, you mean devm_request_irq() will race in any DMA driver?

Most likely yes. devm_request_irq() is race condition prone for the majority
of device driver. You have to be really careful if you want to use it.

> 
> I think the proper solution is to disable all device work in
> the .remove() and devm will care about resources.

As long as the interrupt handler is registered it can be called, the only
proper solution is to make sure that the order in which resources are torn
down is correct.

- Lars
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Andy Shevchenko

On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> > On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> > 
> > [...]
> > 
> >>> + /* Request the interrupt */
> >>> + chan->irq = irq_of_parse_and_map(node, 0);
> >>> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> >>> +IRQF_SHARED, "xilinx-vdma-controller", chan);
> >>
> >> This is a clasic example of where to not use devm_request_irq. 'chan' is
> >> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> >> will be freed before the interrupt handler has been released, which means
> >> there is now a race condition where the interrupt handler can access 
> >> already
> >> freed memory.ta
> > 
> > Could you elaborate this case? As far as I understood managed resources
> > are a kind of stack pile. In this case you have no such condition. Where
> > am I wrong?
> 
> The stacked stuff is only ran after the remove() function. Which means that
> you call dma_async_device_unregister() before the interrupt handler is
> freed. Another issue with the interrupt handler is a bit hidden. The driver
> does not call tasklet_kill in the remove function. Which it should though to
> make sure that the tasklet does not race against the freeing of the memory.
> And in order to make sure that the tasklet is not rescheduled you need to
> free the irq before killing the tasklet, since the interrupt handler
> schedules the tasklet.

So, you mean devm_request_irq() will race in any DMA driver?

I think the proper solution is to disable all device work in
the .remove() and devm will care about resources.
> majordomo-info.html

-- 
Andy Shevchenko 
Intel Finland Oy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
>> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
> 
> [...]
> 
>>> +   /* Request the interrupt */
>>> +   chan->irq = irq_of_parse_and_map(node, 0);
>>> +   err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
>>> +  IRQF_SHARED, "xilinx-vdma-controller", chan);
>>
>> This is a clasic example of where to not use devm_request_irq. 'chan' is
>> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
>> will be freed before the interrupt handler has been released, which means
>> there is now a race condition where the interrupt handler can access already
>> freed memory.ta
> 
> Could you elaborate this case? As far as I understood managed resources
> are a kind of stack pile. In this case you have no such condition. Where
> am I wrong?

The stacked stuff is only ran after the remove() function. Which means that
you call dma_async_device_unregister() before the interrupt handler is
freed. Another issue with the interrupt handler is a bit hidden. The driver
does not call tasklet_kill in the remove function. Which it should though to
make sure that the tasklet does not race against the freeing of the memory.
And in order to make sure that the tasklet is not rescheduled you need to
free the irq before killing the tasklet, since the interrupt handler
schedules the tasklet.

- Lars

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Shevchenko, Andriy

On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
> On 01/22/2014 05:52 PM, Srikanth Thokala wrote:

[...]

> > +   /* Request the interrupt */
> > +   chan->irq = irq_of_parse_and_map(node, 0);
> > +   err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> > +  IRQF_SHARED, "xilinx-vdma-controller", chan);
> 
> This is a clasic example of where to not use devm_request_irq. 'chan' is
> accessed in the interrupt handler, but if you use devm_request_irq 'chan'
> will be freed before the interrupt handler has been released, which means
> there is now a race condition where the interrupt handler can access already
> freed memory.

Could you elaborate this case? As far as I understood managed resources
are a kind of stack pile. In this case you have no such condition. Where
am I wrong?

-- 
Andy Shevchenko 
Intel Finland Oy
-
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Andy Shevchenko

On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote:
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
> and write channel operation.
> 
> This module works on Zynq (ARM Based SoC) and Microblaze platforms.

Few comments below.

> 
> Signed-off-by: Srikanth Thokala 
> ---
> NOTE:
> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>DMA IPs and we are also planning to upstream these drivers soon.
> 2. Rebased on v3.13.0-rc8
> 
> Changes in v2:
> - Removed DMA Test client module from the patchset as suggested
>   by Andy Shevchenko
> - Removed device-id DT property, as suggested by Arnd Bergmann
> - Properly documented DT bindings as suggested by Arnd Bergmann
> - Returning with error, if registration of DMA to node fails
> - Fixed typo errors
> - Used BIT() macro at applicable places
> - Added missing header file to the patchset
> - Changed copyright year to include 2014
> ---
>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>  drivers/dma/Kconfig|   14 +
>  drivers/dma/Makefile   |1 +
>  drivers/dma/xilinx/Makefile|1 +
>  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
> 
>  include/linux/amba/xilinx_dma.h|   50 +
>  6 files changed, 1627 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>  create mode 100644 drivers/dma/xilinx/Makefile
>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>  create mode 100644 include/linux/amba/xilinx_dma.h
> 
> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
> b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> new file mode 100644
> index 000..ab8be1a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> @@ -0,0 +1,75 @@
> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
> +It can be configured to have one channel or two channels. If configured
> +as two channels, one is to transmit to the video device and another is
> +to receive from the video device.
> +
> +Required properties:
> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
> +- #dma-cells: Should be <1>, see "dmas" property below
> +- reg: Should contain VDMA registers location and length.
> +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
> h/w.
> +- dma-channel child node: Should have atleast one channel and can have upto
> + two channels per device. This node specifies the properties of each
> + DMA channel (see child node properties below).
> +
> +Optional properties:
> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
> + the hardware.
> +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync.
> + It takes following values:
> + {1}, flush both channels
> + {2}, flush mm2s channel
> + {3}, flush s2mm channel
> +
> +Required child node properties:
> +- compatible: It should be either "xlnx,axi-vdma-mm2s-channel" or
> + "xlnx,axi-vdma-s2mm-channel".
> +- interrupts: Should contain per channel VDMA interrupts.
> +- xlnx,data-width: Should contain the stream data width, take values
> + {32,64...1024}.
> +
> +Option child node properties:
> +- xlnx,include-dre: Tells whether hardware is configured for Data
> + Realignment Engine.
> +- xlnx,genlock-mode: Tells whether Genlock synchronization is
> + enabled/disabled in hardware.
> +
> +Example:
> +
> +
> +axi_vdma_0: axivdma@4003 {
> + compatible = "xlnx,axi-vdma-1.00.a";
> + #dma_cells = <1>;
> + reg = < 0x4003 0x1 >;
> + xlnx,num-fstores = <0x8>;
> + xlnx,flush-fsync = <0x1>;
> + dma-channel@4003 {
> + compatible = "xlnx,axi-vdma-mm2s-channel";
> + interrupts = < 0 54 4 >;
> + xlnx,datawidth = <0x40>;
> + } ;
> + dma-channel@40030030 {
> + compatible = "xlnx,axi-vdma-s2mm-channel";
> + interrupts = < 0 53 4 >;
> + xlnx,datawidth = <0x40>;
> + } ;
> +} ;
> +
> +
> +* DMA client
> +
> +Required properties:
> +- dmas: a list of <[Video DMA device phandle] [Channel ID]> pairs,
> + where Channel ID is '0' for write/tx and '1' for read/rx
> + channel.
> +- dma-names: a list of DMA channel names, one per "dmas" entry
> +
> +Example:
> +
> +
> +vdmatest_0: vdmatest@0 {
> + compatible ="xlnx,axi-vdma-test-1.00.a";
> + dmas = <_vdma_0 0
> + _vdma_0 1>;
> + dma-names = "vdma0", "vdma1";
> +} ;
> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
> index c823daa..2a74651 100644
> ---

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
[...]
> +/**
> + * xilinx_vdma_device_control - Configure DMA channel of the device
> + * @dchan: DMA Channel pointer
> + * @cmd: DMA control command
> + * @arg: Channel configuration
> + *
> + * Return: '0' on success and failure value on error
> + */
> +static int xilinx_vdma_device_control(struct dma_chan *dchan,
> +   enum dma_ctrl_cmd cmd, unsigned long arg)
> +{
> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
> +
> + switch (cmd) {
> + case DMA_TERMINATE_ALL:
> + xilinx_vdma_terminate_all(chan);
> + return 0;
> + case DMA_SLAVE_CONFIG:
> + return xilinx_vdma_slave_config(chan,
> + (struct xilinx_vdma_config *)arg);

You really shouldn't be overloading the generic API with your own semantics.
DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

> + default:
> + return -ENXIO;
> + }
> +}
> +
[...]
> +
> + /* Request the interrupt */
> + chan->irq = irq_of_parse_and_map(node, 0);
> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler,
> +IRQF_SHARED, "xilinx-vdma-controller", chan);

This is a clasic example of where to not use devm_request_irq. 'chan' is
accessed in the interrupt handler, but if you use devm_request_irq 'chan'
will be freed before the interrupt handler has been released, which means
there is now a race condition where the interrupt handler can access already
freed memory.

> + if (err) {
> + dev_err(xdev->dev, "unable to request IRQ\n");
> + return err;
> + }
> +
> + /* Initialize the DMA channel and add it to the DMA engine channels
> +  * list.
> +  */
> + chan->common.device = >common;
> +
> + list_add_tail(>common.device_node, >common.channels);
> + xdev->chan[chan->id] = chan;
> +
> + /* Reset the channel */
> + err = xilinx_vdma_chan_reset(chan);
> + if (err < 0) {
> + dev_err(xdev->dev, "Reset channel failed\n");
> + return err;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * struct of_dma_filter_xilinx_args - Channel filter args
> + * @dev: DMA device structure
> + * @chan_id: Channel id
> + */
> +struct of_dma_filter_xilinx_args {
> + struct dma_device *dev;
> + u32 chan_id;
> +};
> +
> +/**
> + * xilinx_vdma_dt_filter - VDMA channel filter function
> + * @chan: DMA channel pointer
> + * @param: Filter match value
> + *
> + * Return: true/false based on the result
> + */
> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
> +{
> + struct of_dma_filter_xilinx_args *args = param;
> +
> + return chan->device == args->dev && chan->chan_id == args->chan_id;
> +}
> +
> +/**
> + * of_dma_xilinx_xlate - Translation function
> + * @dma_spec: Pointer to DMA specifier as found in the device tree
> + * @ofdma: Pointer to DMA controller data
> + *
> + * Return: DMA channel pointer on success and NULL on error
> + */
> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
> + struct of_dma *ofdma)
> +{
> + struct of_dma_filter_xilinx_args args;
> + dma_cap_mask_t cap;
> +
> + args.dev = ofdma->of_dma_data;
> + if (!args.dev)
> + return NULL;
> +
> + if (dma_spec->args_count != 1)
> + return NULL;
> +
> + dma_cap_zero(cap);
> + dma_cap_set(DMA_SLAVE, cap);
> +
> + args.chan_id = dma_spec->args[0];
> +
> + return dma_request_channel(cap, xilinx_vdma_dt_filter, );

There is a new helper function called dma_get_slave_channel, which makes
this much easier. Take a look at the k3dma.c driver for an example.
> +}

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
[...]
 +/**
 + * xilinx_vdma_device_control - Configure DMA channel of the device
 + * @dchan: DMA Channel pointer
 + * @cmd: DMA control command
 + * @arg: Channel configuration
 + *
 + * Return: '0' on success and failure value on error
 + */
 +static int xilinx_vdma_device_control(struct dma_chan *dchan,
 +   enum dma_ctrl_cmd cmd, unsigned long arg)
 +{
 + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan);
 +
 + switch (cmd) {
 + case DMA_TERMINATE_ALL:
 + xilinx_vdma_terminate_all(chan);
 + return 0;
 + case DMA_SLAVE_CONFIG:
 + return xilinx_vdma_slave_config(chan,
 + (struct xilinx_vdma_config *)arg);

You really shouldn't be overloading the generic API with your own semantics.
DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else.

 + default:
 + return -ENXIO;
 + }
 +}
 +
[...]
 +
 + /* Request the interrupt */
 + chan-irq = irq_of_parse_and_map(node, 0);
 + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
 +IRQF_SHARED, xilinx-vdma-controller, chan);

This is a clasic example of where to not use devm_request_irq. 'chan' is
accessed in the interrupt handler, but if you use devm_request_irq 'chan'
will be freed before the interrupt handler has been released, which means
there is now a race condition where the interrupt handler can access already
freed memory.

 + if (err) {
 + dev_err(xdev-dev, unable to request IRQ\n);
 + return err;
 + }
 +
 + /* Initialize the DMA channel and add it to the DMA engine channels
 +  * list.
 +  */
 + chan-common.device = xdev-common;
 +
 + list_add_tail(chan-common.device_node, xdev-common.channels);
 + xdev-chan[chan-id] = chan;
 +
 + /* Reset the channel */
 + err = xilinx_vdma_chan_reset(chan);
 + if (err  0) {
 + dev_err(xdev-dev, Reset channel failed\n);
 + return err;
 + }
 +
 + return 0;
 +}
 +
 +/**
 + * struct of_dma_filter_xilinx_args - Channel filter args
 + * @dev: DMA device structure
 + * @chan_id: Channel id
 + */
 +struct of_dma_filter_xilinx_args {
 + struct dma_device *dev;
 + u32 chan_id;
 +};
 +
 +/**
 + * xilinx_vdma_dt_filter - VDMA channel filter function
 + * @chan: DMA channel pointer
 + * @param: Filter match value
 + *
 + * Return: true/false based on the result
 + */
 +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param)
 +{
 + struct of_dma_filter_xilinx_args *args = param;
 +
 + return chan-device == args-dev  chan-chan_id == args-chan_id;
 +}
 +
 +/**
 + * of_dma_xilinx_xlate - Translation function
 + * @dma_spec: Pointer to DMA specifier as found in the device tree
 + * @ofdma: Pointer to DMA controller data
 + *
 + * Return: DMA channel pointer on success and NULL on error
 + */
 +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec,
 + struct of_dma *ofdma)
 +{
 + struct of_dma_filter_xilinx_args args;
 + dma_cap_mask_t cap;
 +
 + args.dev = ofdma-of_dma_data;
 + if (!args.dev)
 + return NULL;
 +
 + if (dma_spec-args_count != 1)
 + return NULL;
 +
 + dma_cap_zero(cap);
 + dma_cap_set(DMA_SLAVE, cap);
 +
 + args.chan_id = dma_spec-args[0];
 +
 + return dma_request_channel(cap, xilinx_vdma_dt_filter, args);

There is a new helper function called dma_get_slave_channel, which makes
this much easier. Take a look at the k3dma.c driver for an example.
 +}

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Andy Shevchenko

On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote:
 This is the driver for the AXI Video Direct Memory Access (AXI
 VDMA) core, which is a soft Xilinx IP core that provides high-
 bandwidth direct memory access between memory and AXI4-Stream
 type video target peripherals. The core provides efficient two
 dimensional DMA operations with independent asynchronous read
 and write channel operation.
 
 This module works on Zynq (ARM Based SoC) and Microblaze platforms.

Few comments below.

 
 Signed-off-by: Srikanth Thokala stho...@xilinx.com
 ---
 NOTE:
 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
DMA IPs and we are also planning to upstream these drivers soon.
 2. Rebased on v3.13.0-rc8
 
 Changes in v2:
 - Removed DMA Test client module from the patchset as suggested
   by Andy Shevchenko
 - Removed device-id DT property, as suggested by Arnd Bergmann
 - Properly documented DT bindings as suggested by Arnd Bergmann
 - Returning with error, if registration of DMA to node fails
 - Fixed typo errors
 - Used BIT() macro at applicable places
 - Added missing header file to the patchset
 - Changed copyright year to include 2014
 ---
  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
  drivers/dma/Kconfig|   14 +
  drivers/dma/Makefile   |1 +
  drivers/dma/xilinx/Makefile|1 +
  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
 
  include/linux/amba/xilinx_dma.h|   50 +
  6 files changed, 1627 insertions(+)
  create mode 100644 
 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
  create mode 100644 drivers/dma/xilinx/Makefile
  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
  create mode 100644 include/linux/amba/xilinx_dma.h
 
 diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
 b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 new file mode 100644
 index 000..ab8be1a
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 @@ -0,0 +1,75 @@
 +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
 +It can be configured to have one channel or two channels. If configured
 +as two channels, one is to transmit to the video device and another is
 +to receive from the video device.
 +
 +Required properties:
 +- compatible: Should be xlnx,axi-vdma-1.00.a
 +- #dma-cells: Should be 1, see dmas property below
 +- reg: Should contain VDMA registers location and length.
 +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
 h/w.
 +- dma-channel child node: Should have atleast one channel and can have upto
 + two channels per device. This node specifies the properties of each
 + DMA channel (see child node properties below).
 +
 +Optional properties:
 +- xlnx,include-sg: Tells whether configured for Scatter-mode in
 + the hardware.
 +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync.
 + It takes following values:
 + {1}, flush both channels
 + {2}, flush mm2s channel
 + {3}, flush s2mm channel
 +
 +Required child node properties:
 +- compatible: It should be either xlnx,axi-vdma-mm2s-channel or
 + xlnx,axi-vdma-s2mm-channel.
 +- interrupts: Should contain per channel VDMA interrupts.
 +- xlnx,data-width: Should contain the stream data width, take values
 + {32,64...1024}.
 +
 +Option child node properties:
 +- xlnx,include-dre: Tells whether hardware is configured for Data
 + Realignment Engine.
 +- xlnx,genlock-mode: Tells whether Genlock synchronization is
 + enabled/disabled in hardware.
 +
 +Example:
 +
 +
 +axi_vdma_0: axivdma@4003 {
 + compatible = xlnx,axi-vdma-1.00.a;
 + #dma_cells = 1;
 + reg =  0x4003 0x1 ;
 + xlnx,num-fstores = 0x8;
 + xlnx,flush-fsync = 0x1;
 + dma-channel@4003 {
 + compatible = xlnx,axi-vdma-mm2s-channel;
 + interrupts =  0 54 4 ;
 + xlnx,datawidth = 0x40;
 + } ;
 + dma-channel@40030030 {
 + compatible = xlnx,axi-vdma-s2mm-channel;
 + interrupts =  0 53 4 ;
 + xlnx,datawidth = 0x40;
 + } ;
 +} ;
 +
 +
 +* DMA client
 +
 +Required properties:
 +- dmas: a list of [Video DMA device phandle] [Channel ID] pairs,
 + where Channel ID is '0' for write/tx and '1' for read/rx
 + channel.
 +- dma-names: a list of DMA channel names, one per dmas entry
 +
 +Example:
 +
 +
 +vdmatest_0: vdmatest@0 {
 + compatible =xlnx,axi-vdma-test-1.00.a;
 + dmas = axi_vdma_0 0
 + axi_vdma_0 1;
 + dma-names = vdma0, vdma1;
 +} ;
 diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
 index c823daa..2a74651 100644
 --- a/drivers/dma/Kconfig
 +++ b/drivers/dma/Kconfig
 @@ -334,6 +334,20 @@ config K3_DMA
 Support the DMA engine for Hisilicon K3 platform
 devices.

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Shevchenko, Andriy

On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:

[...]

  +   /* Request the interrupt */
  +   chan-irq = irq_of_parse_and_map(node, 0);
  +   err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
  +  IRQF_SHARED, xilinx-vdma-controller, chan);
 
 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access already
 freed memory.

Could you elaborate this case? As far as I understood managed resources
are a kind of stack pile. In this case you have no such condition. Where
am I wrong?


-- 
Andy Shevchenko andriy.shevche...@intel.com
Intel Finland Oy
-
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
 On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
 
 [...]
 
 +   /* Request the interrupt */
 +   chan-irq = irq_of_parse_and_map(node, 0);
 +   err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
 +  IRQF_SHARED, xilinx-vdma-controller, chan);

 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access already
 freed memory.ta
 
 Could you elaborate this case? As far as I understood managed resources
 are a kind of stack pile. In this case you have no such condition. Where
 am I wrong?

The stacked stuff is only ran after the remove() function. Which means that
you call dma_async_device_unregister() before the interrupt handler is
freed. Another issue with the interrupt handler is a bit hidden. The driver
does not call tasklet_kill in the remove function. Which it should though to
make sure that the tasklet does not race against the freeing of the memory.
And in order to make sure that the tasklet is not rescheduled you need to
free the irq before killing the tasklet, since the interrupt handler
schedules the tasklet.

- Lars

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Andy Shevchenko

On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
  On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
  On 01/22/2014 05:52 PM, Srikanth Thokala wrote:
  
  [...]
  
  + /* Request the interrupt */
  + chan-irq = irq_of_parse_and_map(node, 0);
  + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
  +IRQF_SHARED, xilinx-vdma-controller, chan);
 
  This is a clasic example of where to not use devm_request_irq. 'chan' is
  accessed in the interrupt handler, but if you use devm_request_irq 'chan'
  will be freed before the interrupt handler has been released, which means
  there is now a race condition where the interrupt handler can access 
  already
  freed memory.ta
  
  Could you elaborate this case? As far as I understood managed resources
  are a kind of stack pile. In this case you have no such condition. Where
  am I wrong?
 
 The stacked stuff is only ran after the remove() function. Which means that
 you call dma_async_device_unregister() before the interrupt handler is
 freed. Another issue with the interrupt handler is a bit hidden. The driver
 does not call tasklet_kill in the remove function. Which it should though to
 make sure that the tasklet does not race against the freeing of the memory.
 And in order to make sure that the tasklet is not rescheduled you need to
 free the irq before killing the tasklet, since the interrupt handler
 schedules the tasklet.

So, you mean devm_request_irq() will race in any DMA driver?

I think the proper solution is to disable all device work in
the .remove() and devm will care about resources.
 majordomo-info.html

-- 
Andy Shevchenko andriy.shevche...@linux.intel.com
Intel Finland Oy

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Lars-Peter Clausen

On 01/23/2014 03:00 PM, Andy Shevchenko wrote:
 On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote:
 On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote:
 On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote:
 On 01/22/2014 05:52 PM, Srikanth Thokala wrote:

 [...]

 + /* Request the interrupt */
 + chan-irq = irq_of_parse_and_map(node, 0);
 + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler,
 +IRQF_SHARED, xilinx-vdma-controller, chan);

 This is a clasic example of where to not use devm_request_irq. 'chan' is
 accessed in the interrupt handler, but if you use devm_request_irq 'chan'
 will be freed before the interrupt handler has been released, which means
 there is now a race condition where the interrupt handler can access 
 already
 freed memory.ta

 Could you elaborate this case? As far as I understood managed resources
 are a kind of stack pile. In this case you have no such condition. Where
 am I wrong?

 The stacked stuff is only ran after the remove() function. Which means that
 you call dma_async_device_unregister() before the interrupt handler is
 freed. Another issue with the interrupt handler is a bit hidden. The driver
 does not call tasklet_kill in the remove function. Which it should though to
 make sure that the tasklet does not race against the freeing of the memory.
 And in order to make sure that the tasklet is not rescheduled you need to
 free the irq before killing the tasklet, since the interrupt handler
 schedules the tasklet.
 
 So, you mean devm_request_irq() will race in any DMA driver?

Most likely yes. devm_request_irq() is race condition prone for the majority
of device driver. You have to be really careful if you want to use it.

 
 I think the proper solution is to disable all device work in
 the .remove() and devm will care about resources.

As long as the interrupt handler is registered it can be called, the only
proper solution is to make sure that the order in which resources are torn
down is correct.

- Lars
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-23 Thread Srikanth Thokala

Hi Levente,

On Thu, Jan 23, 2014 at 3:00 AM, Levente Kurusa le...@linux.com wrote:
 Hello,

 2014/1/22 Srikanth Thokala stho...@xilinx.com:
 This is the driver for the AXI Video Direct Memory Access (AXI
 VDMA) core, which is a soft Xilinx IP core that provides high-
 bandwidth direct memory access between memory and AXI4-Stream
 type video target peripherals. The core provides efficient two
 dimensional DMA operations with independent asynchronous read
 and write channel operation.

 This module works on Zynq (ARM Based SoC) and Microblaze platforms.

 Signed-off-by: Srikanth Thokala stho...@xilinx.com
 ---

 Another two remarks, after you fixed them ( or not :-) )
 you can have my:

 Reviewed-by: Levente Kurusa le...@linux.com

 Oh, and next time please if you post a patch that fixes something I pointed 
 out,
 CC me as I had a hard time finding this patch, thanks. :-)

Sure. Thanks


 NOTE:
 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
DMA IPs and we are also planning to upstream these drivers soon.
 2. Rebased on v3.13.0-rc8

 Changes in v2:
 - Removed DMA Test client module from the patchset as suggested
   by Andy Shevchenko
 - Removed device-id DT property, as suggested by Arnd Bergmann
 - Properly documented DT bindings as suggested by Arnd Bergmann
 - Returning with error, if registration of DMA to node fails
 - Fixed typo errors
 - Used BIT() macro at applicable places
 - Added missing header file to the patchset
 - Changed copyright year to include 2014
 ---
  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
  drivers/dma/Kconfig|   14 +
  drivers/dma/Makefile   |1 +
  drivers/dma/xilinx/Makefile|1 +
  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
 
  include/linux/amba/xilinx_dma.h|   50 +
  6 files changed, 1627 insertions(+)
  create mode 100644 
 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
  create mode 100644 drivers/dma/xilinx/Makefile
  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
  create mode 100644 include/linux/amba/xilinx_dma.h

 diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
 b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 new file mode 100644
 index 000..ab8be1a
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 @@ -0,0 +1,75 @@
 +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
 +It can be configured to have one channel or two channels. If configured
 +as two channels, one is to transmit to the video device and another is
 +to receive from the video device.
 +
 +Required properties:
 +- compatible: Should be xlnx,axi-vdma-1.00.a
 +- #dma-cells: Should be 1, see dmas property below
 +- reg: Should contain VDMA registers location and length.
 +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
 h/w.
 +- dma-channel child node: Should have atleast one channel and can have upto
 +   two channels per device. This node specifies the properties of each
 +   DMA channel (see child node properties below).
 +
 +Optional properties:
 +- xlnx,include-sg: Tells whether configured for Scatter-mode in
 +   the hardware.
 [...]
 +
 +/**
 + * xilinx_vdma_is_running - Check if VDMA channel is running
 + * @chan: Driver specific VDMA channel
 + *
 + * Return: '1' if running, '0' if not.
 + */
 +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
 +{
 +   return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) 
 +XILINX_VDMA_DMASR_HALTED) 
 +   (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) 
 +XILINX_VDMA_DMACR_RUNSTOP);
 +}
 +

[...]

 +   /* Retrieve the channel properties from the device tree */
 +   has_dre = of_property_read_bool(node, xlnx,include-dre);
 +
 +   chan-genlock = of_property_read_bool(node, xlnx,genlock-mode);
 +
 +   err = of_property_read_u32(node, xlnx,datawidth, value);
 +   if (!err) {
 +   u32 width = value  3; /* Convert bits to bytes */
 +
 +   /* If data width is greater than 8 bytes, DRE is not in hw */
 +   if (width  8)
 +   has_dre = false;
 +
 +   if (!has_dre)
 +   xdev-common.copy_align = fls(width - 1);
 +   } else {
 +   dev_err(xdev-dev, missing xlnx,datawidth property\n);
 +   return err;
 +   }

 Can you please convert this to:
 if (err) {
  dev_err(...);
  return err;
 }

 That way we can avoid the else clause.

Ok. I will fix it in v3.

 +
 +   if (of_device_is_compatible(node, xlnx,axi-vdma-mm2s-channel)) {
 +   chan-direction = DMA_MEM_TO_DEV;
 +   chan-id = 0;
 +
 +   chan-ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET;
 +   chan-desc_offset =

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-22 Thread Levente Kurusa

Hello,

2014/1/22 Srikanth Thokala :
> This is the driver for the AXI Video Direct Memory Access (AXI
> VDMA) core, which is a soft Xilinx IP core that provides high-
> bandwidth direct memory access between memory and AXI4-Stream
> type video target peripherals. The core provides efficient two
> dimensional DMA operations with independent asynchronous read
> and write channel operation.
>
> This module works on Zynq (ARM Based SoC) and Microblaze platforms.
>
> Signed-off-by: Srikanth Thokala 
> ---

Another two remarks, after you fixed them ( or not :-) )
you can have my:

Reviewed-by: Levente Kurusa 

Oh, and next time please if you post a patch that fixes something I pointed out,
CC me as I had a hard time finding this patch, thanks. :-)

> NOTE:
> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
>DMA IPs and we are also planning to upstream these drivers soon.
> 2. Rebased on v3.13.0-rc8
>
> Changes in v2:
> - Removed DMA Test client module from the patchset as suggested
>   by Andy Shevchenko
> - Removed device-id DT property, as suggested by Arnd Bergmann
> - Properly documented DT bindings as suggested by Arnd Bergmann
> - Returning with error, if registration of DMA to node fails
> - Fixed typo errors
> - Used BIT() macro at applicable places
> - Added missing header file to the patchset
> - Changed copyright year to include 2014
> ---
>  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
>  drivers/dma/Kconfig|   14 +
>  drivers/dma/Makefile   |1 +
>  drivers/dma/xilinx/Makefile|1 +
>  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
> 
>  include/linux/amba/xilinx_dma.h|   50 +
>  6 files changed, 1627 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
>  create mode 100644 drivers/dma/xilinx/Makefile
>  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
>  create mode 100644 include/linux/amba/xilinx_dma.h
>
> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
> b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> new file mode 100644
> index 000..ab8be1a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
> @@ -0,0 +1,75 @@
> +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
> +It can be configured to have one channel or two channels. If configured
> +as two channels, one is to transmit to the video device and another is
> +to receive from the video device.
> +
> +Required properties:
> +- compatible: Should be "xlnx,axi-vdma-1.00.a"
> +- #dma-cells: Should be <1>, see "dmas" property below
> +- reg: Should contain VDMA registers location and length.
> +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
> h/w.
> +- dma-channel child node: Should have atleast one channel and can have upto
> +   two channels per device. This node specifies the properties of each
> +   DMA channel (see child node properties below).
> +
> +Optional properties:
> +- xlnx,include-sg: Tells whether configured for Scatter-mode in
> +   the hardware.
> [...]
> +
> +/**
> + * xilinx_vdma_is_running - Check if VDMA channel is running
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if running, '0' if not.
> + */
> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
> +{
> +   return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +XILINX_VDMA_DMASR_HALTED) &&
> +   (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) &
> +XILINX_VDMA_DMACR_RUNSTOP);
> +}
> +
> +/**
> + * xilinx_vdma_is_idle - Check if VDMA channel is idle
> + * @chan: Driver specific VDMA channel
> + *
> + * Return: '1' if idle, '0' if not.
> + */
> +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
> +{
> +   return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +   XILINX_VDMA_DMASR_IDLE;
> +}
> +
> +/**
> + * xilinx_vdma_halt - Halt VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
> +{
> +   int loop = XILINX_VDMA_LOOP_COUNT + 1;
> +
> +   vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
> +
> +   /* Wait for the hardware to halt */
> +   while (loop--)
> +   if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) &
> +   XILINX_VDMA_DMASR_HALTED)
> +   break;
> +
> +   if (!loop) {
> +   dev_err(chan->dev, "Cannot stop channel %p: %x\n",
> +   chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
> +   chan->err = true;
> +   }
> +
> +   return;
> +}
> +
> +/**
> + * xilinx_vdma_start - Start VDMA channel
> + * @chan: Driver specific VDMA channel
> + */
> +static void xilinx_vdma_start(struct

Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support

2014-01-22 Thread Levente Kurusa

Hello,

2014/1/22 Srikanth Thokala stho...@xilinx.com:
 This is the driver for the AXI Video Direct Memory Access (AXI
 VDMA) core, which is a soft Xilinx IP core that provides high-
 bandwidth direct memory access between memory and AXI4-Stream
 type video target peripherals. The core provides efficient two
 dimensional DMA operations with independent asynchronous read
 and write channel operation.

 This module works on Zynq (ARM Based SoC) and Microblaze platforms.

 Signed-off-by: Srikanth Thokala stho...@xilinx.com
 ---

Another two remarks, after you fixed them ( or not :-) )
you can have my:

Reviewed-by: Levente Kurusa le...@linux.com

Oh, and next time please if you post a patch that fixes something I pointed out,
CC me as I had a hard time finding this patch, thanks. :-)

 NOTE:
 1. Created a separate directory 'dma/xilinx' as Xilinx has two more
DMA IPs and we are also planning to upstream these drivers soon.
 2. Rebased on v3.13.0-rc8

 Changes in v2:
 - Removed DMA Test client module from the patchset as suggested
   by Andy Shevchenko
 - Removed device-id DT property, as suggested by Arnd Bergmann
 - Properly documented DT bindings as suggested by Arnd Bergmann
 - Returning with error, if registration of DMA to node fails
 - Fixed typo errors
 - Used BIT() macro at applicable places
 - Added missing header file to the patchset
 - Changed copyright year to include 2014
 ---
  .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt |   75 +
  drivers/dma/Kconfig|   14 +
  drivers/dma/Makefile   |1 +
  drivers/dma/xilinx/Makefile|1 +
  drivers/dma/xilinx/xilinx_vdma.c   | 1486 
 
  include/linux/amba/xilinx_dma.h|   50 +
  6 files changed, 1627 insertions(+)
  create mode 100644 
 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
  create mode 100644 drivers/dma/xilinx/Makefile
  create mode 100644 drivers/dma/xilinx/xilinx_vdma.c
  create mode 100644 include/linux/amba/xilinx_dma.h

 diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt 
 b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 new file mode 100644
 index 000..ab8be1a
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt
 @@ -0,0 +1,75 @@
 +Xilinx AXI VDMA engine, it does transfers between memory and video devices.
 +It can be configured to have one channel or two channels. If configured
 +as two channels, one is to transmit to the video device and another is
 +to receive from the video device.
 +
 +Required properties:
 +- compatible: Should be xlnx,axi-vdma-1.00.a
 +- #dma-cells: Should be 1, see dmas property below
 +- reg: Should contain VDMA registers location and length.
 +- xlnx,num-fstores: Should be the number of framebuffers as configured in 
 h/w.
 +- dma-channel child node: Should have atleast one channel and can have upto
 +   two channels per device. This node specifies the properties of each
 +   DMA channel (see child node properties below).
 +
 +Optional properties:
 +- xlnx,include-sg: Tells whether configured for Scatter-mode in
 +   the hardware.
 [...]
 +
 +/**
 + * xilinx_vdma_is_running - Check if VDMA channel is running
 + * @chan: Driver specific VDMA channel
 + *
 + * Return: '1' if running, '0' if not.
 + */
 +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan)
 +{
 +   return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) 
 +XILINX_VDMA_DMASR_HALTED) 
 +   (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) 
 +XILINX_VDMA_DMACR_RUNSTOP);
 +}
 +
 +/**
 + * xilinx_vdma_is_idle - Check if VDMA channel is idle
 + * @chan: Driver specific VDMA channel
 + *
 + * Return: '1' if idle, '0' if not.
 + */
 +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan)
 +{
 +   return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) 
 +   XILINX_VDMA_DMASR_IDLE;
 +}
 +
 +/**
 + * xilinx_vdma_halt - Halt VDMA channel
 + * @chan: Driver specific VDMA channel
 + */
 +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan)
 +{
 +   int loop = XILINX_VDMA_LOOP_COUNT + 1;
 +
 +   vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP);
 +
 +   /* Wait for the hardware to halt */
 +   while (loop--)
 +   if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) 
 +   XILINX_VDMA_DMASR_HALTED)
 +   break;
 +
 +   if (!loop) {
 +   dev_err(chan-dev, Cannot stop channel %p: %x\n,
 +   chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR));
 +   chan-err = true;
 +   }
 +
 +   return;
 +}
 +
 +/**
 + * xilinx_vdma_start - Start VDMA channel
 + * @chan: Driver specific VDMA channel
 + */
 +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan)
 +{
 +   int loop = XILINX_VDMA_LOOP_COUNT + 1;
 +
 +

62 matches

Mail list logo