Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, Feb 6, 2014 at 9:23 PM, Lars-Peter Clausen wrote: > On 02/06/2014 02:34 PM, Srikanth Thokala wrote: >> >> On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen >> wrote: >>> >>> On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala wrote: > > > Hi Vinod, > > On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul > wrote: >> >> >> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: >>> >>> >>> Hi Lars/Vinod, > > > The question here i think would be waht this device supports? Is > the > hardware > capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt->num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. >>> >>> >>> >>> I went through the prep_interleaved_dma API and I see only one >>> descriptor >>> is prepared per API call (i.e. per frame). As our IP supports upto >>> 16 >>> frame >>> buffers (can be more in future), isn't it less efficient compared to >>> the >>> prep_slave_sg where we get a single sg list and can prepare all the >>> descriptors >>> (of non-contiguous buffers) in one go? Correct me, if am wrong and >>> let >>> me >>> know your opinions. >> >> >> Well the descriptor maybe one, but that can represent multiple frames, >> for >> example 16 as in your case. Can you read up the documentation of how >> multiple >> frames are passed. Pls see include/linux/dmaengine.h >> >> /** >>* Interleaved Transfer Request >>* >>* A chunk is collection of contiguous bytes to be transfered. >>* The gap(in bytes) between two chunks is called >> inter-chunk-gap(ICG). >>* ICGs may or maynot change between chunks. >>* A FRAME is the smallest series of contiguous {chunk,icg} pairs, >>* that when repeated an integral number of times, specifies the >> transfer. >>* A transfer template is specification of a Frame, the number of >> times >>* it is to be repeated and other per-transfer attributes. >>* >>* Practically, a client driver would have ready a template for each >>* type of transfer it is going to need during its lifetime and >>* set only 'src_start' and 'dst_start' before submitting the >> requests. >>* >>* >>* | Frame-1| Frame-2 | ~ | >> Frame-'numf' | >>* |==.===...=...|==.===...=...| ~ >> |==.===...=...| >>* >>*== Chunk size >>*... ICG >>*/ > > > > Yes, it can handle multiple frames specified by 'numf' each of size > 'frame_size * sgl[0].size'. > But, I see it only works if all the frames' memory is contiguous and > in this case we > can just increment 'src_start' by the total frame size 'numf' number > of times to fill in > for each HW descriptor (each frame is one HW descriptor). So, there > is no issue when the > memory is contiguous. If the frames are non contiguous, we have to > call this API for each > frame (hence for each descriptor), as the src_start for each frame is > different. Is it correct? > > FYI: This hardware has an inbuilt Scatter-Gather engine. > Ping? >>> >>> >>> >>> If you want to submit multiple frames at once I think you should look at >>> how >>> the current dmaengine API can be extended to allow that. And also provide >>> an >>> explanation on how this is superior over submitting them one by one. >> >> >> >> Sure. I would start with explaning the current implementation of this >> driver. >> >> Using prep_slave_sg(), we can define multiple segments in a >> async_tx_descriptor where each frame is defined by a segment (a sg >> list entry). So, the slave device could DMA the data (of multiple >> frames) with a descriptor by calling tx_submit in a transaction i.e., >> >> prep_slave_sg(16) -> tx_submit(1) -> interrupt (16 frames) >> >> Using interleaved_dma(), we could not divide into segments when we >> have scattered memory (for the reasons mentioned in above thread). >> This implies we are restricting the slave device to process frame by >> frame i.e., >> >> interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2) >> -> tx_submit (2) -> interrupt ->
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, Feb 6, 2014 at 9:23 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 02/06/2014 02:34 PM, Srikanth Thokala wrote: On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. Sure. I would start with explaning the current implementation of this driver. Using prep_slave_sg(), we can define multiple segments in a async_tx_descriptor where each frame is defined by a segment (a sg list entry). So, the slave device could DMA the data (of multiple frames) with a descriptor by calling tx_submit in a transaction i.e., prep_slave_sg(16) - tx_submit(1) - interrupt (16 frames) Using interleaved_dma(), we could not divide into segments when we have scattered memory (for the reasons mentioned in above thread). This implies we are restricting the slave device to process frame by frame i.e., interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2) - tx_submit (2) - interrupt - tx_submit(16) - interrupt The API allows you to create and submit multiple interleaved descriptors before you have to issue them. interleaved_dma(1) - tx_submit(1) - interleaved_dma(2) - tx_submit(2) - ... - issue_pending() - interrupt This implementation makes the hardware to wait until the next frame is submitted. To overcome this, I feel it would be a good option if we could extend interleaved_dma template to modify src_start/dest_start to be a pointer to an array of addresses. Here, number of addresses will be defined by numf. The other option would be to include scatterlist in the interleaved template. This way we can handle scattered memory using
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 02/06/2014 02:34 PM, Srikanth Thokala wrote: On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen wrote: On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt->num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. Sure. I would start with explaning the current implementation of this driver. Using prep_slave_sg(), we can define multiple segments in a async_tx_descriptor where each frame is defined by a segment (a sg list entry). So, the slave device could DMA the data (of multiple frames) with a descriptor by calling tx_submit in a transaction i.e., prep_slave_sg(16) -> tx_submit(1) -> interrupt (16 frames) Using interleaved_dma(), we could not divide into segments when we have scattered memory (for the reasons mentioned in above thread). This implies we are restricting the slave device to process frame by frame i.e., interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2) -> tx_submit (2) -> interrupt -> tx_submit(16) -> interrupt The API allows you to create and submit multiple interleaved descriptors before you have to issue them. interleaved_dma(1) -> tx_submit(1) -> interleaved_dma(2) -> tx_submit(2) -> ... -> issue_pending() -> interrupt This implementation makes the hardware to wait until the next frame is submitted. To overcome this, I feel it would be a good option if we could extend interleaved_dma template to modify src_start/dest_start to be a pointer to an array of addresses. Here, number of addresses will be defined by numf. The other option would be to include scatterlist in the interleaved template. This way we can handle scattered memory using this API. Each "frame" in a interleaved transfer describes a single line in your video frame (size = width, icg = stride). numf is the number of lines per video frame. So the suggested change does not make that much
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen wrote: > On 02/05/2014 05:25 PM, Srikanth Thokala wrote: >> >> On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala >> wrote: >>> >>> Hi Vinod, >>> >>> On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: > > Hi Lars/Vinod, >>> >>> The question here i think would be waht this device supports? Is the >>> hardware >>> capable of doing interleaved transfers, then would make sense. >> >> >> The hardware does 2D transfers. The parameters for a transfer are >> height, >> width and stride. That's only a subset of what interleaved transfers >> can be >> (xt->num_frames must be one for 2d transfers). But if I remember >> correctly >> there has been some discussion on this in the past and the result of >> that >> discussion was that using interleaved transfers for 2D transfers is >> preferred over adding a custom API for 2D transfers. > > > I went through the prep_interleaved_dma API and I see only one > descriptor > is prepared per API call (i.e. per frame). As our IP supports upto 16 > frame > buffers (can be more in future), isn't it less efficient compared to > the > prep_slave_sg where we get a single sg list and can prepare all the > descriptors > (of non-contiguous buffers) in one go? Correct me, if am wrong and let > me > know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ >>> >>> >>> Yes, it can handle multiple frames specified by 'numf' each of size >>> 'frame_size * sgl[0].size'. >>> But, I see it only works if all the frames' memory is contiguous and >>> in this case we >>> can just increment 'src_start' by the total frame size 'numf' number >>> of times to fill in >>> for each HW descriptor (each frame is one HW descriptor). So, there >>> is no issue when the >>> memory is contiguous. If the frames are non contiguous, we have to >>> call this API for each >>> frame (hence for each descriptor), as the src_start for each frame is >>> different. Is it correct? >>> >>> FYI: This hardware has an inbuilt Scatter-Gather engine. >>> >> >> Ping? > > > If you want to submit multiple frames at once I think you should look at how > the current dmaengine API can be extended to allow that. And also provide an > explanation on how this is superior over submitting them one by one. Sure. I would start with explaning the current implementation of this driver. Using prep_slave_sg(), we can define multiple segments in a async_tx_descriptor where each frame is defined by a segment (a sg list entry). So, the slave device could DMA the data (of multiple frames) with a descriptor by calling tx_submit in a transaction i.e., prep_slave_sg(16) -> tx_submit(1) -> interrupt (16 frames) Using interleaved_dma(), we could not divide into segments when we have scattered memory (for the reasons mentioned in above thread). This implies we are restricting the slave device to process frame by frame i.e., interleaved_dma(1) -> tx_submit(1) -> interrupt -> interleaved_dma(2) -> tx_submit (2) -> interrupt -> tx_submit(16) -> interrupt This implementation makes the hardware to wait until the next frame is submitted. To overcome this, I feel it would be a good option if we could extend interleaved_dma template to modify src_start/dest_start to be a pointer to an array of addresses. Here, number of addresses will be defined by numf. The other option would be to include scatterlist in the interleaved template. This way we can handle scattered memory using this API. Srikanth > > - Lars > > > -- > To
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. Sure. I would start with explaning the current implementation of this driver. Using prep_slave_sg(), we can define multiple segments in a async_tx_descriptor where each frame is defined by a segment (a sg list entry). So, the slave device could DMA the data (of multiple frames) with a descriptor by calling tx_submit in a transaction i.e., prep_slave_sg(16) - tx_submit(1) - interrupt (16 frames) Using interleaved_dma(), we could not divide into segments when we have scattered memory (for the reasons mentioned in above thread). This implies we are restricting the slave device to process frame by frame i.e., interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2) - tx_submit (2) - interrupt - tx_submit(16) - interrupt This implementation makes the hardware to wait until the next frame is submitted. To overcome this, I feel it would be a good option if we could extend interleaved_dma template to modify src_start/dest_start to be a pointer to an array of addresses. Here, number of addresses will be defined by numf. The other option would be to include scatterlist in the interleaved template. This way we can handle scattered memory using this API. Srikanth - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 02/06/2014 02:34 PM, Srikanth Thokala wrote: On Wed, Feb 5, 2014 at 10:00 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. Sure. I would start with explaning the current implementation of this driver. Using prep_slave_sg(), we can define multiple segments in a async_tx_descriptor where each frame is defined by a segment (a sg list entry). So, the slave device could DMA the data (of multiple frames) with a descriptor by calling tx_submit in a transaction i.e., prep_slave_sg(16) - tx_submit(1) - interrupt (16 frames) Using interleaved_dma(), we could not divide into segments when we have scattered memory (for the reasons mentioned in above thread). This implies we are restricting the slave device to process frame by frame i.e., interleaved_dma(1) - tx_submit(1) - interrupt - interleaved_dma(2) - tx_submit (2) - interrupt - tx_submit(16) - interrupt The API allows you to create and submit multiple interleaved descriptors before you have to issue them. interleaved_dma(1) - tx_submit(1) - interleaved_dma(2) - tx_submit(2) - ... - issue_pending() - interrupt This implementation makes the hardware to wait until the next frame is submitted. To overcome this, I feel it would be a good option if we could extend interleaved_dma template to modify src_start/dest_start to be a pointer to an array of addresses. Here, number of addresses will be defined by numf. The other option would be to include scatterlist in the interleaved template. This way we can handle scattered memory using this API. Each frame in a interleaved transfer describes a single line in your video frame (size = width, icg = stride). numf is the number of lines per video frame. So the
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt->num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala wrote: > Hi Vinod, > > On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul wrote: >> On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: >>> Hi Lars/Vinod, >>> >> The question here i think would be waht this device supports? Is the >>> >> hardware >>> >> capable of doing interleaved transfers, then would make sense. >>> > >>> > The hardware does 2D transfers. The parameters for a transfer are height, >>> > width and stride. That's only a subset of what interleaved transfers can >>> > be >>> > (xt->num_frames must be one for 2d transfers). But if I remember correctly >>> > there has been some discussion on this in the past and the result of that >>> > discussion was that using interleaved transfers for 2D transfers is >>> > preferred over adding a custom API for 2D transfers. >>> >>> I went through the prep_interleaved_dma API and I see only one descriptor >>> is prepared per API call (i.e. per frame). As our IP supports upto 16 frame >>> buffers (can be more in future), isn't it less efficient compared to the >>> prep_slave_sg where we get a single sg list and can prepare all the >>> descriptors >>> (of non-contiguous buffers) in one go? Correct me, if am wrong and let me >>> know your opinions. >> Well the descriptor maybe one, but that can represent multiple frames, for >> example 16 as in your case. Can you read up the documentation of how multiple >> frames are passed. Pls see include/linux/dmaengine.h >> >> /** >> * Interleaved Transfer Request >> * >> * A chunk is collection of contiguous bytes to be transfered. >> * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). >> * ICGs may or maynot change between chunks. >> * A FRAME is the smallest series of contiguous {chunk,icg} pairs, >> * that when repeated an integral number of times, specifies the transfer. >> * A transfer template is specification of a Frame, the number of times >> * it is to be repeated and other per-transfer attributes. >> * >> * Practically, a client driver would have ready a template for each >> * type of transfer it is going to need during its lifetime and >> * set only 'src_start' and 'dst_start' before submitting the requests. >> * >> * >> * | Frame-1| Frame-2 | ~ | Frame-'numf' | >> * |==.===...=...|==.===...=...| ~ |==.===...=...| >> * >> *== Chunk size >> *... ICG >> */ > > Yes, it can handle multiple frames specified by 'numf' each of size > 'frame_size * sgl[0].size'. > But, I see it only works if all the frames' memory is contiguous and > in this case we > can just increment 'src_start' by the total frame size 'numf' number > of times to fill in > for each HW descriptor (each frame is one HW descriptor). So, there > is no issue when the > memory is contiguous. If the frames are non contiguous, we have to > call this API for each > frame (hence for each descriptor), as the src_start for each frame is > different. Is it correct? > > FYI: This hardware has an inbuilt Scatter-Gather engine. > Ping? > Srikanth > >> >> -- >> ~Vinod >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? Srikanth -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 02/05/2014 05:25 PM, Srikanth Thokala wrote: On Fri, Jan 31, 2014 at 12:21 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Ping? If you want to submit multiple frames at once I think you should look at how the current dmaengine API can be extended to allow that. And also provide an explanation on how this is superior over submitting them one by one. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Tue, Feb 4, 2014 at 10:58 AM, Vinod Koul wrote: > On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote: >> >>> >> [...] >> >>> >>> +/** >> >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device >> >>> >>> + * @dchan: DMA Channel pointer >> >>> >>> + * @cmd: DMA control command >> >>> >>> + * @arg: Channel configuration >> >>> >>> + * >> >>> >>> + * Return: '0' on success and failure value on error >> >>> >>> + */ >> >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >> >>> >>> + enum dma_ctrl_cmd cmd, unsigned >> >>> >>> long arg) >> >>> >>> +{ >> >>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >> >>> >>> + >> >>> >>> + switch (cmd) { >> >>> >>> + case DMA_TERMINATE_ALL: >> >>> >>> + xilinx_vdma_terminate_all(chan); >> >>> >>> + return 0; >> >>> >>> + case DMA_SLAVE_CONFIG: >> >>> >>> + return xilinx_vdma_slave_config(chan, >> >>> >>> + (struct xilinx_vdma_config >> >>> >>> *)arg); >> >>> >> >> >>> >> You really shouldn't be overloading the generic API with your own >> >>> >> semantics. >> >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. >> >>> > >> >>> > Ok. The driver needs few additional configuration from the slave >> >>> > device like Vertical >> >>> > Size, Horizontal Size, Stride etc., for the DMA transfers, in that >> >>> > case do you >> >>> > suggest me to define a separate dma_ctrl_cmd like the one >> >>> > FSLDMA_EXTERNAL_START >> >>> > defined for Freescale drivers? >> >>> >> >>> In my opinion it is not a good idea to have driver implement a generic >> >>> API, >> >>> but at the same time let the driver have custom semantics for those API >> >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the >> >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats >> >>> the purpose of a generic API, namely that you are able to write generic >> >>> code >> >>> that makes use of the API without having to know about which >> >>> implementation >> >>> API it is talking to. The dmaengine framework provides the >> >>> dmaengine_prep_interleaved_dma() function to setup two dimensional >> >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. >> >> >> >> The question here i think would be waht this device supports? Is the >> >> hardware >> >> capable of doing interleaved transfers, then would make sense. >> >> >> >> While we do try to get users use dma_slave_config, but there will always >> >> be >> >> someone who have specfic params. If we can generalize then we might want >> >> to add >> >> to the dma_slave_config as well >> > >> > There are many configuration parameters which are specific to IP and I >> > would like to >> > give an overview of some of parameteres here: >> > >> > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame >> > referenced by >> > 'cfg->park_frm', so user will have control on each frame in this mode. >> > >> > 2) Interrupt Coalesce ('cfg->coalesce'): Used for setting interrupt >> > threshold. This value >> >determines the number of frame buffers to process. To use this feature, >> >'cfg->frm_cnt_en' should be set. >> > >> > 3) Frame Synchronization Source ('cfg->ext_fsync'): Can be an >> > external/internal frame >> > synchronization source. Used to synchronize one channel (MM2S/S2MM) >> > with >> > another (S2MM/MM2S) channel. >> > >> > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate >> > between >> > master and slave. In master mode (cfg->master), frames are not >> > dropped and >> > slave can drop frames to adjust to master frame rate. >> > >> > And in future, this Engine being a soft IP, we could expect some more >> > additional >> > parameters. Isn't a good idea to have a private member in >> > dma_slave_config for >> > sharing additional configuration between slave device and dma engine? Or a >> > new >> > dma_ctrl_cmd like FSLDMA_EXTERNAL_START? > > The idea of a generic API is that we can use it for most of the controllers. > Even > if you are planning to support a family of controllers > > ATM, lets not discuss the possiblity of private member and try to exhanust all > possible options. Worst case you can embed the dma_slave_config in > xilinx_dma_slave_config and retrieve it in dmac driver Ok. Srikanth > > -- > ~Vinod > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Tue, Feb 4, 2014 at 10:58 AM, Vinod Koul vinod.k...@intel.com wrote: On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well There are many configuration parameters which are specific to IP and I would like to give an overview of some of parameteres here: 1) Park Mode ('cfg-park'): In Park mode, engine will park on frame referenced by 'cfg-park_frm', so user will have control on each frame in this mode. 2) Interrupt Coalesce ('cfg-coalesce'): Used for setting interrupt threshold. This value determines the number of frame buffers to process. To use this feature, 'cfg-frm_cnt_en' should be set. 3) Frame Synchronization Source ('cfg-ext_fsync'): Can be an external/internal frame synchronization source. Used to synchronize one channel (MM2S/S2MM) with another (S2MM/MM2S) channel. 4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate between master and slave. In master mode (cfg-master), frames are not dropped and slave can drop frames to adjust to master frame rate. And in future, this Engine being a soft IP, we could expect some more additional parameters. Isn't a good idea to have a private member in dma_slave_config for sharing additional configuration between slave device and dma engine? Or a new dma_ctrl_cmd like FSLDMA_EXTERNAL_START? The idea of a generic API is that we can use it for most of the controllers. Even if you are planning to support a family of controllers ATM, lets not discuss the possiblity of private member and try to exhanust all possible options. Worst case you can embed the dma_slave_config in xilinx_dma_slave_config and retrieve it in dmac driver Ok. Srikanth -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote: > >>> >> [...] > >>> >>> +/** > >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device > >>> >>> + * @dchan: DMA Channel pointer > >>> >>> + * @cmd: DMA control command > >>> >>> + * @arg: Channel configuration > >>> >>> + * > >>> >>> + * Return: '0' on success and failure value on error > >>> >>> + */ > >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, > >>> >>> + enum dma_ctrl_cmd cmd, unsigned > >>> >>> long arg) > >>> >>> +{ > >>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > >>> >>> + > >>> >>> + switch (cmd) { > >>> >>> + case DMA_TERMINATE_ALL: > >>> >>> + xilinx_vdma_terminate_all(chan); > >>> >>> + return 0; > >>> >>> + case DMA_SLAVE_CONFIG: > >>> >>> + return xilinx_vdma_slave_config(chan, > >>> >>> + (struct xilinx_vdma_config > >>> >>> *)arg); > >>> >> > >>> >> You really shouldn't be overloading the generic API with your own > >>> >> semantics. > >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > >>> > > >>> > Ok. The driver needs few additional configuration from the slave > >>> > device like Vertical > >>> > Size, Horizontal Size, Stride etc., for the DMA transfers, in that > >>> > case do you > >>> > suggest me to define a separate dma_ctrl_cmd like the one > >>> > FSLDMA_EXTERNAL_START > >>> > defined for Freescale drivers? > >>> > >>> In my opinion it is not a good idea to have driver implement a generic > >>> API, > >>> but at the same time let the driver have custom semantics for those API > >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the > >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats > >>> the purpose of a generic API, namely that you are able to write generic > >>> code > >>> that makes use of the API without having to know about which > >>> implementation > >>> API it is talking to. The dmaengine framework provides the > >>> dmaengine_prep_interleaved_dma() function to setup two dimensional > >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. > >> > >> The question here i think would be waht this device supports? Is the > >> hardware > >> capable of doing interleaved transfers, then would make sense. > >> > >> While we do try to get users use dma_slave_config, but there will always be > >> someone who have specfic params. If we can generalize then we might want > >> to add > >> to the dma_slave_config as well > > > > There are many configuration parameters which are specific to IP and I > > would like to > > give an overview of some of parameteres here: > > > > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame > > referenced by > > 'cfg->park_frm', so user will have control on each frame in this mode. > > > > 2) Interrupt Coalesce ('cfg->coalesce'): Used for setting interrupt > > threshold. This value > >determines the number of frame buffers to process. To use this feature, > >'cfg->frm_cnt_en' should be set. > > > > 3) Frame Synchronization Source ('cfg->ext_fsync'): Can be an > > external/internal frame > > synchronization source. Used to synchronize one channel (MM2S/S2MM) with > > another (S2MM/MM2S) channel. > > > > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate > > between > > master and slave. In master mode (cfg->master), frames are not dropped > > and > > slave can drop frames to adjust to master frame rate. > > > > And in future, this Engine being a soft IP, we could expect some more > > additional > > parameters. Isn't a good idea to have a private member in dma_slave_config > > for > > sharing additional configuration between slave device and dma engine? Or a > > new > > dma_ctrl_cmd like FSLDMA_EXTERNAL_START? The idea of a generic API is that we can use it for most of the controllers. Even if you are planning to support a family of controllers ATM, lets not discuss the possiblity of private member and try to exhanust all possible options. Worst case you can embed the dma_slave_config in xilinx_dma_slave_config and retrieve it in dmac driver -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 31, 2014 at 12:22:52PM +0530, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well There are many configuration parameters which are specific to IP and I would like to give an overview of some of parameteres here: 1) Park Mode ('cfg-park'): In Park mode, engine will park on frame referenced by 'cfg-park_frm', so user will have control on each frame in this mode. 2) Interrupt Coalesce ('cfg-coalesce'): Used for setting interrupt threshold. This value determines the number of frame buffers to process. To use this feature, 'cfg-frm_cnt_en' should be set. 3) Frame Synchronization Source ('cfg-ext_fsync'): Can be an external/internal frame synchronization source. Used to synchronize one channel (MM2S/S2MM) with another (S2MM/MM2S) channel. 4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate between master and slave. In master mode (cfg-master), frames are not dropped and slave can drop frames to adjust to master frame rate. And in future, this Engine being a soft IP, we could expect some more additional parameters. Isn't a good idea to have a private member in dma_slave_config for sharing additional configuration between slave device and dma engine? Or a new dma_ctrl_cmd like FSLDMA_EXTERNAL_START? The idea of a generic API is that we can use it for most of the controllers. Even if you are planning to support a family of controllers ATM, lets not discuss the possiblity of private member and try to exhanust all possible options. Worst case you can embed the dma_slave_config in xilinx_dma_slave_config and retrieve it in dmac driver -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/31/2014 06:44 PM, Andy Gross wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The comments in the include/linux/dmaengine.h state that if you have non-generic, non-fixed configuration then you can just create your own structure and embed the dma_slave_config. Using the container_of you can get back your structure. We should probably revise that, since it is not going to work that well. I agree that we should always use the generic structure if possible, but sometimes there are some non-standard things that you have to do for your hardware. I am currently in a bind for adding some quirky features that are required by peripherals who want to use the QCOM DMA devices. Well there are two types of extensions to the API. The first type changes the semantics of the API so it is no longer possible to use the API without knowing about the extension. This is in my opinion a complete no-go since goes against the very idea of a common API. If you implement the common API with custom semantics you have a custom API. It's just better hidden since you use the same function names. My opinion on this is if you want/need a custom API make it a custom API with custom function names. This on one hand avoids confusion about the behavior and on the other hand reduces the maintenance burden for the common API (e.g. if somebody makes changes to the common API they don't have to bother to update your driver and don't have to try to understand the custom semantics). The other kind of extensions are those that add additional functionality on top of the common API, while keeping the normal semantics for the common API. Which means a user that does not know about the extensions is still able to function. A user that knows about the extension can make use of the additional features. That said, everybody always thinks their hardware is special and requires special extensions. Usually this is not the case, there will always sooner or later somebody else who needs the same extensions. The dmaengine API is not set in stone, so if you think something is missing to properly to support your hardware it is worth investigating if it makes sense to add the missing parts to the common API. As I said before the whole point of the exercise of having a common API is that we want to abstract away (hardware) implementation specific details. This allows the upper layers to have platform independent common code to take care of setting up the DMA transfers. E.g. in ALSA subsystem we went from 10+ custom implementations of a PCM driver build on top of dmaengine to 1 generic implementation that is shared between platforms. All those custom PCM drivers had hardcoded assumptions about the behavior and features of the underlying dmaengine driver. To be able to have one generic PCM driver it was necessarily to extend the dmaengine API to be able to expose these differences in features and behavior. So as I said the API is not set in stone if it is necessary to extend or
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/31/2014 06:44 PM, Andy Gross wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The comments in the include/linux/dmaengine.h state that if you have non-generic, non-fixed configuration then you can just create your own structure and embed the dma_slave_config. Using the container_of you can get back your structure. We should probably revise that, since it is not going to work that well. I agree that we should always use the generic structure if possible, but sometimes there are some non-standard things that you have to do for your hardware. I am currently in a bind for adding some quirky features that are required by peripherals who want to use the QCOM DMA devices. Well there are two types of extensions to the API. The first type changes the semantics of the API so it is no longer possible to use the API without knowing about the extension. This is in my opinion a complete no-go since goes against the very idea of a common API. If you implement the common API with custom semantics you have a custom API. It's just better hidden since you use the same function names. My opinion on this is if you want/need a custom API make it a custom API with custom function names. This on one hand avoids confusion about the behavior and on the other hand reduces the maintenance burden for the common API (e.g. if somebody makes changes to the common API they don't have to bother to update your driver and don't have to try to understand the custom semantics). The other kind of extensions are those that add additional functionality on top of the common API, while keeping the normal semantics for the common API. Which means a user that does not know about the extensions is still able to function. A user that knows about the extension can make use of the additional features. That said, everybody always thinks their hardware is special and requires special extensions. Usually this is not the case, there will always sooner or later somebody else who needs the same extensions. The dmaengine API is not set in stone, so if you think something is missing to properly to support your hardware it is worth investigating if it makes sense to add the missing parts to the common API. As I said before the whole point of the exercise of having a common API is that we want to abstract away (hardware) implementation specific details. This allows the upper layers to have platform independent common code to take care of setting up the DMA transfers. E.g. in ALSA subsystem we went from 10+ custom implementations of a PCM driver build on top of dmaengine to 1 generic implementation that is shared between platforms. All those custom PCM drivers had hardcoded assumptions about the behavior and features of the underlying dmaengine driver. To be able to have one generic PCM driver it was necessarily to extend the dmaengine API to be able to expose these differences in features and behavior. So as I said the API is not set in stone if it is necessary
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: > On 01/24/2014 12:16 PM, Srikanth Thokala wrote: > > Hi Lars, > > > > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: > >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > >> [...] > >>> +/** > >>> + * xilinx_vdma_device_control - Configure DMA channel of the device > >>> + * @dchan: DMA Channel pointer > >>> + * @cmd: DMA control command > >>> + * @arg: Channel configuration > >>> + * > >>> + * Return: '0' on success and failure value on error > >>> + */ > >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, > >>> + enum dma_ctrl_cmd cmd, unsigned long > >>> arg) > >>> +{ > >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > >>> + > >>> + switch (cmd) { > >>> + case DMA_TERMINATE_ALL: > >>> + xilinx_vdma_terminate_all(chan); > >>> + return 0; > >>> + case DMA_SLAVE_CONFIG: > >>> + return xilinx_vdma_slave_config(chan, > >>> + (struct xilinx_vdma_config *)arg); > >> > >> You really shouldn't be overloading the generic API with your own > >> semantics. > >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > > > > Ok. The driver needs few additional configuration from the slave > > device like Vertical > > Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do > > you > > suggest me to define a separate dma_ctrl_cmd like the one > > FSLDMA_EXTERNAL_START > > defined for Freescale drivers? > > In my opinion it is not a good idea to have driver implement a generic API, > but at the same time let the driver have custom semantics for those API > calls. It's a bit like having a gpio driver that expects 23 and 42 as the > values passed to gpio_set_value instead of 0 and 1. It completely defeats > the purpose of a generic API, namely that you are able to write generic code > that makes use of the API without having to know about which implementation > API it is talking to. The dmaengine framework provides the > dmaengine_prep_interleaved_dma() function to setup two dimensional > transfers, e.g. take a look at sirf-dma.c or imx-dma.c. > The comments in the include/linux/dmaengine.h state that if you have non-generic, non-fixed configuration then you can just create your own structure and embed the dma_slave_config. Using the container_of you can get back your structure. I agree that we should always use the generic structure if possible, but sometimes there are some non-standard things that you have to do for your hardware. I am currently in a bind for adding some quirky features that are required by peripherals who want to use the QCOM DMA devices. If the context field in prep_slave_sg and prep_dma_cyclic was exposed to everyone, that would allow an easy way to pass in hardware specific configuration without bastardizing the slave_config. I noticed that rapidio is the only consumer of that field and that they have their own prep function. If we are not going to allow people to do their own slave_config when they need to, then we need to remove the comments from the include file and expose the context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic. -- sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The comments in the include/linux/dmaengine.h state that if you have non-generic, non-fixed configuration then you can just create your own structure and embed the dma_slave_config. Using the container_of you can get back your structure. I agree that we should always use the generic structure if possible, but sometimes there are some non-standard things that you have to do for your hardware. I am currently in a bind for adding some quirky features that are required by peripherals who want to use the QCOM DMA devices. If the context field in prep_slave_sg and prep_dma_cyclic was exposed to everyone, that would allow an easy way to pass in hardware specific configuration without bastardizing the slave_config. I noticed that rapidio is the only consumer of that field and that they have their own prep function. If we are not going to allow people to do their own slave_config when they need to, then we need to remove the comments from the include file and expose the context to the dmaengine_prep_slave_sg and dmaengine_prep_dma_cyclic. -- sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Mon, Jan 27, 2014 at 4:36 PM, Srikanth Thokala wrote: > Hi Vinod, > > On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul wrote: >> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: >>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote: >>> > Hi Lars, >>> > >>> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen >>> > wrote: >>> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: >>> >> [...] >>> >>> +/** >>> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device >>> >>> + * @dchan: DMA Channel pointer >>> >>> + * @cmd: DMA control command >>> >>> + * @arg: Channel configuration >>> >>> + * >>> >>> + * Return: '0' on success and failure value on error >>> >>> + */ >>> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >>> >>> + enum dma_ctrl_cmd cmd, unsigned >>> >>> long arg) >>> >>> +{ >>> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >>> >>> + >>> >>> + switch (cmd) { >>> >>> + case DMA_TERMINATE_ALL: >>> >>> + xilinx_vdma_terminate_all(chan); >>> >>> + return 0; >>> >>> + case DMA_SLAVE_CONFIG: >>> >>> + return xilinx_vdma_slave_config(chan, >>> >>> + (struct xilinx_vdma_config *)arg); >>> >> >>> >> You really shouldn't be overloading the generic API with your own >>> >> semantics. >>> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. >>> > >>> > Ok. The driver needs few additional configuration from the slave >>> > device like Vertical >>> > Size, Horizontal Size, Stride etc., for the DMA transfers, in that case >>> > do you >>> > suggest me to define a separate dma_ctrl_cmd like the one >>> > FSLDMA_EXTERNAL_START >>> > defined for Freescale drivers? >>> >>> In my opinion it is not a good idea to have driver implement a generic API, >>> but at the same time let the driver have custom semantics for those API >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats >>> the purpose of a generic API, namely that you are able to write generic code >>> that makes use of the API without having to know about which implementation >>> API it is talking to. The dmaengine framework provides the >>> dmaengine_prep_interleaved_dma() function to setup two dimensional >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. >> >> The question here i think would be waht this device supports? Is the hardware >> capable of doing interleaved transfers, then would make sense. >> >> While we do try to get users use dma_slave_config, but there will always be >> someone who have specfic params. If we can generalize then we might want to >> add >> to the dma_slave_config as well > > There are many configuration parameters which are specific to IP and I > would like to > give an overview of some of parameteres here: > > 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame > referenced by > 'cfg->park_frm', so user will have control on each frame in this mode. > > 2) Interrupt Coalesce ('cfg->coalesce'): Used for setting interrupt > threshold. This value >determines the number of frame buffers to process. To use this feature, >'cfg->frm_cnt_en' should be set. > > 3) Frame Synchronization Source ('cfg->ext_fsync'): Can be an > external/internal frame > synchronization source. Used to synchronize one channel (MM2S/S2MM) with > another (S2MM/MM2S) channel. > > 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate > between > master and slave. In master mode (cfg->master), frames are not dropped > and > slave can drop frames to adjust to master frame rate. > > And in future, this Engine being a soft IP, we could expect some more > additional > parameters. Isn't a good idea to have a private member in dma_slave_config > for > sharing additional configuration between slave device and dma engine? Or a new > dma_ctrl_cmd like FSLDMA_EXTERNAL_START? Ping? > > Srikanth > >> >> -- >> ~Vinod >> --/EX >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul wrote: > On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: >> Hi Lars/Vinod, >> >> The question here i think would be waht this device supports? Is the >> >> hardware >> >> capable of doing interleaved transfers, then would make sense. >> > >> > The hardware does 2D transfers. The parameters for a transfer are height, >> > width and stride. That's only a subset of what interleaved transfers can be >> > (xt->num_frames must be one for 2d transfers). But if I remember correctly >> > there has been some discussion on this in the past and the result of that >> > discussion was that using interleaved transfers for 2D transfers is >> > preferred over adding a custom API for 2D transfers. >> >> I went through the prep_interleaved_dma API and I see only one descriptor >> is prepared per API call (i.e. per frame). As our IP supports upto 16 frame >> buffers (can be more in future), isn't it less efficient compared to the >> prep_slave_sg where we get a single sg list and can prepare all the >> descriptors >> (of non-contiguous buffers) in one go? Correct me, if am wrong and let me >> know your opinions. > Well the descriptor maybe one, but that can represent multiple frames, for > example 16 as in your case. Can you read up the documentation of how multiple > frames are passed. Pls see include/linux/dmaengine.h > > /** > * Interleaved Transfer Request > * > * A chunk is collection of contiguous bytes to be transfered. > * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). > * ICGs may or maynot change between chunks. > * A FRAME is the smallest series of contiguous {chunk,icg} pairs, > * that when repeated an integral number of times, specifies the transfer. > * A transfer template is specification of a Frame, the number of times > * it is to be repeated and other per-transfer attributes. > * > * Practically, a client driver would have ready a template for each > * type of transfer it is going to need during its lifetime and > * set only 'src_start' and 'dst_start' before submitting the requests. > * > * > * | Frame-1| Frame-2 | ~ | Frame-'numf' | > * |==.===...=...|==.===...=...| ~ |==.===...=...| > * > *== Chunk size > *... ICG > */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Srikanth > > -- > ~Vinod > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Tue, Jan 28, 2014 at 8:43 AM, Vinod Koul vinod.k...@intel.com wrote: On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ Yes, it can handle multiple frames specified by 'numf' each of size 'frame_size * sgl[0].size'. But, I see it only works if all the frames' memory is contiguous and in this case we can just increment 'src_start' by the total frame size 'numf' number of times to fill in for each HW descriptor (each frame is one HW descriptor). So, there is no issue when the memory is contiguous. If the frames are non contiguous, we have to call this API for each frame (hence for each descriptor), as the src_start for each frame is different. Is it correct? FYI: This hardware has an inbuilt Scatter-Gather engine. Srikanth -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Mon, Jan 27, 2014 at 4:36 PM, Srikanth Thokala stho...@xilinx.com wrote: Hi Vinod, On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul vinod.k...@intel.com wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well There are many configuration parameters which are specific to IP and I would like to give an overview of some of parameteres here: 1) Park Mode ('cfg-park'): In Park mode, engine will park on frame referenced by 'cfg-park_frm', so user will have control on each frame in this mode. 2) Interrupt Coalesce ('cfg-coalesce'): Used for setting interrupt threshold. This value determines the number of frame buffers to process. To use this feature, 'cfg-frm_cnt_en' should be set. 3) Frame Synchronization Source ('cfg-ext_fsync'): Can be an external/internal frame synchronization source. Used to synchronize one channel (MM2S/S2MM) with another (S2MM/MM2S) channel. 4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate between master and slave. In master mode (cfg-master), frames are not dropped and slave can drop frames to adjust to master frame rate. And in future, this Engine being a soft IP, we could expect some more additional parameters. Isn't a good idea to have a private member in dma_slave_config for sharing additional configuration between slave device and dma engine? Or a new dma_ctrl_cmd like FSLDMA_EXTERNAL_START? Ping? Srikanth -- ~Vinod --/EX To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: > Hi Lars/Vinod, > >> The question here i think would be waht this device supports? Is the > >> hardware > >> capable of doing interleaved transfers, then would make sense. > > > > The hardware does 2D transfers. The parameters for a transfer are height, > > width and stride. That's only a subset of what interleaved transfers can be > > (xt->num_frames must be one for 2d transfers). But if I remember correctly > > there has been some discussion on this in the past and the result of that > > discussion was that using interleaved transfers for 2D transfers is > > preferred over adding a custom API for 2D transfers. > > I went through the prep_interleaved_dma API and I see only one descriptor > is prepared per API call (i.e. per frame). As our IP supports upto 16 frame > buffers (can be more in future), isn't it less efficient compared to the > prep_slave_sg where we get a single sg list and can prepare all the > descriptors > (of non-contiguous buffers) in one go? Correct me, if am wrong and let me > know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Sun, Jan 26, 2014 at 06:39:21PM +0100, Lars-Peter Clausen wrote: > On 01/26/2014 02:59 PM, Vinod Koul wrote: > > On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: > >> On 01/24/2014 12:16 PM, Srikanth Thokala wrote: > >>> Hi Lars, > >>> > >>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen > >>> wrote: > On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > [...] > > +/** > > + * xilinx_vdma_device_control - Configure DMA channel of the device > > + * @dchan: DMA Channel pointer > > + * @cmd: DMA control command > > + * @arg: Channel configuration > > + * > > + * Return: '0' on success and failure value on error > > + */ > > +static int xilinx_vdma_device_control(struct dma_chan *dchan, > > + enum dma_ctrl_cmd cmd, unsigned > > long arg) > > +{ > > + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > > + > > + switch (cmd) { > > + case DMA_TERMINATE_ALL: > > + xilinx_vdma_terminate_all(chan); > > + return 0; > > + case DMA_SLAVE_CONFIG: > > + return xilinx_vdma_slave_config(chan, > > + (struct xilinx_vdma_config *)arg); > > You really shouldn't be overloading the generic API with your own > semantics. > DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > >>> > >>> Ok. The driver needs few additional configuration from the slave > >>> device like Vertical > >>> Size, Horizontal Size, Stride etc., for the DMA transfers, in that case > >>> do you > >>> suggest me to define a separate dma_ctrl_cmd like the one > >>> FSLDMA_EXTERNAL_START > >>> defined for Freescale drivers? > >> > >> In my opinion it is not a good idea to have driver implement a generic API, > >> but at the same time let the driver have custom semantics for those API > >> calls. It's a bit like having a gpio driver that expects 23 and 42 as the > >> values passed to gpio_set_value instead of 0 and 1. It completely defeats > >> the purpose of a generic API, namely that you are able to write generic > >> code > >> that makes use of the API without having to know about which implementation > >> API it is talking to. The dmaengine framework provides the > >> dmaengine_prep_interleaved_dma() function to setup two dimensional > >> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. > > > > The question here i think would be waht this device supports? Is the > > hardware > > capable of doing interleaved transfers, then would make sense. > > The hardware does 2D transfers. The parameters for a transfer are height, > width and stride. That's only a subset of what interleaved transfers can be > (xt->num_frames must be one for 2d transfers). But if I remember correctly > there has been some discussion on this in the past and the result of that > discussion was that using interleaved transfers for 2D transfers is > preferred over adding a custom API for 2D transfers. Yup that would be my recomendation. Moving this driver to interleaved API seems right to me -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Lars/Vinod, On Sun, Jan 26, 2014 at 11:09 PM, Lars-Peter Clausen wrote: > On 01/26/2014 02:59 PM, Vinod Koul wrote: >> On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: >>> On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: > On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > [...] >> +/** >> + * xilinx_vdma_device_control - Configure DMA channel of the device >> + * @dchan: DMA Channel pointer >> + * @cmd: DMA control command >> + * @arg: Channel configuration >> + * >> + * Return: '0' on success and failure value on error >> + */ >> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >> + enum dma_ctrl_cmd cmd, unsigned long >> arg) >> +{ >> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >> + >> + switch (cmd) { >> + case DMA_TERMINATE_ALL: >> + xilinx_vdma_terminate_all(chan); >> + return 0; >> + case DMA_SLAVE_CONFIG: >> + return xilinx_vdma_slave_config(chan, >> + (struct xilinx_vdma_config *)arg); > > You really shouldn't be overloading the generic API with your own > semantics. > DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? >>> >>> In my opinion it is not a good idea to have driver implement a generic API, >>> but at the same time let the driver have custom semantics for those API >>> calls. It's a bit like having a gpio driver that expects 23 and 42 as the >>> values passed to gpio_set_value instead of 0 and 1. It completely defeats >>> the purpose of a generic API, namely that you are able to write generic code >>> that makes use of the API without having to know about which implementation >>> API it is talking to. The dmaengine framework provides the >>> dmaengine_prep_interleaved_dma() function to setup two dimensional >>> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. >> >> The question here i think would be waht this device supports? Is the hardware >> capable of doing interleaved transfers, then would make sense. > > The hardware does 2D transfers. The parameters for a transfer are height, > width and stride. That's only a subset of what interleaved transfers can be > (xt->num_frames must be one for 2d transfers). But if I remember correctly > there has been some discussion on this in the past and the result of that > discussion was that using interleaved transfers for 2D transfers is > preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Srikanth > > - Lars > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul wrote: > On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: >> On 01/24/2014 12:16 PM, Srikanth Thokala wrote: >> > Hi Lars, >> > >> > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen >> > wrote: >> >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: >> >> [...] >> >>> +/** >> >>> + * xilinx_vdma_device_control - Configure DMA channel of the device >> >>> + * @dchan: DMA Channel pointer >> >>> + * @cmd: DMA control command >> >>> + * @arg: Channel configuration >> >>> + * >> >>> + * Return: '0' on success and failure value on error >> >>> + */ >> >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >> >>> + enum dma_ctrl_cmd cmd, unsigned long >> >>> arg) >> >>> +{ >> >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >> >>> + >> >>> + switch (cmd) { >> >>> + case DMA_TERMINATE_ALL: >> >>> + xilinx_vdma_terminate_all(chan); >> >>> + return 0; >> >>> + case DMA_SLAVE_CONFIG: >> >>> + return xilinx_vdma_slave_config(chan, >> >>> + (struct xilinx_vdma_config *)arg); >> >> >> >> You really shouldn't be overloading the generic API with your own >> >> semantics. >> >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. >> > >> > Ok. The driver needs few additional configuration from the slave >> > device like Vertical >> > Size, Horizontal Size, Stride etc., for the DMA transfers, in that case >> > do you >> > suggest me to define a separate dma_ctrl_cmd like the one >> > FSLDMA_EXTERNAL_START >> > defined for Freescale drivers? >> >> In my opinion it is not a good idea to have driver implement a generic API, >> but at the same time let the driver have custom semantics for those API >> calls. It's a bit like having a gpio driver that expects 23 and 42 as the >> values passed to gpio_set_value instead of 0 and 1. It completely defeats >> the purpose of a generic API, namely that you are able to write generic code >> that makes use of the API without having to know about which implementation >> API it is talking to. The dmaengine framework provides the >> dmaengine_prep_interleaved_dma() function to setup two dimensional >> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. > > The question here i think would be waht this device supports? Is the hardware > capable of doing interleaved transfers, then would make sense. > > While we do try to get users use dma_slave_config, but there will always be > someone who have specfic params. If we can generalize then we might want to > add > to the dma_slave_config as well There are many configuration parameters which are specific to IP and I would like to give an overview of some of parameteres here: 1) Park Mode ('cfg->park'): In Park mode, engine will park on frame referenced by 'cfg->park_frm', so user will have control on each frame in this mode. 2) Interrupt Coalesce ('cfg->coalesce'): Used for setting interrupt threshold. This value determines the number of frame buffers to process. To use this feature, 'cfg->frm_cnt_en' should be set. 3) Frame Synchronization Source ('cfg->ext_fsync'): Can be an external/internal frame synchronization source. Used to synchronize one channel (MM2S/S2MM) with another (S2MM/MM2S) channel. 4) Genlock Synchronization ('cfg->genlock'): Used to avoid mismatch rate between master and slave. In master mode (cfg->master), frames are not dropped and slave can drop frames to adjust to master frame rate. And in future, this Engine being a soft IP, we could expect some more additional parameters. Isn't a good idea to have a private member in dma_slave_config for sharing additional configuration between slave device and dma engine? Or a new dma_ctrl_cmd like FSLDMA_EXTERNAL_START? Srikanth > > -- > ~Vinod > --/EX > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Vinod, On Sun, Jan 26, 2014 at 7:29 PM, Vinod Koul vinod.k...@intel.com wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well There are many configuration parameters which are specific to IP and I would like to give an overview of some of parameteres here: 1) Park Mode ('cfg-park'): In Park mode, engine will park on frame referenced by 'cfg-park_frm', so user will have control on each frame in this mode. 2) Interrupt Coalesce ('cfg-coalesce'): Used for setting interrupt threshold. This value determines the number of frame buffers to process. To use this feature, 'cfg-frm_cnt_en' should be set. 3) Frame Synchronization Source ('cfg-ext_fsync'): Can be an external/internal frame synchronization source. Used to synchronize one channel (MM2S/S2MM) with another (S2MM/MM2S) channel. 4) Genlock Synchronization ('cfg-genlock'): Used to avoid mismatch rate between master and slave. In master mode (cfg-master), frames are not dropped and slave can drop frames to adjust to master frame rate. And in future, this Engine being a soft IP, we could expect some more additional parameters. Isn't a good idea to have a private member in dma_slave_config for sharing additional configuration between slave device and dma engine? Or a new dma_ctrl_cmd like FSLDMA_EXTERNAL_START? Srikanth -- ~Vinod --/EX To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Lars/Vinod, On Sun, Jan 26, 2014 at 11:09 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/26/2014 02:59 PM, Vinod Koul wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Srikanth - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Sun, Jan 26, 2014 at 06:39:21PM +0100, Lars-Peter Clausen wrote: On 01/26/2014 02:59 PM, Vinod Koul wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. Yup that would be my recomendation. Moving this driver to interleaved API seems right to me -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Mon, Jan 27, 2014 at 06:42:36PM +0530, Srikanth Thokala wrote: Hi Lars/Vinod, The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. I went through the prep_interleaved_dma API and I see only one descriptor is prepared per API call (i.e. per frame). As our IP supports upto 16 frame buffers (can be more in future), isn't it less efficient compared to the prep_slave_sg where we get a single sg list and can prepare all the descriptors (of non-contiguous buffers) in one go? Correct me, if am wrong and let me know your opinions. Well the descriptor maybe one, but that can represent multiple frames, for example 16 as in your case. Can you read up the documentation of how multiple frames are passed. Pls see include/linux/dmaengine.h /** * Interleaved Transfer Request * * A chunk is collection of contiguous bytes to be transfered. * The gap(in bytes) between two chunks is called inter-chunk-gap(ICG). * ICGs may or maynot change between chunks. * A FRAME is the smallest series of contiguous {chunk,icg} pairs, * that when repeated an integral number of times, specifies the transfer. * A transfer template is specification of a Frame, the number of times * it is to be repeated and other per-transfer attributes. * * Practically, a client driver would have ready a template for each * type of transfer it is going to need during its lifetime and * set only 'src_start' and 'dst_start' before submitting the requests. * * * | Frame-1| Frame-2 | ~ | Frame-'numf' | * |==.===...=...|==.===...=...| ~ |==.===...=...| * *== Chunk size *... ICG */ -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 03:24 PM, Vinod Koul wrote: > On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote: >> This is the driver for the AXI Video Direct Memory Access (AXI >> VDMA) core, which is a soft Xilinx IP core that provides high- >> bandwidth direct memory access between memory and AXI4-Stream >> type video target peripherals. The core provides efficient two >> dimensional DMA operations with independent asynchronous read > ok here is tha catch, do you want to support interleaved API rather? > >> +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device >> phandle] [Channel ID]> pairs, + where Channel ID is '0' for write/tx and >> '1' for read/rx +channel. +- dma-names: a list of DMA channel names, one >> per "dmas" entry + +Example: + + +vdmatest_0: vdmatest@0 { + >> compatible ="xlnx,axi-vdma-test-1.00.a"; + dmas = <_vdma_0 0 + >> _vdma_0 1>; +dma-names = "vdma0", "vdma1"; +} ; > Need ack from DT folks. ALso would be better to split the binding to a > separate > patch > > >> +/** >> + * struct xilinx_vdma_chan - Driver specific VDMA channel structure >> + * @xdev: Driver specific device structure >> + * @ctrl_offset: Control registers offset >> + * @desc_offset: TX descriptor registers offset >> + * @completed_cookie: Maximum cookie completed >> + * @cookie: The current cookie >> + * @lock: Descriptor operation lock >> + * @pending_list: Descriptors waiting >> + * @active_desc: Active descriptor >> + * @done_list: Complete descriptors >> + * @common: DMA common channel >> + * @desc_pool: Descriptors pool >> + * @dev: The dma device >> + * @irq: Channel IRQ >> + * @id: Channel ID >> + * @direction: Transfer direction >> + * @num_frms: Number of frames >> + * @has_sg: Support scatter transfers >> + * @genlock: Support genlock mode >> + * @err: Channel has errors >> + * @tasklet: Cleanup work after irq >> + * @config: Device configuration info >> + * @flush_on_fsync: Flush on Frame sync >> + */ >> +struct xilinx_vdma_chan { >> +struct xilinx_vdma_device *xdev; >> +u32 ctrl_offset; >> +u32 desc_offset; >> +dma_cookie_t completed_cookie; >> +dma_cookie_t cookie; >> +spinlock_t lock; >> +struct list_head pending_list; >> +struct xilinx_vdma_tx_descriptor *active_desc; >> +struct list_head done_list; >> +struct dma_chan common; >> +struct dma_pool *desc_pool; >> +struct device *dev; >> +int irq; >> +int id; >> +enum dma_transfer_direction direction; > why should channel have a direction... descriptor should have direction and > not > the channel! The channel only supports transfers in one direction. Either from memory to peripheral or from peripheral to memory, that's fixed and can't be changed at runtime. The driver needs to know which direction the channel supports so it can reject transfers with the wrong direction. [...] >> + > >> + * xilinx_vdma_tx_status - Get VDMA transaction status >> + * @dchan: DMA channel >> + * @cookie: Transaction identifier >> + * @txstate: Transaction state >> + * >> + * Return: DMA transaction status >> + */ >> +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan, >> +dma_cookie_t cookie, >> +struct dma_tx_state *txstate) >> +{ >> +struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >> +dma_cookie_t last_used; >> +dma_cookie_t last_complete; >> + >> +xilinx_vdma_chan_desc_cleanup(chan); >> + >> +last_used = dchan->cookie; >> +last_complete = chan->completed_cookie; >> + >> +dma_set_tx_state(txstate, last_complete, last_used, 0); >> + >> +return dma_async_is_complete(cookie, last_complete, last_used); > no residue calculation? > The hardware doesn't support that. >> +/** >> + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE >> transaction >> + * @dchan: DMA channel >> + * @sgl: scatterlist to transfer to/from >> + * @sg_len: number of entries in @sgl >> + * @dir: DMA direction >> + * @flags: transfer ack flags >> + * @context: unused >> + * >> + * Return: Async transaction descriptor on success and NULL on failure >> + */ >> +static struct dma_async_tx_descriptor * >> +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl, >> + unsigned int sg_len, enum dma_transfer_direction dir, >> + unsigned long flags, void *context) > okay now am worried, this is supposed to memcpy DMA so why slave-sg?? The DMA is either from memory to peripheral or from peripheral to memory depending on the direction. So slave sg should be fine. > > Looking at the driver overall, IMHO we need to do: > - use the virt-dma to simplfy the cookie handling and perhpasn the descrptors > too! > - Perhpas use interleaved API..? > - I dont think we should use the slave API as this seems memcpy case! > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 03:03 PM, Vinod Koul wrote: > On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote: >> On 01/23/2014 03:00 PM, Andy Shevchenko wrote: >>> On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: > On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > > [...] > >>> + /* Request the interrupt */ >>> + chan->irq = irq_of_parse_and_map(node, 0); >>> + err = devm_request_irq(xdev->dev, chan->irq, >>> xilinx_vdma_irq_handler, >>> + IRQF_SHARED, "xilinx-vdma-controller", >>> chan); >> >> This is a clasic example of where to not use devm_request_irq. 'chan' is >> accessed in the interrupt handler, but if you use devm_request_irq 'chan' >> will be freed before the interrupt handler has been released, which means >> there is now a race condition where the interrupt handler can access >> already >> freed memory.ta > > Could you elaborate this case? As far as I understood managed resources > are a kind of stack pile. In this case you have no such condition. Where > am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. >>> >>> So, you mean devm_request_irq() will race in any DMA driver? >> >> Most likely yes. devm_request_irq() is race condition prone for the majority >> of device driver. You have to be really careful if you want to use it. >> >>> >>> I think the proper solution is to disable all device work in >>> the .remove() and devm will care about resources. >> >> As long as the interrupt handler is registered it can be called, the only >> proper solution is to make sure that the order in which resources are torn >> down is correct. > Wouldn't it work if we register the irq last in the probe. That wil ensure on > success the channel is always valid. Yes, but only if the irq is not device managed. All device managed resources will be freed after the remove function has been called. Which is to late in our case since we make sure that the tasklet is not running in the remove function. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 02:59 PM, Vinod Koul wrote: > On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: >> On 01/24/2014 12:16 PM, Srikanth Thokala wrote: >>> Hi Lars, >>> >>> On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] > +/** > + * xilinx_vdma_device_control - Configure DMA channel of the device > + * @dchan: DMA Channel pointer > + * @cmd: DMA control command > + * @arg: Channel configuration > + * > + * Return: '0' on success and failure value on error > + */ > +static int xilinx_vdma_device_control(struct dma_chan *dchan, > + enum dma_ctrl_cmd cmd, unsigned long > arg) > +{ > + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > + > + switch (cmd) { > + case DMA_TERMINATE_ALL: > + xilinx_vdma_terminate_all(chan); > + return 0; > + case DMA_SLAVE_CONFIG: > + return xilinx_vdma_slave_config(chan, > + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. >>> >>> Ok. The driver needs few additional configuration from the slave >>> device like Vertical >>> Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do >>> you >>> suggest me to define a separate dma_ctrl_cmd like the one >>> FSLDMA_EXTERNAL_START >>> defined for Freescale drivers? >> >> In my opinion it is not a good idea to have driver implement a generic API, >> but at the same time let the driver have custom semantics for those API >> calls. It's a bit like having a gpio driver that expects 23 and 42 as the >> values passed to gpio_set_value instead of 0 and 1. It completely defeats >> the purpose of a generic API, namely that you are able to write generic code >> that makes use of the API without having to know about which implementation >> API it is talking to. The dmaengine framework provides the >> dmaengine_prep_interleaved_dma() function to setup two dimensional >> transfers, e.g. take a look at sirf-dma.c or imx-dma.c. > > The question here i think would be waht this device supports? Is the hardware > capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt->num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote: > This is the driver for the AXI Video Direct Memory Access (AXI > VDMA) core, which is a soft Xilinx IP core that provides high- > bandwidth direct memory access between memory and AXI4-Stream > type video target peripherals. The core provides efficient two > dimensional DMA operations with independent asynchronous read ok here is tha catch, do you want to support interleaved API rather? > +* DMA client + +Required properties: +- dmas: a list of <[Video DMA device > phandle] [Channel ID]> pairs, + where Channel ID is '0' for write/tx and > '1' for read/rx + channel. +- dma-names: a list of DMA channel names, one > per "dmas" entry + +Example: + + +vdmatest_0: vdmatest@0 { + > compatible ="xlnx,axi-vdma-test-1.00.a"; +dmas = <_vdma_0 0 + > _vdma_0 1>; + dma-names = "vdma0", "vdma1"; +} ; Need ack from DT folks. ALso would be better to split the binding to a separate patch > +/** > + * struct xilinx_vdma_chan - Driver specific VDMA channel structure > + * @xdev: Driver specific device structure > + * @ctrl_offset: Control registers offset > + * @desc_offset: TX descriptor registers offset > + * @completed_cookie: Maximum cookie completed > + * @cookie: The current cookie > + * @lock: Descriptor operation lock > + * @pending_list: Descriptors waiting > + * @active_desc: Active descriptor > + * @done_list: Complete descriptors > + * @common: DMA common channel > + * @desc_pool: Descriptors pool > + * @dev: The dma device > + * @irq: Channel IRQ > + * @id: Channel ID > + * @direction: Transfer direction > + * @num_frms: Number of frames > + * @has_sg: Support scatter transfers > + * @genlock: Support genlock mode > + * @err: Channel has errors > + * @tasklet: Cleanup work after irq > + * @config: Device configuration info > + * @flush_on_fsync: Flush on Frame sync > + */ > +struct xilinx_vdma_chan { > + struct xilinx_vdma_device *xdev; > + u32 ctrl_offset; > + u32 desc_offset; > + dma_cookie_t completed_cookie; > + dma_cookie_t cookie; > + spinlock_t lock; > + struct list_head pending_list; > + struct xilinx_vdma_tx_descriptor *active_desc; > + struct list_head done_list; > + struct dma_chan common; > + struct dma_pool *desc_pool; > + struct device *dev; > + int irq; > + int id; > + enum dma_transfer_direction direction; why should channel have a direction... descriptor should have direction and not the channel! > +/** > + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor > + * @chan: Driver specific VDMA channel > + * @desc: VDMA transaction descriptor > + */ > +static void > +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan, > +struct xilinx_vdma_tx_descriptor *desc) > +{ > + struct xilinx_vdma_tx_segment *segment, *next; > + > + if (!desc) > + return; > + > + list_for_each_entry_safe(segment, next, >segments, node) { do you want to use _safe. Isee that this is called for cleanup while lock held, and in other case within another _safe iterator! > + list_del(>node); > + xilinx_vdma_free_tx_segment(chan, segment); > + } > + > + kfree(desc); > +} > + > +/* Required functions */ > + > + * xilinx_vdma_do_tasklet - Schedule completion tasklet > + * @data: Pointer to the Xilinx VDMA channel structure > + */ > +static void xilinx_vdma_do_tasklet(unsigned long data) > +{ > + struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data; > + > + xilinx_vdma_chan_desc_cleanup(chan); > +} > + > +/** > + * xilinx_vdma_alloc_chan_resources - Allocate channel resources > + * @dchan: DMA channel > + * > + * Return: '1' on success and failure value on error naaah, we dont do that, pls use standard notation of 0 on success Also API wants you to return descriptors allocated here! > + */ > +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan) > +{ > + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > + > + /* Has this channel already been allocated? */ > + if (chan->desc_pool) > + return 1; > + > + /* > + * We need the descriptor to be aligned to 64bytes > + * for meeting Xilinx VDMA specification requirement. > + */ > + chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool", > + chan->dev, > + sizeof(struct xilinx_vdma_tx_segment), > + __alignof__(struct xilinx_vdma_tx_segment), 0); > + if (!chan->desc_pool) { > + dev_err(chan->dev, > + "unable to allocate channel %d descriptor pool\n", > + chan->id); > + return -ENOMEM; > + } > + > + tasklet_init(>tasklet, xilinx_vdma_do_tasklet, > + (unsigned long)chan); > + > + chan->completed_cookie = DMA_MIN_COOKIE; > + chan->cookie = DMA_MIN_COOKIE; Can you
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote: > On 01/23/2014 03:00 PM, Andy Shevchenko wrote: > > On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: > >> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: > >>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: > On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > >>> > >>> [...] > >>> > > + /* Request the interrupt */ > > + chan->irq = irq_of_parse_and_map(node, 0); > > + err = devm_request_irq(xdev->dev, chan->irq, > > xilinx_vdma_irq_handler, > > + IRQF_SHARED, "xilinx-vdma-controller", > > chan); > > This is a clasic example of where to not use devm_request_irq. 'chan' is > accessed in the interrupt handler, but if you use devm_request_irq 'chan' > will be freed before the interrupt handler has been released, which means > there is now a race condition where the interrupt handler can access > already > freed memory.ta > >>> > >>> Could you elaborate this case? As far as I understood managed resources > >>> are a kind of stack pile. In this case you have no such condition. Where > >>> am I wrong? > >> > >> The stacked stuff is only ran after the remove() function. Which means that > >> you call dma_async_device_unregister() before the interrupt handler is > >> freed. Another issue with the interrupt handler is a bit hidden. The driver > >> does not call tasklet_kill in the remove function. Which it should though > >> to > >> make sure that the tasklet does not race against the freeing of the memory. > >> And in order to make sure that the tasklet is not rescheduled you need to > >> free the irq before killing the tasklet, since the interrupt handler > >> schedules the tasklet. > > > > So, you mean devm_request_irq() will race in any DMA driver? > > Most likely yes. devm_request_irq() is race condition prone for the majority > of device driver. You have to be really careful if you want to use it. > > > > > I think the proper solution is to disable all device work in > > the .remove() and devm will care about resources. > > As long as the interrupt handler is registered it can be called, the only > proper solution is to make sure that the order in which resources are torn > down is correct. Wouldn't it work if we register the irq last in the probe. That wil ensure on success the channel is always valid. Also the tasklet is required to be killed not just in your .remove but also in drivers .suspend handler, you dont want handler to be invoked after you returned from your suspend -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: > On 01/24/2014 12:16 PM, Srikanth Thokala wrote: > > Hi Lars, > > > > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: > >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > >> [...] > >>> +/** > >>> + * xilinx_vdma_device_control - Configure DMA channel of the device > >>> + * @dchan: DMA Channel pointer > >>> + * @cmd: DMA control command > >>> + * @arg: Channel configuration > >>> + * > >>> + * Return: '0' on success and failure value on error > >>> + */ > >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, > >>> + enum dma_ctrl_cmd cmd, unsigned long > >>> arg) > >>> +{ > >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > >>> + > >>> + switch (cmd) { > >>> + case DMA_TERMINATE_ALL: > >>> + xilinx_vdma_terminate_all(chan); > >>> + return 0; > >>> + case DMA_SLAVE_CONFIG: > >>> + return xilinx_vdma_slave_config(chan, > >>> + (struct xilinx_vdma_config *)arg); > >> > >> You really shouldn't be overloading the generic API with your own > >> semantics. > >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > > > > Ok. The driver needs few additional configuration from the slave > > device like Vertical > > Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do > > you > > suggest me to define a separate dma_ctrl_cmd like the one > > FSLDMA_EXTERNAL_START > > defined for Freescale drivers? > > In my opinion it is not a good idea to have driver implement a generic API, > but at the same time let the driver have custom semantics for those API > calls. It's a bit like having a gpio driver that expects 23 and 42 as the > values passed to gpio_set_value instead of 0 and 1. It completely defeats > the purpose of a generic API, namely that you are able to write generic code > that makes use of the API without having to know about which implementation > API it is talking to. The dmaengine framework provides the > dmaengine_prep_interleaved_dma() function to setup two dimensional > transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well -- ~Vinod -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. While we do try to get users use dma_slave_config, but there will always be someone who have specfic params. If we can generalize then we might want to add to the dma_slave_config as well -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote: On 01/23/2014 03:00 PM, Andy Shevchenko wrote: On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, + IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. So, you mean devm_request_irq() will race in any DMA driver? Most likely yes. devm_request_irq() is race condition prone for the majority of device driver. You have to be really careful if you want to use it. I think the proper solution is to disable all device work in the .remove() and devm will care about resources. As long as the interrupt handler is registered it can be called, the only proper solution is to make sure that the order in which resources are torn down is correct. Wouldn't it work if we register the irq last in the probe. That wil ensure on success the channel is always valid. Also the tasklet is required to be killed not just in your .remove but also in drivers .suspend handler, you dont want handler to be invoked after you returned from your suspend -- ~Vinod -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote: This is the driver for the AXI Video Direct Memory Access (AXI VDMA) core, which is a soft Xilinx IP core that provides high- bandwidth direct memory access between memory and AXI4-Stream type video target peripherals. The core provides efficient two dimensional DMA operations with independent asynchronous read ok here is tha catch, do you want to support interleaved API rather? +* DMA client + +Required properties: +- dmas: a list of [Video DMA device phandle] [Channel ID] pairs, + where Channel ID is '0' for write/tx and '1' for read/rx + channel. +- dma-names: a list of DMA channel names, one per dmas entry + +Example: + + +vdmatest_0: vdmatest@0 { + compatible =xlnx,axi-vdma-test-1.00.a; +dmas = axi_vdma_0 0 + axi_vdma_0 1; + dma-names = vdma0, vdma1; +} ; Need ack from DT folks. ALso would be better to split the binding to a separate patch +/** + * struct xilinx_vdma_chan - Driver specific VDMA channel structure + * @xdev: Driver specific device structure + * @ctrl_offset: Control registers offset + * @desc_offset: TX descriptor registers offset + * @completed_cookie: Maximum cookie completed + * @cookie: The current cookie + * @lock: Descriptor operation lock + * @pending_list: Descriptors waiting + * @active_desc: Active descriptor + * @done_list: Complete descriptors + * @common: DMA common channel + * @desc_pool: Descriptors pool + * @dev: The dma device + * @irq: Channel IRQ + * @id: Channel ID + * @direction: Transfer direction + * @num_frms: Number of frames + * @has_sg: Support scatter transfers + * @genlock: Support genlock mode + * @err: Channel has errors + * @tasklet: Cleanup work after irq + * @config: Device configuration info + * @flush_on_fsync: Flush on Frame sync + */ +struct xilinx_vdma_chan { + struct xilinx_vdma_device *xdev; + u32 ctrl_offset; + u32 desc_offset; + dma_cookie_t completed_cookie; + dma_cookie_t cookie; + spinlock_t lock; + struct list_head pending_list; + struct xilinx_vdma_tx_descriptor *active_desc; + struct list_head done_list; + struct dma_chan common; + struct dma_pool *desc_pool; + struct device *dev; + int irq; + int id; + enum dma_transfer_direction direction; why should channel have a direction... descriptor should have direction and not the channel! +/** + * xilinx_vdma_free_tx_descriptor - Free transaction descriptor + * @chan: Driver specific VDMA channel + * @desc: VDMA transaction descriptor + */ +static void +xilinx_vdma_free_tx_descriptor(struct xilinx_vdma_chan *chan, +struct xilinx_vdma_tx_descriptor *desc) +{ + struct xilinx_vdma_tx_segment *segment, *next; + + if (!desc) + return; + + list_for_each_entry_safe(segment, next, desc-segments, node) { do you want to use _safe. Isee that this is called for cleanup while lock held, and in other case within another _safe iterator! + list_del(segment-node); + xilinx_vdma_free_tx_segment(chan, segment); + } + + kfree(desc); +} + +/* Required functions */ + + * xilinx_vdma_do_tasklet - Schedule completion tasklet + * @data: Pointer to the Xilinx VDMA channel structure + */ +static void xilinx_vdma_do_tasklet(unsigned long data) +{ + struct xilinx_vdma_chan *chan = (struct xilinx_vdma_chan *)data; + + xilinx_vdma_chan_desc_cleanup(chan); +} + +/** + * xilinx_vdma_alloc_chan_resources - Allocate channel resources + * @dchan: DMA channel + * + * Return: '1' on success and failure value on error naaah, we dont do that, pls use standard notation of 0 on success Also API wants you to return descriptors allocated here! + */ +static int xilinx_vdma_alloc_chan_resources(struct dma_chan *dchan) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + /* Has this channel already been allocated? */ + if (chan-desc_pool) + return 1; + + /* + * We need the descriptor to be aligned to 64bytes + * for meeting Xilinx VDMA specification requirement. + */ + chan-desc_pool = dma_pool_create(xilinx_vdma_desc_pool, + chan-dev, + sizeof(struct xilinx_vdma_tx_segment), + __alignof__(struct xilinx_vdma_tx_segment), 0); + if (!chan-desc_pool) { + dev_err(chan-dev, + unable to allocate channel %d descriptor pool\n, + chan-id); + return -ENOMEM; + } + + tasklet_init(chan-tasklet, xilinx_vdma_do_tasklet, + (unsigned long)chan); + + chan-completed_cookie = DMA_MIN_COOKIE; + chan-cookie = DMA_MIN_COOKIE; Can you use virtual dma implementation to simplfy your implemenattion of lists, cookies (driver/dma/virt-dma.c) + /* There is
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 02:59 PM, Vinod Koul wrote: On Fri, Jan 24, 2014 at 02:24:27PM +0100, Lars-Peter Clausen wrote: On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. The question here i think would be waht this device supports? Is the hardware capable of doing interleaved transfers, then would make sense. The hardware does 2D transfers. The parameters for a transfer are height, width and stride. That's only a subset of what interleaved transfers can be (xt-num_frames must be one for 2d transfers). But if I remember correctly there has been some discussion on this in the past and the result of that discussion was that using interleaved transfers for 2D transfers is preferred over adding a custom API for 2D transfers. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 03:03 PM, Vinod Koul wrote: On Thu, Jan 23, 2014 at 03:07:32PM +0100, Lars-Peter Clausen wrote: On 01/23/2014 03:00 PM, Andy Shevchenko wrote: On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, + IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. So, you mean devm_request_irq() will race in any DMA driver? Most likely yes. devm_request_irq() is race condition prone for the majority of device driver. You have to be really careful if you want to use it. I think the proper solution is to disable all device work in the .remove() and devm will care about resources. As long as the interrupt handler is registered it can be called, the only proper solution is to make sure that the order in which resources are torn down is correct. Wouldn't it work if we register the irq last in the probe. That wil ensure on success the channel is always valid. Yes, but only if the irq is not device managed. All device managed resources will be freed after the remove function has been called. Which is to late in our case since we make sure that the tasklet is not running in the remove function. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/26/2014 03:24 PM, Vinod Koul wrote: On Wed, Jan 22, 2014 at 10:22:45PM +0530, Srikanth Thokala wrote: This is the driver for the AXI Video Direct Memory Access (AXI VDMA) core, which is a soft Xilinx IP core that provides high- bandwidth direct memory access between memory and AXI4-Stream type video target peripherals. The core provides efficient two dimensional DMA operations with independent asynchronous read ok here is tha catch, do you want to support interleaved API rather? +* DMA client + +Required properties: +- dmas: a list of [Video DMA device phandle] [Channel ID] pairs, + where Channel ID is '0' for write/tx and '1' for read/rx +channel. +- dma-names: a list of DMA channel names, one per dmas entry + +Example: + + +vdmatest_0: vdmatest@0 { + compatible =xlnx,axi-vdma-test-1.00.a; + dmas = axi_vdma_0 0 + axi_vdma_0 1; +dma-names = vdma0, vdma1; +} ; Need ack from DT folks. ALso would be better to split the binding to a separate patch +/** + * struct xilinx_vdma_chan - Driver specific VDMA channel structure + * @xdev: Driver specific device structure + * @ctrl_offset: Control registers offset + * @desc_offset: TX descriptor registers offset + * @completed_cookie: Maximum cookie completed + * @cookie: The current cookie + * @lock: Descriptor operation lock + * @pending_list: Descriptors waiting + * @active_desc: Active descriptor + * @done_list: Complete descriptors + * @common: DMA common channel + * @desc_pool: Descriptors pool + * @dev: The dma device + * @irq: Channel IRQ + * @id: Channel ID + * @direction: Transfer direction + * @num_frms: Number of frames + * @has_sg: Support scatter transfers + * @genlock: Support genlock mode + * @err: Channel has errors + * @tasklet: Cleanup work after irq + * @config: Device configuration info + * @flush_on_fsync: Flush on Frame sync + */ +struct xilinx_vdma_chan { +struct xilinx_vdma_device *xdev; +u32 ctrl_offset; +u32 desc_offset; +dma_cookie_t completed_cookie; +dma_cookie_t cookie; +spinlock_t lock; +struct list_head pending_list; +struct xilinx_vdma_tx_descriptor *active_desc; +struct list_head done_list; +struct dma_chan common; +struct dma_pool *desc_pool; +struct device *dev; +int irq; +int id; +enum dma_transfer_direction direction; why should channel have a direction... descriptor should have direction and not the channel! The channel only supports transfers in one direction. Either from memory to peripheral or from peripheral to memory, that's fixed and can't be changed at runtime. The driver needs to know which direction the channel supports so it can reject transfers with the wrong direction. [...] + + * xilinx_vdma_tx_status - Get VDMA transaction status + * @dchan: DMA channel + * @cookie: Transaction identifier + * @txstate: Transaction state + * + * Return: DMA transaction status + */ +static enum dma_status xilinx_vdma_tx_status(struct dma_chan *dchan, +dma_cookie_t cookie, +struct dma_tx_state *txstate) +{ +struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); +dma_cookie_t last_used; +dma_cookie_t last_complete; + +xilinx_vdma_chan_desc_cleanup(chan); + +last_used = dchan-cookie; +last_complete = chan-completed_cookie; + +dma_set_tx_state(txstate, last_complete, last_used, 0); + +return dma_async_is_complete(cookie, last_complete, last_used); no residue calculation? The hardware doesn't support that. +/** + * xilinx_vdma_prep_slave_sg - prepare a descriptor for a DMA_SLAVE transaction + * @dchan: DMA channel + * @sgl: scatterlist to transfer to/from + * @sg_len: number of entries in @sgl + * @dir: DMA direction + * @flags: transfer ack flags + * @context: unused + * + * Return: Async transaction descriptor on success and NULL on failure + */ +static struct dma_async_tx_descriptor * +xilinx_vdma_prep_slave_sg(struct dma_chan *dchan, struct scatterlist *sgl, + unsigned int sg_len, enum dma_transfer_direction dir, + unsigned long flags, void *context) okay now am worried, this is supposed to memcpy DMA so why slave-sg?? The DMA is either from memory to peripheral or from peripheral to memory depending on the direction. So slave sg should be fine. Looking at the driver overall, IMHO we need to do: - use the virt-dma to simplfy the cookie handling and perhpasn the descrptors too! - Perhpas use interleaved API..? - I dont think we should use the slave API as this seems memcpy case! -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/24/2014 12:16 PM, Srikanth Thokala wrote: > Hi Lars, > > On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: >> [...] >>> +/** >>> + * xilinx_vdma_device_control - Configure DMA channel of the device >>> + * @dchan: DMA Channel pointer >>> + * @cmd: DMA control command >>> + * @arg: Channel configuration >>> + * >>> + * Return: '0' on success and failure value on error >>> + */ >>> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >>> + enum dma_ctrl_cmd cmd, unsigned long >>> arg) >>> +{ >>> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >>> + >>> + switch (cmd) { >>> + case DMA_TERMINATE_ALL: >>> + xilinx_vdma_terminate_all(chan); >>> + return 0; >>> + case DMA_SLAVE_CONFIG: >>> + return xilinx_vdma_slave_config(chan, >>> + (struct xilinx_vdma_config *)arg); >> >> You really shouldn't be overloading the generic API with your own semantics. >> DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > > Ok. The driver needs few additional configuration from the slave > device like Vertical > Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do > you > suggest me to define a separate dma_ctrl_cmd like the one > FSLDMA_EXTERNAL_START > defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen wrote: > On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > [...] >> +/** >> + * xilinx_vdma_device_control - Configure DMA channel of the device >> + * @dchan: DMA Channel pointer >> + * @cmd: DMA control command >> + * @arg: Channel configuration >> + * >> + * Return: '0' on success and failure value on error >> + */ >> +static int xilinx_vdma_device_control(struct dma_chan *dchan, >> + enum dma_ctrl_cmd cmd, unsigned long arg) >> +{ >> + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); >> + >> + switch (cmd) { >> + case DMA_TERMINATE_ALL: >> + xilinx_vdma_terminate_all(chan); >> + return 0; >> + case DMA_SLAVE_CONFIG: >> + return xilinx_vdma_slave_config(chan, >> + (struct xilinx_vdma_config *)arg); > > You really shouldn't be overloading the generic API with your own semantics. > DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? > >> + default: >> + return -ENXIO; >> + } >> +} >> + > [...] >> + >> + /* Request the interrupt */ >> + chan->irq = irq_of_parse_and_map(node, 0); >> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, >> +IRQF_SHARED, "xilinx-vdma-controller", chan); > > This is a clasic example of where to not use devm_request_irq. 'chan' is > accessed in the interrupt handler, but if you use devm_request_irq 'chan' > will be freed before the interrupt handler has been released, which means > there is now a race condition where the interrupt handler can access already > freed memory. Ok, thank you for the clarification on this thread. I will fix it in v3. > >> + if (err) { >> + dev_err(xdev->dev, "unable to request IRQ\n"); >> + return err; >> + } >> + >> + /* Initialize the DMA channel and add it to the DMA engine channels >> + * list. >> + */ >> + chan->common.device = >common; >> + >> + list_add_tail(>common.device_node, >common.channels); >> + xdev->chan[chan->id] = chan; >> + >> + /* Reset the channel */ >> + err = xilinx_vdma_chan_reset(chan); >> + if (err < 0) { >> + dev_err(xdev->dev, "Reset channel failed\n"); >> + return err; >> + } >> + >> + return 0; >> +} >> + >> +/** >> + * struct of_dma_filter_xilinx_args - Channel filter args >> + * @dev: DMA device structure >> + * @chan_id: Channel id >> + */ >> +struct of_dma_filter_xilinx_args { >> + struct dma_device *dev; >> + u32 chan_id; >> +}; >> + >> +/** >> + * xilinx_vdma_dt_filter - VDMA channel filter function >> + * @chan: DMA channel pointer >> + * @param: Filter match value >> + * >> + * Return: true/false based on the result >> + */ >> +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param) >> +{ >> + struct of_dma_filter_xilinx_args *args = param; >> + >> + return chan->device == args->dev && chan->chan_id == args->chan_id; >> +} >> + >> +/** >> + * of_dma_xilinx_xlate - Translation function >> + * @dma_spec: Pointer to DMA specifier as found in the device tree >> + * @ofdma: Pointer to DMA controller data >> + * >> + * Return: DMA channel pointer on success and NULL on error >> + */ >> +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args >> *dma_spec, >> + struct of_dma *ofdma) >> +{ >> + struct of_dma_filter_xilinx_args args; >> + dma_cap_mask_t cap; >> + >> + args.dev = ofdma->of_dma_data; >> + if (!args.dev) >> + return NULL; >> + >> + if (dma_spec->args_count != 1) >> + return NULL; >> + >> + dma_cap_zero(cap); >> + dma_cap_set(DMA_SLAVE, cap); >> + >> + args.chan_id = dma_spec->args[0]; >> + >> + return dma_request_channel(cap, xilinx_vdma_dt_filter, ); > > There is a new helper function called dma_get_slave_channel, which makes > this much easier. Take a look at the k3dma.c driver for an example. Ok. I will check and fix it in v3. Srikanth >> +} > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? + default: + return -ENXIO; + } +} + [...] + + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, +IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory. Ok, thank you for the clarification on this thread. I will fix it in v3. + if (err) { + dev_err(xdev-dev, unable to request IRQ\n); + return err; + } + + /* Initialize the DMA channel and add it to the DMA engine channels + * list. + */ + chan-common.device = xdev-common; + + list_add_tail(chan-common.device_node, xdev-common.channels); + xdev-chan[chan-id] = chan; + + /* Reset the channel */ + err = xilinx_vdma_chan_reset(chan); + if (err 0) { + dev_err(xdev-dev, Reset channel failed\n); + return err; + } + + return 0; +} + +/** + * struct of_dma_filter_xilinx_args - Channel filter args + * @dev: DMA device structure + * @chan_id: Channel id + */ +struct of_dma_filter_xilinx_args { + struct dma_device *dev; + u32 chan_id; +}; + +/** + * xilinx_vdma_dt_filter - VDMA channel filter function + * @chan: DMA channel pointer + * @param: Filter match value + * + * Return: true/false based on the result + */ +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param) +{ + struct of_dma_filter_xilinx_args *args = param; + + return chan-device == args-dev chan-chan_id == args-chan_id; +} + +/** + * of_dma_xilinx_xlate - Translation function + * @dma_spec: Pointer to DMA specifier as found in the device tree + * @ofdma: Pointer to DMA controller data + * + * Return: DMA channel pointer on success and NULL on error + */ +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec, + struct of_dma *ofdma) +{ + struct of_dma_filter_xilinx_args args; + dma_cap_mask_t cap; + + args.dev = ofdma-of_dma_data; + if (!args.dev) + return NULL; + + if (dma_spec-args_count != 1) + return NULL; + + dma_cap_zero(cap); + dma_cap_set(DMA_SLAVE, cap); + + args.chan_id = dma_spec-args[0]; + + return dma_request_channel(cap, xilinx_vdma_dt_filter, args); There is a new helper function called dma_get_slave_channel, which makes this much easier. Take a look at the k3dma.c driver for an example. Ok. I will check and fix it in v3. Srikanth +} -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/24/2014 12:16 PM, Srikanth Thokala wrote: Hi Lars, On Thu, Jan 23, 2014 at 4:55 PM, Lars-Peter Clausen l...@metafoo.de wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. Ok. The driver needs few additional configuration from the slave device like Vertical Size, Horizontal Size, Stride etc., for the DMA transfers, in that case do you suggest me to define a separate dma_ctrl_cmd like the one FSLDMA_EXTERNAL_START defined for Freescale drivers? In my opinion it is not a good idea to have driver implement a generic API, but at the same time let the driver have custom semantics for those API calls. It's a bit like having a gpio driver that expects 23 and 42 as the values passed to gpio_set_value instead of 0 and 1. It completely defeats the purpose of a generic API, namely that you are able to write generic code that makes use of the API without having to know about which implementation API it is talking to. The dmaengine framework provides the dmaengine_prep_interleaved_dma() function to setup two dimensional transfers, e.g. take a look at sirf-dma.c or imx-dma.c. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Levente, On Thu, Jan 23, 2014 at 3:00 AM, Levente Kurusa wrote: > Hello, > > 2014/1/22 Srikanth Thokala : >> This is the driver for the AXI Video Direct Memory Access (AXI >> VDMA) core, which is a soft Xilinx IP core that provides high- >> bandwidth direct memory access between memory and AXI4-Stream >> type video target peripherals. The core provides efficient two >> dimensional DMA operations with independent asynchronous read >> and write channel operation. >> >> This module works on Zynq (ARM Based SoC) and Microblaze platforms. >> >> Signed-off-by: Srikanth Thokala >> --- > > Another two remarks, after you fixed them ( or not :-) ) > you can have my: > > Reviewed-by: Levente Kurusa > > Oh, and next time please if you post a patch that fixes something I pointed > out, > CC me as I had a hard time finding this patch, thanks. :-) Sure. Thanks > >> NOTE: >> 1. Created a separate directory 'dma/xilinx' as Xilinx has two more >>DMA IPs and we are also planning to upstream these drivers soon. >> 2. Rebased on v3.13.0-rc8 >> >> Changes in v2: >> - Removed DMA Test client module from the patchset as suggested >> by Andy Shevchenko >> - Removed device-id DT property, as suggested by Arnd Bergmann >> - Properly documented DT bindings as suggested by Arnd Bergmann >> - Returning with error, if registration of DMA to node fails >> - Fixed typo errors >> - Used BIT() macro at applicable places >> - Added missing header file to the patchset >> - Changed copyright year to include 2014 >> --- >> .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + >> drivers/dma/Kconfig| 14 + >> drivers/dma/Makefile |1 + >> drivers/dma/xilinx/Makefile|1 + >> drivers/dma/xilinx/xilinx_vdma.c | 1486 >> >> include/linux/amba/xilinx_dma.h| 50 + >> 6 files changed, 1627 insertions(+) >> create mode 100644 >> Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt >> create mode 100644 drivers/dma/xilinx/Makefile >> create mode 100644 drivers/dma/xilinx/xilinx_vdma.c >> create mode 100644 include/linux/amba/xilinx_dma.h >> >> diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt >> b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt >> new file mode 100644 >> index 000..ab8be1a >> --- /dev/null >> +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt >> @@ -0,0 +1,75 @@ >> +Xilinx AXI VDMA engine, it does transfers between memory and video devices. >> +It can be configured to have one channel or two channels. If configured >> +as two channels, one is to transmit to the video device and another is >> +to receive from the video device. >> + >> +Required properties: >> +- compatible: Should be "xlnx,axi-vdma-1.00.a" >> +- #dma-cells: Should be <1>, see "dmas" property below >> +- reg: Should contain VDMA registers location and length. >> +- xlnx,num-fstores: Should be the number of framebuffers as configured in >> h/w. >> +- dma-channel child node: Should have atleast one channel and can have upto >> + two channels per device. This node specifies the properties of each >> + DMA channel (see child node properties below). >> + >> +Optional properties: >> +- xlnx,include-sg: Tells whether configured for Scatter-mode in >> + the hardware. >> [...] >> + >> +/** >> + * xilinx_vdma_is_running - Check if VDMA channel is running >> + * @chan: Driver specific VDMA channel >> + * >> + * Return: '1' if running, '0' if not. >> + */ >> +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan) >> +{ >> + return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) & >> +XILINX_VDMA_DMASR_HALTED) && >> + (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) & >> +XILINX_VDMA_DMACR_RUNSTOP); >> +} >> + [...] >> + /* Retrieve the channel properties from the device tree */ >> + has_dre = of_property_read_bool(node, "xlnx,include-dre"); >> + >> + chan->genlock = of_property_read_bool(node, "xlnx,genlock-mode"); >> + >> + err = of_property_read_u32(node, "xlnx,datawidth", ); >> + if (!err) { >> + u32 width = value >> 3; /* Convert bits to bytes */ >> + >> + /* If data width is greater than 8 bytes, DRE is not in hw */ >> + if (width > 8) >> + has_dre = false; >> + >> + if (!has_dre) >> + xdev->common.copy_align = fls(width - 1); >> + } else { >> + dev_err(xdev->dev, "missing xlnx,datawidth property\n"); >> + return err; >> + } > > Can you please convert this to: > if (err) { > dev_err(...); > return err; > } > > That way we can avoid the else clause. Ok. I will fix it in v3. >> + >> + if (of_device_is_compatible(node, "xlnx,axi-vdma-mm2s-channel")) { >> +
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/23/2014 03:00 PM, Andy Shevchenko wrote: > On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: >> On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: >>> On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: >>> >>> [...] >>> > + /* Request the interrupt */ > + chan->irq = irq_of_parse_and_map(node, 0); > + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, > +IRQF_SHARED, "xilinx-vdma-controller", chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta >>> >>> Could you elaborate this case? As far as I understood managed resources >>> are a kind of stack pile. In this case you have no such condition. Where >>> am I wrong? >> >> The stacked stuff is only ran after the remove() function. Which means that >> you call dma_async_device_unregister() before the interrupt handler is >> freed. Another issue with the interrupt handler is a bit hidden. The driver >> does not call tasklet_kill in the remove function. Which it should though to >> make sure that the tasklet does not race against the freeing of the memory. >> And in order to make sure that the tasklet is not rescheduled you need to >> free the irq before killing the tasklet, since the interrupt handler >> schedules the tasklet. > > So, you mean devm_request_irq() will race in any DMA driver? Most likely yes. devm_request_irq() is race condition prone for the majority of device driver. You have to be really careful if you want to use it. > > I think the proper solution is to disable all device work in > the .remove() and devm will care about resources. As long as the interrupt handler is registered it can be called, the only proper solution is to make sure that the order in which resources are torn down is correct. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: > On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: > > On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: > >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > > > > [...] > > > >>> + /* Request the interrupt */ > >>> + chan->irq = irq_of_parse_and_map(node, 0); > >>> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, > >>> +IRQF_SHARED, "xilinx-vdma-controller", chan); > >> > >> This is a clasic example of where to not use devm_request_irq. 'chan' is > >> accessed in the interrupt handler, but if you use devm_request_irq 'chan' > >> will be freed before the interrupt handler has been released, which means > >> there is now a race condition where the interrupt handler can access > >> already > >> freed memory.ta > > > > Could you elaborate this case? As far as I understood managed resources > > are a kind of stack pile. In this case you have no such condition. Where > > am I wrong? > > The stacked stuff is only ran after the remove() function. Which means that > you call dma_async_device_unregister() before the interrupt handler is > freed. Another issue with the interrupt handler is a bit hidden. The driver > does not call tasklet_kill in the remove function. Which it should though to > make sure that the tasklet does not race against the freeing of the memory. > And in order to make sure that the tasklet is not rescheduled you need to > free the irq before killing the tasklet, since the interrupt handler > schedules the tasklet. So, you mean devm_request_irq() will race in any DMA driver? I think the proper solution is to disable all device work in the .remove() and devm will care about resources. > majordomo-info.html -- Andy Shevchenko Intel Finland Oy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: > On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: >> On 01/22/2014 05:52 PM, Srikanth Thokala wrote: > > [...] > >>> + /* Request the interrupt */ >>> + chan->irq = irq_of_parse_and_map(node, 0); >>> + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, >>> + IRQF_SHARED, "xilinx-vdma-controller", chan); >> >> This is a clasic example of where to not use devm_request_irq. 'chan' is >> accessed in the interrupt handler, but if you use devm_request_irq 'chan' >> will be freed before the interrupt handler has been released, which means >> there is now a race condition where the interrupt handler can access already >> freed memory.ta > > Could you elaborate this case? As far as I understood managed resources > are a kind of stack pile. In this case you have no such condition. Where > am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. - Lars -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: > On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] > > + /* Request the interrupt */ > > + chan->irq = irq_of_parse_and_map(node, 0); > > + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, > > + IRQF_SHARED, "xilinx-vdma-controller", chan); > > This is a clasic example of where to not use devm_request_irq. 'chan' is > accessed in the interrupt handler, but if you use devm_request_irq 'chan' > will be freed before the interrupt handler has been released, which means > there is now a race condition where the interrupt handler can access already > freed memory. Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? -- Andy Shevchenko Intel Finland Oy - Intel Finland Oy Registered Address: PL 281, 00181 Helsinki Business Identity Code: 0357606 - 4 Domiciled in Helsinki This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote: > This is the driver for the AXI Video Direct Memory Access (AXI > VDMA) core, which is a soft Xilinx IP core that provides high- > bandwidth direct memory access between memory and AXI4-Stream > type video target peripherals. The core provides efficient two > dimensional DMA operations with independent asynchronous read > and write channel operation. > > This module works on Zynq (ARM Based SoC) and Microblaze platforms. Few comments below. > > Signed-off-by: Srikanth Thokala > --- > NOTE: > 1. Created a separate directory 'dma/xilinx' as Xilinx has two more >DMA IPs and we are also planning to upstream these drivers soon. > 2. Rebased on v3.13.0-rc8 > > Changes in v2: > - Removed DMA Test client module from the patchset as suggested > by Andy Shevchenko > - Removed device-id DT property, as suggested by Arnd Bergmann > - Properly documented DT bindings as suggested by Arnd Bergmann > - Returning with error, if registration of DMA to node fails > - Fixed typo errors > - Used BIT() macro at applicable places > - Added missing header file to the patchset > - Changed copyright year to include 2014 > --- > .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + > drivers/dma/Kconfig| 14 + > drivers/dma/Makefile |1 + > drivers/dma/xilinx/Makefile|1 + > drivers/dma/xilinx/xilinx_vdma.c | 1486 > > include/linux/amba/xilinx_dma.h| 50 + > 6 files changed, 1627 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > create mode 100644 drivers/dma/xilinx/Makefile > create mode 100644 drivers/dma/xilinx/xilinx_vdma.c > create mode 100644 include/linux/amba/xilinx_dma.h > > diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > new file mode 100644 > index 000..ab8be1a > --- /dev/null > +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > @@ -0,0 +1,75 @@ > +Xilinx AXI VDMA engine, it does transfers between memory and video devices. > +It can be configured to have one channel or two channels. If configured > +as two channels, one is to transmit to the video device and another is > +to receive from the video device. > + > +Required properties: > +- compatible: Should be "xlnx,axi-vdma-1.00.a" > +- #dma-cells: Should be <1>, see "dmas" property below > +- reg: Should contain VDMA registers location and length. > +- xlnx,num-fstores: Should be the number of framebuffers as configured in > h/w. > +- dma-channel child node: Should have atleast one channel and can have upto > + two channels per device. This node specifies the properties of each > + DMA channel (see child node properties below). > + > +Optional properties: > +- xlnx,include-sg: Tells whether configured for Scatter-mode in > + the hardware. > +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync. > + It takes following values: > + {1}, flush both channels > + {2}, flush mm2s channel > + {3}, flush s2mm channel > + > +Required child node properties: > +- compatible: It should be either "xlnx,axi-vdma-mm2s-channel" or > + "xlnx,axi-vdma-s2mm-channel". > +- interrupts: Should contain per channel VDMA interrupts. > +- xlnx,data-width: Should contain the stream data width, take values > + {32,64...1024}. > + > +Option child node properties: > +- xlnx,include-dre: Tells whether hardware is configured for Data > + Realignment Engine. > +- xlnx,genlock-mode: Tells whether Genlock synchronization is > + enabled/disabled in hardware. > + > +Example: > + > + > +axi_vdma_0: axivdma@4003 { > + compatible = "xlnx,axi-vdma-1.00.a"; > + #dma_cells = <1>; > + reg = < 0x4003 0x1 >; > + xlnx,num-fstores = <0x8>; > + xlnx,flush-fsync = <0x1>; > + dma-channel@4003 { > + compatible = "xlnx,axi-vdma-mm2s-channel"; > + interrupts = < 0 54 4 >; > + xlnx,datawidth = <0x40>; > + } ; > + dma-channel@40030030 { > + compatible = "xlnx,axi-vdma-s2mm-channel"; > + interrupts = < 0 53 4 >; > + xlnx,datawidth = <0x40>; > + } ; > +} ; > + > + > +* DMA client > + > +Required properties: > +- dmas: a list of <[Video DMA device phandle] [Channel ID]> pairs, > + where Channel ID is '0' for write/tx and '1' for read/rx > + channel. > +- dma-names: a list of DMA channel names, one per "dmas" entry > + > +Example: > + > + > +vdmatest_0: vdmatest@0 { > + compatible ="xlnx,axi-vdma-test-1.00.a"; > + dmas = <_vdma_0 0 > + _vdma_0 1>; > + dma-names = "vdma0", "vdma1"; > +} ; > diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig > index c823daa..2a74651 100644 > ---
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] > +/** > + * xilinx_vdma_device_control - Configure DMA channel of the device > + * @dchan: DMA Channel pointer > + * @cmd: DMA control command > + * @arg: Channel configuration > + * > + * Return: '0' on success and failure value on error > + */ > +static int xilinx_vdma_device_control(struct dma_chan *dchan, > + enum dma_ctrl_cmd cmd, unsigned long arg) > +{ > + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); > + > + switch (cmd) { > + case DMA_TERMINATE_ALL: > + xilinx_vdma_terminate_all(chan); > + return 0; > + case DMA_SLAVE_CONFIG: > + return xilinx_vdma_slave_config(chan, > + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. > + default: > + return -ENXIO; > + } > +} > + [...] > + > + /* Request the interrupt */ > + chan->irq = irq_of_parse_and_map(node, 0); > + err = devm_request_irq(xdev->dev, chan->irq, xilinx_vdma_irq_handler, > +IRQF_SHARED, "xilinx-vdma-controller", chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory. > + if (err) { > + dev_err(xdev->dev, "unable to request IRQ\n"); > + return err; > + } > + > + /* Initialize the DMA channel and add it to the DMA engine channels > + * list. > + */ > + chan->common.device = >common; > + > + list_add_tail(>common.device_node, >common.channels); > + xdev->chan[chan->id] = chan; > + > + /* Reset the channel */ > + err = xilinx_vdma_chan_reset(chan); > + if (err < 0) { > + dev_err(xdev->dev, "Reset channel failed\n"); > + return err; > + } > + > + return 0; > +} > + > +/** > + * struct of_dma_filter_xilinx_args - Channel filter args > + * @dev: DMA device structure > + * @chan_id: Channel id > + */ > +struct of_dma_filter_xilinx_args { > + struct dma_device *dev; > + u32 chan_id; > +}; > + > +/** > + * xilinx_vdma_dt_filter - VDMA channel filter function > + * @chan: DMA channel pointer > + * @param: Filter match value > + * > + * Return: true/false based on the result > + */ > +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param) > +{ > + struct of_dma_filter_xilinx_args *args = param; > + > + return chan->device == args->dev && chan->chan_id == args->chan_id; > +} > + > +/** > + * of_dma_xilinx_xlate - Translation function > + * @dma_spec: Pointer to DMA specifier as found in the device tree > + * @ofdma: Pointer to DMA controller data > + * > + * Return: DMA channel pointer on success and NULL on error > + */ > +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec, > + struct of_dma *ofdma) > +{ > + struct of_dma_filter_xilinx_args args; > + dma_cap_mask_t cap; > + > + args.dev = ofdma->of_dma_data; > + if (!args.dev) > + return NULL; > + > + if (dma_spec->args_count != 1) > + return NULL; > + > + dma_cap_zero(cap); > + dma_cap_set(DMA_SLAVE, cap); > + > + args.chan_id = dma_spec->args[0]; > + > + return dma_request_channel(cap, xilinx_vdma_dt_filter, ); There is a new helper function called dma_get_slave_channel, which makes this much easier. Take a look at the k3dma.c driver for an example. > +} -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] +/** + * xilinx_vdma_device_control - Configure DMA channel of the device + * @dchan: DMA Channel pointer + * @cmd: DMA control command + * @arg: Channel configuration + * + * Return: '0' on success and failure value on error + */ +static int xilinx_vdma_device_control(struct dma_chan *dchan, + enum dma_ctrl_cmd cmd, unsigned long arg) +{ + struct xilinx_vdma_chan *chan = to_xilinx_chan(dchan); + + switch (cmd) { + case DMA_TERMINATE_ALL: + xilinx_vdma_terminate_all(chan); + return 0; + case DMA_SLAVE_CONFIG: + return xilinx_vdma_slave_config(chan, + (struct xilinx_vdma_config *)arg); You really shouldn't be overloading the generic API with your own semantics. DMA_SLAVE_CONFIG should take a dma_slave_config and nothing else. + default: + return -ENXIO; + } +} + [...] + + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, +IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory. + if (err) { + dev_err(xdev-dev, unable to request IRQ\n); + return err; + } + + /* Initialize the DMA channel and add it to the DMA engine channels + * list. + */ + chan-common.device = xdev-common; + + list_add_tail(chan-common.device_node, xdev-common.channels); + xdev-chan[chan-id] = chan; + + /* Reset the channel */ + err = xilinx_vdma_chan_reset(chan); + if (err 0) { + dev_err(xdev-dev, Reset channel failed\n); + return err; + } + + return 0; +} + +/** + * struct of_dma_filter_xilinx_args - Channel filter args + * @dev: DMA device structure + * @chan_id: Channel id + */ +struct of_dma_filter_xilinx_args { + struct dma_device *dev; + u32 chan_id; +}; + +/** + * xilinx_vdma_dt_filter - VDMA channel filter function + * @chan: DMA channel pointer + * @param: Filter match value + * + * Return: true/false based on the result + */ +static bool xilinx_vdma_dt_filter(struct dma_chan *chan, void *param) +{ + struct of_dma_filter_xilinx_args *args = param; + + return chan-device == args-dev chan-chan_id == args-chan_id; +} + +/** + * of_dma_xilinx_xlate - Translation function + * @dma_spec: Pointer to DMA specifier as found in the device tree + * @ofdma: Pointer to DMA controller data + * + * Return: DMA channel pointer on success and NULL on error + */ +static struct dma_chan *of_dma_xilinx_xlate(struct of_phandle_args *dma_spec, + struct of_dma *ofdma) +{ + struct of_dma_filter_xilinx_args args; + dma_cap_mask_t cap; + + args.dev = ofdma-of_dma_data; + if (!args.dev) + return NULL; + + if (dma_spec-args_count != 1) + return NULL; + + dma_cap_zero(cap); + dma_cap_set(DMA_SLAVE, cap); + + args.chan_id = dma_spec-args[0]; + + return dma_request_channel(cap, xilinx_vdma_dt_filter, args); There is a new helper function called dma_get_slave_channel, which makes this much easier. Take a look at the k3dma.c driver for an example. +} -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Wed, 2014-01-22 at 22:22 +0530, Srikanth Thokala wrote: This is the driver for the AXI Video Direct Memory Access (AXI VDMA) core, which is a soft Xilinx IP core that provides high- bandwidth direct memory access between memory and AXI4-Stream type video target peripherals. The core provides efficient two dimensional DMA operations with independent asynchronous read and write channel operation. This module works on Zynq (ARM Based SoC) and Microblaze platforms. Few comments below. Signed-off-by: Srikanth Thokala stho...@xilinx.com --- NOTE: 1. Created a separate directory 'dma/xilinx' as Xilinx has two more DMA IPs and we are also planning to upstream these drivers soon. 2. Rebased on v3.13.0-rc8 Changes in v2: - Removed DMA Test client module from the patchset as suggested by Andy Shevchenko - Removed device-id DT property, as suggested by Arnd Bergmann - Properly documented DT bindings as suggested by Arnd Bergmann - Returning with error, if registration of DMA to node fails - Fixed typo errors - Used BIT() macro at applicable places - Added missing header file to the patchset - Changed copyright year to include 2014 --- .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + drivers/dma/Kconfig| 14 + drivers/dma/Makefile |1 + drivers/dma/xilinx/Makefile|1 + drivers/dma/xilinx/xilinx_vdma.c | 1486 include/linux/amba/xilinx_dma.h| 50 + 6 files changed, 1627 insertions(+) create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt create mode 100644 drivers/dma/xilinx/Makefile create mode 100644 drivers/dma/xilinx/xilinx_vdma.c create mode 100644 include/linux/amba/xilinx_dma.h diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt new file mode 100644 index 000..ab8be1a --- /dev/null +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt @@ -0,0 +1,75 @@ +Xilinx AXI VDMA engine, it does transfers between memory and video devices. +It can be configured to have one channel or two channels. If configured +as two channels, one is to transmit to the video device and another is +to receive from the video device. + +Required properties: +- compatible: Should be xlnx,axi-vdma-1.00.a +- #dma-cells: Should be 1, see dmas property below +- reg: Should contain VDMA registers location and length. +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w. +- dma-channel child node: Should have atleast one channel and can have upto + two channels per device. This node specifies the properties of each + DMA channel (see child node properties below). + +Optional properties: +- xlnx,include-sg: Tells whether configured for Scatter-mode in + the hardware. +- xlnx,flush-fsync: Tells whether which channel to Flush on Frame sync. + It takes following values: + {1}, flush both channels + {2}, flush mm2s channel + {3}, flush s2mm channel + +Required child node properties: +- compatible: It should be either xlnx,axi-vdma-mm2s-channel or + xlnx,axi-vdma-s2mm-channel. +- interrupts: Should contain per channel VDMA interrupts. +- xlnx,data-width: Should contain the stream data width, take values + {32,64...1024}. + +Option child node properties: +- xlnx,include-dre: Tells whether hardware is configured for Data + Realignment Engine. +- xlnx,genlock-mode: Tells whether Genlock synchronization is + enabled/disabled in hardware. + +Example: + + +axi_vdma_0: axivdma@4003 { + compatible = xlnx,axi-vdma-1.00.a; + #dma_cells = 1; + reg = 0x4003 0x1 ; + xlnx,num-fstores = 0x8; + xlnx,flush-fsync = 0x1; + dma-channel@4003 { + compatible = xlnx,axi-vdma-mm2s-channel; + interrupts = 0 54 4 ; + xlnx,datawidth = 0x40; + } ; + dma-channel@40030030 { + compatible = xlnx,axi-vdma-s2mm-channel; + interrupts = 0 53 4 ; + xlnx,datawidth = 0x40; + } ; +} ; + + +* DMA client + +Required properties: +- dmas: a list of [Video DMA device phandle] [Channel ID] pairs, + where Channel ID is '0' for write/tx and '1' for read/rx + channel. +- dma-names: a list of DMA channel names, one per dmas entry + +Example: + + +vdmatest_0: vdmatest@0 { + compatible =xlnx,axi-vdma-test-1.00.a; + dmas = axi_vdma_0 0 + axi_vdma_0 1; + dma-names = vdma0, vdma1; +} ; diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig index c823daa..2a74651 100644 --- a/drivers/dma/Kconfig +++ b/drivers/dma/Kconfig @@ -334,6 +334,20 @@ config K3_DMA Support the DMA engine for Hisilicon K3 platform devices.
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, + IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory. Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? -- Andy Shevchenko andriy.shevche...@intel.com Intel Finland Oy - Intel Finland Oy Registered Address: PL 281, 00181 Helsinki Business Identity Code: 0357606 - 4 Domiciled in Helsinki This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, + IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, +IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. So, you mean devm_request_irq() will race in any DMA driver? I think the proper solution is to disable all device work in the .remove() and devm will care about resources. majordomo-info.html -- Andy Shevchenko andriy.shevche...@linux.intel.com Intel Finland Oy -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
On 01/23/2014 03:00 PM, Andy Shevchenko wrote: On Thu, 2014-01-23 at 14:50 +0100, Lars-Peter Clausen wrote: On 01/23/2014 02:38 PM, Shevchenko, Andriy wrote: On Thu, 2014-01-23 at 12:25 +0100, Lars-Peter Clausen wrote: On 01/22/2014 05:52 PM, Srikanth Thokala wrote: [...] + /* Request the interrupt */ + chan-irq = irq_of_parse_and_map(node, 0); + err = devm_request_irq(xdev-dev, chan-irq, xilinx_vdma_irq_handler, +IRQF_SHARED, xilinx-vdma-controller, chan); This is a clasic example of where to not use devm_request_irq. 'chan' is accessed in the interrupt handler, but if you use devm_request_irq 'chan' will be freed before the interrupt handler has been released, which means there is now a race condition where the interrupt handler can access already freed memory.ta Could you elaborate this case? As far as I understood managed resources are a kind of stack pile. In this case you have no such condition. Where am I wrong? The stacked stuff is only ran after the remove() function. Which means that you call dma_async_device_unregister() before the interrupt handler is freed. Another issue with the interrupt handler is a bit hidden. The driver does not call tasklet_kill in the remove function. Which it should though to make sure that the tasklet does not race against the freeing of the memory. And in order to make sure that the tasklet is not rescheduled you need to free the irq before killing the tasklet, since the interrupt handler schedules the tasklet. So, you mean devm_request_irq() will race in any DMA driver? Most likely yes. devm_request_irq() is race condition prone for the majority of device driver. You have to be really careful if you want to use it. I think the proper solution is to disable all device work in the .remove() and devm will care about resources. As long as the interrupt handler is registered it can be called, the only proper solution is to make sure that the order in which resources are torn down is correct. - Lars -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hi Levente, On Thu, Jan 23, 2014 at 3:00 AM, Levente Kurusa le...@linux.com wrote: Hello, 2014/1/22 Srikanth Thokala stho...@xilinx.com: This is the driver for the AXI Video Direct Memory Access (AXI VDMA) core, which is a soft Xilinx IP core that provides high- bandwidth direct memory access between memory and AXI4-Stream type video target peripherals. The core provides efficient two dimensional DMA operations with independent asynchronous read and write channel operation. This module works on Zynq (ARM Based SoC) and Microblaze platforms. Signed-off-by: Srikanth Thokala stho...@xilinx.com --- Another two remarks, after you fixed them ( or not :-) ) you can have my: Reviewed-by: Levente Kurusa le...@linux.com Oh, and next time please if you post a patch that fixes something I pointed out, CC me as I had a hard time finding this patch, thanks. :-) Sure. Thanks NOTE: 1. Created a separate directory 'dma/xilinx' as Xilinx has two more DMA IPs and we are also planning to upstream these drivers soon. 2. Rebased on v3.13.0-rc8 Changes in v2: - Removed DMA Test client module from the patchset as suggested by Andy Shevchenko - Removed device-id DT property, as suggested by Arnd Bergmann - Properly documented DT bindings as suggested by Arnd Bergmann - Returning with error, if registration of DMA to node fails - Fixed typo errors - Used BIT() macro at applicable places - Added missing header file to the patchset - Changed copyright year to include 2014 --- .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + drivers/dma/Kconfig| 14 + drivers/dma/Makefile |1 + drivers/dma/xilinx/Makefile|1 + drivers/dma/xilinx/xilinx_vdma.c | 1486 include/linux/amba/xilinx_dma.h| 50 + 6 files changed, 1627 insertions(+) create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt create mode 100644 drivers/dma/xilinx/Makefile create mode 100644 drivers/dma/xilinx/xilinx_vdma.c create mode 100644 include/linux/amba/xilinx_dma.h diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt new file mode 100644 index 000..ab8be1a --- /dev/null +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt @@ -0,0 +1,75 @@ +Xilinx AXI VDMA engine, it does transfers between memory and video devices. +It can be configured to have one channel or two channels. If configured +as two channels, one is to transmit to the video device and another is +to receive from the video device. + +Required properties: +- compatible: Should be xlnx,axi-vdma-1.00.a +- #dma-cells: Should be 1, see dmas property below +- reg: Should contain VDMA registers location and length. +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w. +- dma-channel child node: Should have atleast one channel and can have upto + two channels per device. This node specifies the properties of each + DMA channel (see child node properties below). + +Optional properties: +- xlnx,include-sg: Tells whether configured for Scatter-mode in + the hardware. [...] + +/** + * xilinx_vdma_is_running - Check if VDMA channel is running + * @chan: Driver specific VDMA channel + * + * Return: '1' if running, '0' if not. + */ +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan) +{ + return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) +XILINX_VDMA_DMASR_HALTED) + (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) +XILINX_VDMA_DMACR_RUNSTOP); +} + [...] + /* Retrieve the channel properties from the device tree */ + has_dre = of_property_read_bool(node, xlnx,include-dre); + + chan-genlock = of_property_read_bool(node, xlnx,genlock-mode); + + err = of_property_read_u32(node, xlnx,datawidth, value); + if (!err) { + u32 width = value 3; /* Convert bits to bytes */ + + /* If data width is greater than 8 bytes, DRE is not in hw */ + if (width 8) + has_dre = false; + + if (!has_dre) + xdev-common.copy_align = fls(width - 1); + } else { + dev_err(xdev-dev, missing xlnx,datawidth property\n); + return err; + } Can you please convert this to: if (err) { dev_err(...); return err; } That way we can avoid the else clause. Ok. I will fix it in v3. + + if (of_device_is_compatible(node, xlnx,axi-vdma-mm2s-channel)) { + chan-direction = DMA_MEM_TO_DEV; + chan-id = 0; + + chan-ctrl_offset = XILINX_VDMA_MM2S_CTRL_OFFSET; + chan-desc_offset =
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hello, 2014/1/22 Srikanth Thokala : > This is the driver for the AXI Video Direct Memory Access (AXI > VDMA) core, which is a soft Xilinx IP core that provides high- > bandwidth direct memory access between memory and AXI4-Stream > type video target peripherals. The core provides efficient two > dimensional DMA operations with independent asynchronous read > and write channel operation. > > This module works on Zynq (ARM Based SoC) and Microblaze platforms. > > Signed-off-by: Srikanth Thokala > --- Another two remarks, after you fixed them ( or not :-) ) you can have my: Reviewed-by: Levente Kurusa Oh, and next time please if you post a patch that fixes something I pointed out, CC me as I had a hard time finding this patch, thanks. :-) > NOTE: > 1. Created a separate directory 'dma/xilinx' as Xilinx has two more >DMA IPs and we are also planning to upstream these drivers soon. > 2. Rebased on v3.13.0-rc8 > > Changes in v2: > - Removed DMA Test client module from the patchset as suggested > by Andy Shevchenko > - Removed device-id DT property, as suggested by Arnd Bergmann > - Properly documented DT bindings as suggested by Arnd Bergmann > - Returning with error, if registration of DMA to node fails > - Fixed typo errors > - Used BIT() macro at applicable places > - Added missing header file to the patchset > - Changed copyright year to include 2014 > --- > .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + > drivers/dma/Kconfig| 14 + > drivers/dma/Makefile |1 + > drivers/dma/xilinx/Makefile|1 + > drivers/dma/xilinx/xilinx_vdma.c | 1486 > > include/linux/amba/xilinx_dma.h| 50 + > 6 files changed, 1627 insertions(+) > create mode 100644 > Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > create mode 100644 drivers/dma/xilinx/Makefile > create mode 100644 drivers/dma/xilinx/xilinx_vdma.c > create mode 100644 include/linux/amba/xilinx_dma.h > > diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > new file mode 100644 > index 000..ab8be1a > --- /dev/null > +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt > @@ -0,0 +1,75 @@ > +Xilinx AXI VDMA engine, it does transfers between memory and video devices. > +It can be configured to have one channel or two channels. If configured > +as two channels, one is to transmit to the video device and another is > +to receive from the video device. > + > +Required properties: > +- compatible: Should be "xlnx,axi-vdma-1.00.a" > +- #dma-cells: Should be <1>, see "dmas" property below > +- reg: Should contain VDMA registers location and length. > +- xlnx,num-fstores: Should be the number of framebuffers as configured in > h/w. > +- dma-channel child node: Should have atleast one channel and can have upto > + two channels per device. This node specifies the properties of each > + DMA channel (see child node properties below). > + > +Optional properties: > +- xlnx,include-sg: Tells whether configured for Scatter-mode in > + the hardware. > [...] > + > +/** > + * xilinx_vdma_is_running - Check if VDMA channel is running > + * @chan: Driver specific VDMA channel > + * > + * Return: '1' if running, '0' if not. > + */ > +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan) > +{ > + return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) & > +XILINX_VDMA_DMASR_HALTED) && > + (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) & > +XILINX_VDMA_DMACR_RUNSTOP); > +} > + > +/** > + * xilinx_vdma_is_idle - Check if VDMA channel is idle > + * @chan: Driver specific VDMA channel > + * > + * Return: '1' if idle, '0' if not. > + */ > +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan) > +{ > + return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) & > + XILINX_VDMA_DMASR_IDLE; > +} > + > +/** > + * xilinx_vdma_halt - Halt VDMA channel > + * @chan: Driver specific VDMA channel > + */ > +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan) > +{ > + int loop = XILINX_VDMA_LOOP_COUNT + 1; > + > + vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP); > + > + /* Wait for the hardware to halt */ > + while (loop--) > + if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) & > + XILINX_VDMA_DMASR_HALTED) > + break; > + > + if (!loop) { > + dev_err(chan->dev, "Cannot stop channel %p: %x\n", > + chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR)); > + chan->err = true; > + } > + > + return; > +} > + > +/** > + * xilinx_vdma_start - Start VDMA channel > + * @chan: Driver specific VDMA channel > + */ > +static void xilinx_vdma_start(struct
Re: [PATCH v2] dma: Add Xilinx AXI Video Direct Memory Access Engine driver support
Hello, 2014/1/22 Srikanth Thokala stho...@xilinx.com: This is the driver for the AXI Video Direct Memory Access (AXI VDMA) core, which is a soft Xilinx IP core that provides high- bandwidth direct memory access between memory and AXI4-Stream type video target peripherals. The core provides efficient two dimensional DMA operations with independent asynchronous read and write channel operation. This module works on Zynq (ARM Based SoC) and Microblaze platforms. Signed-off-by: Srikanth Thokala stho...@xilinx.com --- Another two remarks, after you fixed them ( or not :-) ) you can have my: Reviewed-by: Levente Kurusa le...@linux.com Oh, and next time please if you post a patch that fixes something I pointed out, CC me as I had a hard time finding this patch, thanks. :-) NOTE: 1. Created a separate directory 'dma/xilinx' as Xilinx has two more DMA IPs and we are also planning to upstream these drivers soon. 2. Rebased on v3.13.0-rc8 Changes in v2: - Removed DMA Test client module from the patchset as suggested by Andy Shevchenko - Removed device-id DT property, as suggested by Arnd Bergmann - Properly documented DT bindings as suggested by Arnd Bergmann - Returning with error, if registration of DMA to node fails - Fixed typo errors - Used BIT() macro at applicable places - Added missing header file to the patchset - Changed copyright year to include 2014 --- .../devicetree/bindings/dma/xilinx/xilinx_vdma.txt | 75 + drivers/dma/Kconfig| 14 + drivers/dma/Makefile |1 + drivers/dma/xilinx/Makefile|1 + drivers/dma/xilinx/xilinx_vdma.c | 1486 include/linux/amba/xilinx_dma.h| 50 + 6 files changed, 1627 insertions(+) create mode 100644 Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt create mode 100644 drivers/dma/xilinx/Makefile create mode 100644 drivers/dma/xilinx/xilinx_vdma.c create mode 100644 include/linux/amba/xilinx_dma.h diff --git a/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt new file mode 100644 index 000..ab8be1a --- /dev/null +++ b/Documentation/devicetree/bindings/dma/xilinx/xilinx_vdma.txt @@ -0,0 +1,75 @@ +Xilinx AXI VDMA engine, it does transfers between memory and video devices. +It can be configured to have one channel or two channels. If configured +as two channels, one is to transmit to the video device and another is +to receive from the video device. + +Required properties: +- compatible: Should be xlnx,axi-vdma-1.00.a +- #dma-cells: Should be 1, see dmas property below +- reg: Should contain VDMA registers location and length. +- xlnx,num-fstores: Should be the number of framebuffers as configured in h/w. +- dma-channel child node: Should have atleast one channel and can have upto + two channels per device. This node specifies the properties of each + DMA channel (see child node properties below). + +Optional properties: +- xlnx,include-sg: Tells whether configured for Scatter-mode in + the hardware. [...] + +/** + * xilinx_vdma_is_running - Check if VDMA channel is running + * @chan: Driver specific VDMA channel + * + * Return: '1' if running, '0' if not. + */ +static int xilinx_vdma_is_running(struct xilinx_vdma_chan *chan) +{ + return !(vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) +XILINX_VDMA_DMASR_HALTED) + (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMACR) +XILINX_VDMA_DMACR_RUNSTOP); +} + +/** + * xilinx_vdma_is_idle - Check if VDMA channel is idle + * @chan: Driver specific VDMA channel + * + * Return: '1' if idle, '0' if not. + */ +static int xilinx_vdma_is_idle(struct xilinx_vdma_chan *chan) +{ + return vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) + XILINX_VDMA_DMASR_IDLE; +} + +/** + * xilinx_vdma_halt - Halt VDMA channel + * @chan: Driver specific VDMA channel + */ +static void xilinx_vdma_halt(struct xilinx_vdma_chan *chan) +{ + int loop = XILINX_VDMA_LOOP_COUNT + 1; + + vdma_ctrl_clr(chan, XILINX_VDMA_REG_DMACR, XILINX_VDMA_DMACR_RUNSTOP); + + /* Wait for the hardware to halt */ + while (loop--) + if (vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR) + XILINX_VDMA_DMASR_HALTED) + break; + + if (!loop) { + dev_err(chan-dev, Cannot stop channel %p: %x\n, + chan, vdma_ctrl_read(chan, XILINX_VDMA_REG_DMASR)); + chan-err = true; + } + + return; +} + +/** + * xilinx_vdma_start - Start VDMA channel + * @chan: Driver specific VDMA channel + */ +static void xilinx_vdma_start(struct xilinx_vdma_chan *chan) +{ + int loop = XILINX_VDMA_LOOP_COUNT + 1; + +