RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-07-07 Thread Raju, Sundaram
 -Original Message-
 From: Koul, Vinod [mailto:vinod.k...@intel.com]
 Sent: Thursday, June 16, 2011 11:15 AM
 To: Raju, Sundaram
 Cc: Russell King - ARM Linux; Linus Walleij; Dan; davinci-linux-open-
 sou...@linux.davincidsp.com; linux-omap@vger.kernel.org; linux-
 ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
 Subject: RE: [RFC] dmaengine: add new api for preparing simple slave transfer
 
 On Tue, 2011-06-14 at 12:12 +0530, Raju, Sundaram wrote:

snip

 Okay now things are more clear on what you are trying to do...

Vinod, 
Sorry for the delayed response. I was OOO and not well.
Thanks for going through the design documents and giving options.


 2) I think this can be achieved in two ways:
 a) you use current standard sg_list mechanism, the dmac driver parses
 the list and  programs the dma offsets into dmac
 Pros: you can use existing APIs, no changes to i/f.
 If dmac has this capability they program dmac accordingly
 Cons: you need to create sg-list in client driver


This option does not satisfy the requirements.
As Russell has pointed out

  
   The overall conclusion which I'm coming to is that we already support
   what you're asking for, but the problem is that we're using different
   (and I'd argue standard) terminology to describe what we have.
  
   The only issue which I see that we don't cover is the case where you want
   to describe a single buffer which is organised as N bytes to be 
   transferred,
   M following bytes to be skipped, N bytes to be transferred, M bytes to be
   skipped.  I doubt there are many controllers which can be programmed with
   both 'N' and 'M' parameters directly.
  
 
  Yes this is what I wanted to communicate and discuss in the list.
  Thanks for describing it in the standard terminology, and pointing me in the
  right direction.
 

The sg_list caters to all the parameters that a client driver has to pass
to the DMAC driver except for the STRIDE related info of skipping certain
bytes within a single buffer entry of the sg_list.

 
 b) create a new api to describe these offset values, something like:
 prep_buffer_offset(struct offset_description *buffer,.)
 I would not like to change the current API for this and rather have a
 new API for this, this should better then overriding current.
 

Yes, it would be better not to change the existing API.
This option seems good. But new API has to be added for this option.
Linus, had suggested something similar, but different,
Using the control API with a new dma_ctrl_cmd enum
defined for TI DMAC special configuration to be passed.
 
| Sundaram is this how your controller works?
| I mean the hardware can skip over sequences like this?
| 
| When we added the config interface to DMAengine I originally included
| a custom config call, but Dan wanted me to keep it out until we
| had some specific usecase for it. FSLDMA recently started
| to use it.
| 
| Notice how dmaengine_slave_config() is implemented:
| 
| static inline int dmaengine_slave_config(struct dma_chan *chan,
| struct dma_slave_config *config)
| {
|   return dmaengine_device_control(chan, DMA_SLAVE_CONFIG,
|   (unsigned long)config);
| }
| 
| So what is passed to the driver is just an unsigned long.
| 
| This is actually modeled to be ioctl()-like so you can pass in a
| custom config to the same callback on the device driver,
| just use some other enumerator than DMA_SLAVE_CONFIG,
| say like FSLDMA already does with FSLDMA_EXTERNAL_START.
| 
| Just put some enumerator in enum dma_ctrl_cmd in
| dmaengine.h such as SDMA_TEXAS_STRIDE_CONFIG and call
| like this:
| 
| /* However that config struct needs to look, basically */
| static struct sdma_ti_stride_cgf = {
|  take = M,
|  skip = N,
| };
| 
| ret = chan-device-device_control(chan, SDMA_TEXAS_STRIDE_CONFIG,
| sdma_ti_stride_cfg);
| 
| Or something like this.

I think this is better option than your 2b. This requires just an addition of
one more enum in the dma_ctrl_cmd. What do you think about this?
If Dan and you are okay with this I will send a small patch for this asap.

Regards,
Sundaram


Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-07-07 Thread Linus Walleij
2011/7/7 Raju, Sundaram sunda...@ti.com:

 | Just put some enumerator in enum dma_ctrl_cmd in
 | dmaengine.h such as SDMA_TEXAS_STRIDE_CONFIG and call
 | like this:
 |
 | /* However that config struct needs to look, basically */
 | static struct sdma_ti_stride_cgf = {
 |      take = M,
 |      skip = N,
 | };
 |
 | ret = chan-device-device_control(chan, SDMA_TEXAS_STRIDE_CONFIG,
 | sdma_ti_stride_cfg);
 |
 | Or something like this.

 I think this is better option than your 2b. This requires just an addition of
 one more enum in the dma_ctrl_cmd. What do you think about this?
 If Dan and you are okay with this I will send a small patch for this asap.

I think this is the best way to solve it atleast. It's clear what is
being done, and it's easy for a client trying to create such a slave
transfer to back out if it turns out that the DMAC in use does not
support this striding.

Send a patch out and I'll Ack it, FWIW.

Linus Walleij
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-16 Thread Koul, Vinod
On Tue, 2011-06-14 at 12:12 +0530, Raju, Sundaram wrote:
 snip
  
  The overall conclusion which I'm coming to is that we already support
  what you're asking for, but the problem is that we're using different
  (and I'd argue standard) terminology to describe what we have.
  
  The only issue which I see that we don't cover is the case where you want
  to describe a single buffer which is organised as N bytes to be transferred,
  M following bytes to be skipped, N bytes to be transferred, M bytes to be
  skipped.  I doubt there are many controllers which can be programmed with
  both 'N' and 'M' parameters directly.
 
 
 Yes this is what I wanted to communicate and discuss in the list.
 Thanks for describing it in the standard terminology, and pointing me in the 
 right direction.
 
 Also please note that if the gap between the buffers in a scatterlist is
 uniform and that can again be skipped by the DMAC automatically
 by programming that gap size, then that feature also is not available 
 in the current framework.
 
 I understand it can be done with the existing scatterlist, but just writing
 a value in a DMAC register is much better if available, than preparing 
 a scatterlist and passing to the API.
 
  
  Please, don't generate special cases.  Design proper APIs from the
  outset rather than abusing what's already there.  So no, don't abuse
  the address width stuff.
  
  In any case, the address width stuff must still be used to describe the
  peripherals FIFO register.
 
 
 I did not intend to abuse the existing address width. It might look 
 like that because of how I described it here. 
 I agree that the dma_slave_config is for peripheral related 
 properties to be stored. I was pointing out that the chunk size
 variable in the dma_buffer_config I proposed will be in most cases 
 equal to FIFO register width, to describe what I actually meant by
 chunk size.
 
  IIUC, you have a buffer with gaps in between (given by above params).
  Is your DMA h/w capable of parsing this buffer and directly do a
  transfer based on above parameters (without any work for SW), or you are
  doing this in your dma driver and then programming a list of buffers?
  In former case (although i haven't seen such dma controller yet), can
  you share the datasheet? It would make very little sense to change the
  current API for your extra special case. This special case needs to be
  handled differently rather than making rule out of it!!
  
 
 Yes, Vinod. This is directly handled in the DMAC h/w.
 
 This capability is present in 2 TI DMACs.
 EDMA (Enhanced DMA) in all TI DaVinci SoCs and the
 SDMA (System DMA) in all TI OMAP SoCs. The drivers of these
 controllers are present in the respective DaVinci tree and OMAP tree
 under the SoC folders.
 
 quote
  SDMA and EDMA are TI SoC specific DMA controllers. Their drivers have
  been maintained in the respective SoC folders till now.
  arch/arm/plat-omap/dma.c
  arch/arm/mach-davinci/dma.c
 /quote
 
 The Manual of the EDMA controller in TI DaVinci SoCs is available at
 http://www.ti.com/litv/pdf/sprue23d
 Section 2.2 in the page 23 explains how transfers are made based
 on the gaps programmed. It also explains how the 3D buffer is 
 internally split in EDMA based on the gaps programmed.
 
 The complete Reference manual of TI OMAP SoCs is available at
 http://www.ti.com/litv/pdf/spruf98s
 Chapter 9 in this document describes the SDMA controller.
 Section 9.4.3 in page 981 explains the various address modes,
 how the address is incremented by the DMAC and about the gaps 
 in between buffers and frames and how the DMAC handles them.
 
 Linus,
  
  Is it really so bad? It is a custom configuration after all. Even if
  there were many DMACs out there supporting it, we'd probably
  model it like this, just pass something like DMA_STRIDE_CONFIG
  instead of something named after a specific slave controller.
  
  This way DMACs that didn't support striding could NACK a
  transfer for device drivers requesting it and then it could figure
  out what to do.
  
  If we can get some indication as to whether more DMACs can
  do this kind of stuff, we'd maybe like to introduce
  DMA_STRIDE_CONFIG already now.
 
 
 I wanted to discuss this feature in the list and get this feature 
 added to the current dmaengine framework. If the current APIs
 can handle this feature, then its very good 
 and I will follow that only.
 
 If what you suggest is the right way to go then I am okay.
 IMHO the framework should always handle the complex case
 and individual implementations should implement a subset of
 the capability and hence I suggest the changes I posted to the list.
Okay now things are more clear on what you are trying to do...

1) The changes you need are to describe your buffer and convey to dmac
so please don't talk about slave here as that is specific to dmac config

2) I think this can be achieved in two ways:
a) you use current standard sg_list mechanism, the dmac driver parses
the list 

Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-14 Thread Linus Walleij
2011/6/14 Raju, Sundaram sunda...@ti.com:

 Yes, the hardware can skip over sequences like that.
 I also thought about your suggestion, at first before submitting the RFC.
 But I dint pursue this because, this ioctl() call has to be made before
 every single prepare call.

It's not an ioctl(), ioctl()s would be expensive since they involve
userspace to kernelspace context switches. It is just ioctl()-like.
It is a simple, fast function call.

 I may have to end up using this if we decide not to change the API
 signature for prepare APIs.

Is it really so bad? It is a custom configuration after all. Even if
there were many DMACs out there supporting it, we'd probably
model it like this, just pass something like DMA_STRIDE_CONFIG
instead of something named after a specific slave controller.

This way DMACs that didn't support striding could NACK a
transfer for device drivers requesting it and then it could figure
out what to do.

If we can get some indication as to whether more DMACs can
do this kind of stuff, we'd maybe like to introduce
DMA_STRIDE_CONFIG already now.

Thanks,
Linus Walleij
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-14 Thread Raju, Sundaram
Russell,

Thanks for all the quick pointers and the summary of how 
memory-to-peripheral transfers are expected to operate.

 Ok, what I'm envisioning is that your term chunk size means register
 width, and you view that as one dimension.  We already describe this.
 
 A frame is a collection of chunks.  That's already described - the number
 of chunks in a buffer is the buffer size divided by the chunk size.
 Buffers must be a multiple of the chunk size.
 
 Then we have a collection of frames.  These can be non-contiguous.
 That's effectively described by our scatterlist.

snip
 
 The overall conclusion which I'm coming to is that we already support
 what you're asking for, but the problem is that we're using different
 (and I'd argue standard) terminology to describe what we have.
 
 The only issue which I see that we don't cover is the case where you want
 to describe a single buffer which is organised as N bytes to be transferred,
 M following bytes to be skipped, N bytes to be transferred, M bytes to be
 skipped.  I doubt there are many controllers which can be programmed with
 both 'N' and 'M' parameters directly.


Yes this is what I wanted to communicate and discuss in the list.
Thanks for describing it in the standard terminology, and pointing me in the 
right direction.

Also please note that if the gap between the buffers in a scatterlist is
uniform and that can again be skipped by the DMAC automatically
by programming that gap size, then that feature also is not available 
in the current framework.

I understand it can be done with the existing scatterlist, but just writing
a value in a DMAC register is much better if available, than preparing 
a scatterlist and passing to the API.

 
 Please, don't generate special cases.  Design proper APIs from the
 outset rather than abusing what's already there.  So no, don't abuse
 the address width stuff.
 
 In any case, the address width stuff must still be used to describe the
 peripherals FIFO register.


I did not intend to abuse the existing address width. It might look 
like that because of how I described it here. 
I agree that the dma_slave_config is for peripheral related 
properties to be stored. I was pointing out that the chunk size
variable in the dma_buffer_config I proposed will be in most cases 
equal to FIFO register width, to describe what I actually meant by
chunk size.

 IIUC, you have a buffer with gaps in between (given by above params).
 Is your DMA h/w capable of parsing this buffer and directly do a
 transfer based on above parameters (without any work for SW), or you are
 doing this in your dma driver and then programming a list of buffers?
 In former case (although i haven't seen such dma controller yet), can
 you share the datasheet? It would make very little sense to change the
 current API for your extra special case. This special case needs to be
 handled differently rather than making rule out of it!!
 

Yes, Vinod. This is directly handled in the DMAC h/w.

This capability is present in 2 TI DMACs.
EDMA (Enhanced DMA) in all TI DaVinci SoCs and the
SDMA (System DMA) in all TI OMAP SoCs. The drivers of these
controllers are present in the respective DaVinci tree and OMAP tree
under the SoC folders.

quote
 SDMA and EDMA are TI SoC specific DMA controllers. Their drivers have
 been maintained in the respective SoC folders till now.
 arch/arm/plat-omap/dma.c
 arch/arm/mach-davinci/dma.c
/quote

The Manual of the EDMA controller in TI DaVinci SoCs is available at
http://www.ti.com/litv/pdf/sprue23d
Section 2.2 in the page 23 explains how transfers are made based
on the gaps programmed. It also explains how the 3D buffer is 
internally split in EDMA based on the gaps programmed.

The complete Reference manual of TI OMAP SoCs is available at
http://www.ti.com/litv/pdf/spruf98s
Chapter 9 in this document describes the SDMA controller.
Section 9.4.3 in page 981 explains the various address modes,
how the address is incremented by the DMAC and about the gaps 
in between buffers and frames and how the DMAC handles them.

Linus,
 
 Is it really so bad? It is a custom configuration after all. Even if
 there were many DMACs out there supporting it, we'd probably
 model it like this, just pass something like DMA_STRIDE_CONFIG
 instead of something named after a specific slave controller.
 
 This way DMACs that didn't support striding could NACK a
 transfer for device drivers requesting it and then it could figure
 out what to do.
 
 If we can get some indication as to whether more DMACs can
 do this kind of stuff, we'd maybe like to introduce
 DMA_STRIDE_CONFIG already now.


I wanted to discuss this feature in the list and get this feature 
added to the current dmaengine framework. If the current APIs
can handle this feature, then its very good 
and I will follow that only.

If what you suggest is the right way to go then I am okay.
IMHO the framework should always handle the complex case
and individual implementations should 

Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-13 Thread Linus Walleij
On Fri, Jun 10, 2011 at 3:33 PM, Russell King - ARM Linux
li...@arm.linux.org.uk wrote:
 On Fri, Jun 10, 2011 at 05:18:46PM +0530, Raju, Sundaram wrote:
 Now DMACs capable of 3D transfer, do transfer of the whole 1D
 buffer per sync received or even whole 2D buffer per sync received
 (based on the sync rate programmed in the DMAC).

 The only issue which I see that we don't cover is the case where you want
 to describe a single buffer which is organised as N bytes to be transferred,
 M following bytes to be skipped, N bytes to be transferred, M bytes to be
 skipped.  I doubt there are many controllers which can be programmed with
 both 'N' and 'M' parameters directly.

Sundaram is this how your controller works?
I mean the hardware can skip over sequences like this?

When we added the config interface to DMAengine I originally included
a custom config call, but Dan wanted me to keep it out until we
had some specific usecase for it. FSLDMA recently started
to use it.

Notice how dmaengine_slave_config() is implemented:

static inline int dmaengine_slave_config(struct dma_chan *chan,
  struct dma_slave_config *config)
{
return dmaengine_device_control(chan, DMA_SLAVE_CONFIG,
(unsigned long)config);
}

So what is passed to the driver is just an unsigned long.

This is actually modeled to be ioctl()-like so you can pass in a
custom config to the same callback on the device driver,
just use some other enumerator than DMA_SLAVE_CONFIG,
say like FSLDMA already does with FSLDMA_EXTERNAL_START.

Just put some enumerator in enum dma_ctrl_cmd in
dmaengine.h such as SDMA_TEXAS_STRIDE_CONFIG and call
like this:

/* However that config struct needs to look, basically */
static struct sdma_ti_stride_cgf = {
 take = M,
 skip = N,
};

ret = chan-device-device_control(chan, SDMA_TEXAS_STRIDE_CONFIG,
sdma_ti_stride_cfg);

Or something like this.

Thanks,
Linus Walleij
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-13 Thread Raju, Sundaram
Linus,

Thanks for the pointers.

 -Original Message-
 From: Linus Walleij [mailto:linus.wall...@linaro.org]
 Sent: Monday, June 13, 2011 7:43 PM
 To: Raju, Sundaram
 Cc: Russell King - ARM Linux; Koul, Vinod; Dan; davinci-linux-open-
 sou...@linux.davincidsp.com; linux-omap@vger.kernel.org; linux-
 ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
 Subject: Re: [RFC] dmaengine: add new api for preparing simple slave transfer
 
 On Fri, Jun 10, 2011 at 3:33 PM, Russell King - ARM Linux
 li...@arm.linux.org.uk wrote:
  On Fri, Jun 10, 2011 at 05:18:46PM +0530, Raju, Sundaram wrote:
  Now DMACs capable of 3D transfer, do transfer of the whole 1D
  buffer per sync received or even whole 2D buffer per sync received
  (based on the sync rate programmed in the DMAC).
 
  The only issue which I see that we don't cover is the case where you want
  to describe a single buffer which is organised as N bytes to be transferred,
  M following bytes to be skipped, N bytes to be transferred, M bytes to be
  skipped.  I doubt there are many controllers which can be programmed with
  both 'N' and 'M' parameters directly.
 
 Sundaram is this how your controller works?
 I mean the hardware can skip over sequences like this?
 
 When we added the config interface to DMAengine I originally included
 a custom config call, but Dan wanted me to keep it out until we
 had some specific usecase for it. FSLDMA recently started
 to use it.
 
 Notice how dmaengine_slave_config() is implemented:
 
 static inline int dmaengine_slave_config(struct dma_chan *chan,
 struct dma_slave_config *config)
 {
   return dmaengine_device_control(chan, DMA_SLAVE_CONFIG,
   (unsigned long)config);
 }
 
 So what is passed to the driver is just an unsigned long.
 
 This is actually modeled to be ioctl()-like so you can pass in a
 custom config to the same callback on the device driver,
 just use some other enumerator than DMA_SLAVE_CONFIG,
 say like FSLDMA already does with FSLDMA_EXTERNAL_START.
 
 Just put some enumerator in enum dma_ctrl_cmd in
 dmaengine.h such as SDMA_TEXAS_STRIDE_CONFIG and call
 like this:
 
 /* However that config struct needs to look, basically */
 static struct sdma_ti_stride_cgf = {
  take = M,
  skip = N,
 };
 
 ret = chan-device-device_control(chan, SDMA_TEXAS_STRIDE_CONFIG,
 sdma_ti_stride_cfg);
 
 Or something like this.

Yes, the hardware can skip over sequences like that.
I also thought about your suggestion, at first before submitting the RFC.
But I dint pursue this because, this ioctl() call has to be made before
every single prepare call.

I may have to end up using this if we decide not to change the API
signature for prepare APIs.

I actually intend to use this for all DMAC related ioctl(). Configuring
the DMAC and programming various modes etc specific to the
DMAC. I suppose this is the only way to do it.
Let me know if there is any other way to do it.

Thanks,
Sundaram
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Koul, Vinod
 and not as a trivial case of a sg list, 
 this will be useful. 
 I know this can be done using 2 different implementations inside the 
 device_prep_slave_sg itself, but I think it's cleaner to have 
 different APIs. I have provided the generic buffer configuration 
 I can think of, here and also a new API definition, 
 if it makes sense to have one. This buffer configuration might not 
 be completely generic, and hence I ask all of you to please provide 
 comments to improve it. 
 
 Generic buffer description: 
 A generic buffer can be split into number of frames which contain 
 number of chunks inside them. The frames need not be contiguous, 
 nor do the chunks inside a frame. 
 
 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 0 
 ---
  |Inter Frame Gap  | 
 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 1 
 ---
  |Inter Frame Gap  | 
 ---
  | | 
 ---
  |Inter Frame Gap  | 
 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame m 
 ---
 
 Note: ICG = Inter Chunk Gap.
 
 struct dma_buffer_config {
   u32 chunk_size; /* number of bytes in a chunk */
   u32 frame_size; /* number of chunks in a frame */
   /* u32 n_frames; number of frames in the buffer */
   u32 inter_chunk_gap; /* number of bytes between end of a chunk 
   and the start of the next chunk */ 
   u32 inter_frame_gap; /* number of bytes between end of a frame 
   and the start of the next frame */ 
   bool sync_rate; /* 0 - a sync event is required from the 
   peripheral to transfer a chunk 
   1 - a sync event is required from the 
   peripheral to transfer a frame */ 
 };
 
 The patch to add a new API for single buffer transfers alone: 
 diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
 @@ -491,6 +492,10 @@ struct dma_device {
   struct dma_chan *chan, struct scatterlist *sgl,
   unsigned int sg_len, enum dma_data_direction direction,
   unsigned long flags);
 + struct dma_async_tx_descriptor *(*device_prep_slave)(
 + struct dma_chan *chan, dma_addr_t buf_addr,
 + unsigned int buf_len, void *buf_prop,
 + enum dma_data_direction direction, unsigned long flags);
   struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
   struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
   size_t period_len, enum dma_data_direction direction);
 
 The number of frames (n_frames) can always be determined using all 
 other values in the dma_buffer_config along with the 
 buffer length (buf_len) passed in the API. 
 n_frames = buf_len / (chunk_sixe * frame_size); 
 
 Regards, 
 Sundaram
 
 -Original Message-
 From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk] 
 Sent: Thursday, June 09, 2011 6:17 PM
 To: Raju, Sundaram
 Cc: linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org; 
 davinci-linux-open-sou...@linux.davincidsp.com; linux-omap@vger.kernel.org
 Subject: Re: [RFC] dmaengine: add new api for preparing simple slave transfer
 
 Can you please re-post with sensible wrapping at or before column 72.
 I'm not manually reformatting your entire message just so I can find
 the relevant bits to reply to.
 
 Thanks.
 
 ___
 linux-arm-kernel mailing list
 linux-arm-ker...@lists.infradead.org
 http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


-- 
~Vinod

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Koul, Vinod
On Thu, 2011-06-09 at 17:32 +0100, Russell King - ARM Linux wrote:
 On Thu, Jun 09, 2011 at 09:31:56PM +0530, Raju, Sundaram wrote:
  Here it is, with proper line wrapping;
 
 Thanks.  This is much easier to reply to.
 
  I believe that even though the dmaengine framework addresses and 
  supports most of the required use cases of a client driver to a DMA 
  controller, some extensions are required in it to make it still more 
  generic.
  
  Current framework contains two APIs to prepare for slave transfers: 
  
  struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
  struct dma_chan *chan, struct scatterlist *sgl,
  unsigned int sg_len, enum dma_data_direction direction,
  unsigned long flags);
  
  struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
  struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
  size_t period_len, enum dma_data_direction direction);
  
  and one control API. 
  int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
  unsigned long arg);
  
  A simple single buffer transfer (i.e. non sg transfer) can be done only
  as a trivial case of the device_prep_slave_sg API. The client driver is
  expected to prepare a sg list and provide to the dmaengine API for that
  single buffer also.
 
 We can avoid preparing a sg list in every driver which wants this by
 having dmaengine.h provide a helper for this case:
 
 static inline dma_async_tx_descriptor *dmaengine_prep_slave_single(
   struct dma_chan *chan, void *buf, size_t len,
   enum dma_data_direction dir, unsigned long flags)
 {
   struct scatterlist sg;
 
   sg_init_one(sg, buf, len);
 
   return chan-device-device_prep_slave_sg(chan, sg, 1, dir, flags);
 }
That sounds good...

 
 I think also providing dmaengine_prep_slave_sg() and
 dmaengine_prep_dma_cyclic() as wrappers to hide the chan-device-blah
 bit would also be beneficial (and helps keep that detail out of the
 users of DMA engine.)
 
  In a slave transfer, the client has to define all the buffer related 
  attributes and the peripheral related attributes. 
 
 I'd argue that it is incorrect to have drivers specify the buffer
 related attributes - that makes the API unnecessarily complicated
 to use, requiring two calls (one to configure the channel, and one
 to prepare the transfer) each time it needs to be used.
 
 Not only that but the memory side of the transfer should be no business
 of the driver - the driver itself should only specify the attributes
 for the device being driven.  The attributes for the memory side of the
 transfer should be a property of the DMA engine itself.
 
 I would like to see in the long term the dma_slave_config structure
 lose its 'direction' argument, and the rest of the parameters used to
 define the device side parameters only.
I am not sure why direction flag is required and can be done away with.
Both sg and cyclic API have a direction parameter and that should be
used. A channel can be used in any direction client wishes to.
 
 This will allow the channel to be configured once when its requested,
 and then the prepare call can configure the direction as required.
 
  Now coming to the buffer related attributes, sg list is a nice way to 
  represent a disjoint buffer; all the offload engines in drivers/dma 
  create a descriptor for each of the contiguous chunk in the sg list 
  buffer and pass it to the controller. 
 
 The sg list is the standard Linux way to describe scattered buffers.
 
  But many a times a client may want to transfer from a single buffer to
  the peripheral and most of the DMA controllers have the capability to 
  transfer data in chunks/frames directly at a stretch. 
 
 So far, I've only seen DMA controllers which operate on a linked list of
 source, destination, length, attributes, and next entry pointer.
 
  All the attributes of a buffer, which are required for the transfer 
  can be programmed in single descriptor and submitted to the 
  DMA controller. 
 
 I'm not sure that this is useful - in order to make use of the data, the
 data would have to be copied in between the descriptors - and doesn't that
 rather negate the point of DMA if you have to memcpy() the data around?
 
 Isn't it far more efficient to have DMA place the data exactly where it's
 required in memory right from the start without any memcpy() operations?
 
 Can you explain where and how you would use something like this:
 
  ---
   | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 0 
  ---
   |  Inter Frame Gap  | 
  ---
   | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 1 
  ---
   |  Inter Frame Gap 

RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Raju, Sundaram
I think I should have tried to explain my case with a specific example.
I tried to generalize it and it has confused and misled every one.
I have tried again now. :)

snip
   A simple single buffer transfer (i.e. non sg transfer) can be done only
   as a trivial case of the device_prep_slave_sg API. The client driver is
   expected to prepare a sg list and provide to the dmaengine API for that
   single buffer also.
  
  We can avoid preparing a sg list in every driver which wants this by
  having dmaengine.h provide a helper for this case:
  
  static inline dma_async_tx_descriptor *dmaengine_prep_slave_single(
  struct dma_chan *chan, void *buf, size_t len,
  enum dma_data_direction dir, unsigned long flags)
  {
  struct scatterlist sg;
  
  sg_init_one(sg, buf, len);
  
  return chan-device-device_prep_slave_sg(chan, sg, 1, dir, flags);
  }
 That sounds good...

Yes, this should be sufficient if the only aim is to avoid preparing a sg list
in every driver which wants to transfer a single buffer.

But my main aim was to add more buffer related details to the API.
I have tried to explain a use case below, where this will be useful.

snip
 
  I think also providing dmaengine_prep_slave_sg() and
  dmaengine_prep_dma_cyclic() as wrappers to hide the chan- device-blah
  bit would also be beneficial (and helps keep that detail out of the
  users of DMA engine.)
 

Yes, this would really be helpful if someone is just looking at the 
client drivers and are not familiar with the dmaengine's 
internal data structures.

   In a slave transfer, the client has to define all the buffer related
   attributes and the peripheral related attributes.
 
  I'd argue that it is incorrect to have drivers specify the buffer
  related attributes - that makes the API unnecessarily complicated
  to use, requiring two calls (one to configure the channel, and one
  to prepare the transfer) each time it needs to be used.

I am not able to understand why 2 calls will be required?
Client configures the slave using dma_slave_config only once.
If the direction flag is removed then, this configuration doesn’t
have to be modified at all.

quote
 So I think the slave transfer API must also have a provision to pass 
 the buffer configuration. The buffer attributes have to be passed 
 directly as an argument in the prepare API, unlike dma_slave_config 
 as they will be different for each buffer that is going to be 
 transferred.
/quote

The API is called only once in the current framework to prepare 
the descriptor.
After adding any extra arguments in the prepare API, client will
still have to call it only once.

 
  Not only that but the memory side of the transfer should be no business
  of the driver - the driver itself should only specify the attributes
  for the device being driven.  The attributes for the memory side of the
  transfer should be a property of the DMA engine itself.
 

Yes Russell, you are correct. Attributes of the memory side should
be a property of DMA engine itself.

What is meant here is that, client has told the DMAC how to 
write to/ read from the slave peripheral by defining all the slave
properties in the dma_slave_config.

It is assumed that the buffer side has to be read/written to 
byte after byte continuously by the DMAC. 
I wanted to say client must have provisions to pass the details 
on how it wants the DMAC to read/ write to the 
buffer, if the capability is available in the DMAC.

DMACs have the capability to auto increment the 
source/destination address accordingly after transferring
a byte of data and they automatically read/write to the
next byte location. In some DMACs you can program an offset also. 
They read x bytes, skip y bytes and read the next x bytes.
This detail will be passed to the client driver by the application.
Then it is for the slave driver to communicate this to the
DMA engine, right?

  I would like to see in the long term the dma_slave_config structure
  lose its 'direction' argument, and the rest of the parameters used to
  define the device side parameters only.
 I am not sure why direction flag is required and can be done away with.
 Both sg and cyclic API have a direction parameter and that should be
 used. A channel can be used in any direction client wishes to.

I also agree. The direction flag in the dma_slave_config is redundant.
As Russell had already mentioned in another thread, this redundancy
forces DMAC drivers to check if the direction flag in the 
dma_slave_config is same as that passed to the prepare API.

Also client drivers have to keep modifying the dma_slave_config
every time before they make a direction change in transfer.

 
   But many a times a client may want to transfer from a single buffer to
   the peripheral and most of the DMA controllers have the capability to
   transfer data in chunks/frames directly at a stretch.
 
  So far, I've only seen DMA controllers which operate on a linked list of
  source, destination, length, 

Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Russell King - ARM Linux
On Fri, Jun 10, 2011 at 03:51:41PM +0530, Raju, Sundaram wrote:
 Consider a simple video use case of de-interlacing a video buffer.
 Odd lines have to be transferred first, and then the even lines are 
 transferred separately. This can be done by programming the 
 inter frame gap as the line size of the video buffer in the DMAC.
 This will require you to have only 2 descriptors, one for the
 odd lines and another for the even lines. This results in only
 2 descriptors being written to DMAC registers.

How would this be handled with DMACs which can't 'skip' bytes in the
buffer?  You would have to generate a list of LLIs separately
describing each 'line' of video and chain them together.

How do you handle the situation where a driver uses your new proposed
API, but it doesn't support that in hardware.

 Actually we can deduce the chunk_size from the 
 dma_slave_config itself. It is either the src_addr_width or
 dst_addr_width based on the direction. Because at a stretch
 DMAC cannot transfer more than the slave register width.

I think you're misinterpreting those fields.  (dst|src)_addr_width tells
the DMA controller the width of each transaction - whether to issue a
byte, half-word, word or double-word read or write to the peripheral.
It doesn't say how many of those to issue, it just says what the
peripheral access size is to be.

In other words, they describe the width of the FIFO register.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Raju, Sundaram
Vinod,

 -Original Message-
 From: Koul, Vinod [mailto:vinod.k...@intel.com]
 Sent: Friday, June 10, 2011 11:39 AM
 To: Raju, Sundaram; Dan
 Cc: Russell King - ARM Linux; davinci-linux-open-
 sou...@linux.davincidsp.com; linux-omap@vger.kernel.org; linux-
 ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org
 Subject: RE: [RFC] dmaengine: add new api for preparing simple slave transfer
 

snip
 
  The above 2 APIs in the dmaengine framework expect all the
  peripheral/slave related attributes to be filled in the
  dma_slave_config data structure.
  struct dma_slave_config {
  enum dma_data_direction direction;
  dma_addr_t src_addr;
  dma_addr_t dst_addr;
  enum dma_slave_buswidth src_addr_width;
  enum dma_slave_buswidth dst_addr_width;
  u32 src_maxburst;
  u32 dst_maxburst;
  };
 
  This data structure is passed to the offload engine via the dma_chan
  data structure in its private pointer.
 No, this is passed thru control API you described above. Please read
 Documentation/dmaengine.txt

Yes, Vinod its my mistake. I wanted to say control API only, 
but just mixed it up with how the dma_slave_config is attached 
to each dma_chan so that the offload drivers can use it while 
preparing the descriptors.

 
  Now coming to the buffer related attributes, sg list is a nice way to
  represent a disjoint buffer; all the offload engines in drivers/dma
  create a descriptor for each of the contiguous chunk in the sg list
  buffer and pass it to the controller.
 
  But many a times a client may want to transfer from a single buffer to
  the peripheral and most of the DMA controllers have the capability to
  transfer data in chunks/frames directly at a stretch.
  All the attributes of a buffer, which are required for the transfer
  can be programmed in single descriptor and submitted to the
  DMA controller.
 
  So I think the slave transfer API must also have a provision to pass
  the buffer configuration. The buffer attributes have to be passed
  directly as an argument in the prepare API, unlike dma_slave_config
  as they will be different for each buffer that is going to be
  transferred.
 Can you articulate what attributes you are talking about. The
 dma_slave_config parameters don't represent buffer attributes. They
 represent the dma attributes on how you want to transfer. These things
 like bus widths, burst lengths are usually constant for the slave
 transfers, not sure why they should change per buffer?
 

I have tried to explain the attributes in the previous mail 
I posted in this thread.

Yes, buffer attributes should not be represented in 
the dma_slave_config. It is for slave related data.
That is why had mentioned that buffer configuration should be
a separate structure and passed in the prepare API.
See quoted below:

quote
 struct dma_buffer_config {
   u32 chunk_size; /* number of bytes in a chunk */
   u32 frame_size; /* number of chunks in a frame */
   /* u32 n_frames; number of frames in the buffer */
   u32 inter_chunk_gap; /* number of bytes between end of a chunk 
   and the start of the next chunk */ 
   u32 inter_frame_gap; /* number of bytes between end of a frame 
   and the start of the next frame */ 
   bool sync_rate; /* 0 - a sync event is required from the 
   peripheral to transfer a chunk 
   1 - a sync event is required from the 
   peripheral to transfer a frame */ 
 };
 
 The patch to add a new API for single buffer transfers alone: 
 diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
 @@ -491,6 +492,10 @@ struct dma_device {
   struct dma_chan *chan, struct scatterlist *sgl,
   unsigned int sg_len, enum dma_data_direction direction,
   unsigned long flags);
 + struct dma_async_tx_descriptor *(*device_prep_slave)(
 + struct dma_chan *chan, dma_addr_t buf_addr,
 + unsigned int buf_len, void *buf_prop,
 + enum dma_data_direction direction, unsigned long flags);
   struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
   struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
   size_t period_len, enum dma_data_direction direction);

/quote

Regards,
Sundaram
N�r��yb�X��ǧv�^�)޺{.n�+{��f��{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj��!�i

RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Raju, Sundaram
Russell,

 -Original Message-
 From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk]
 Sent: Friday, June 10, 2011 4:13 PM
 To: Raju, Sundaram
 Cc: Koul, Vinod; Dan; davinci-linux-open-sou...@linux.davincidsp.com; linux-
 o...@vger.kernel.org; linux-ker...@vger.kernel.org; linux-arm-
 ker...@lists.infradead.org
 Subject: Re: [RFC] dmaengine: add new api for preparing simple slave transfer
 
 On Fri, Jun 10, 2011 at 03:51:41PM +0530, Raju, Sundaram wrote:
  Consider a simple video use case of de-interlacing a video buffer.
  Odd lines have to be transferred first, and then the even lines are
  transferred separately. This can be done by programming the
  inter frame gap as the line size of the video buffer in the DMAC.
  This will require you to have only 2 descriptors, one for the
  odd lines and another for the even lines. This results in only
  2 descriptors being written to DMAC registers.
 
 How would this be handled with DMACs which can't 'skip' bytes in the
 buffer?  You would have to generate a list of LLIs separately
 describing each 'line' of video and chain them together.

Correct.

 
 How do you handle the situation where a driver uses your new proposed
 API, but it doesn't support that in hardware.
 

It should be handled the same way how a sg buffer is handled, 
if LLI chaining feature is not available in the hardware. 

It has to be submitted as a queue of separate descriptors to
the DMAC, where each descriptor will be processed after the 
completion of the previous descriptor.

  Actually we can deduce the chunk_size from the
  dma_slave_config itself. It is either the src_addr_width or
  dst_addr_width based on the direction. Because at a stretch
  DMAC cannot transfer more than the slave register width.
 
 I think you're misinterpreting those fields.  (dst|src)_addr_width tells
 the DMA controller the width of each transaction - whether to issue a
 byte, half-word, word or double-word read or write to the peripheral.
 It doesn't say how many of those to issue, it just says what the
 peripheral access size is to be.
 
 In other words, they describe the width of the FIFO register.

Yes correct, I was just giving an example for considering or
understanding a buffer in 3D and how each dimension should be.

chunk_size = (src|dst)_addr_width, for a special case,
i.e, if DMAC is programmed to transfer the entire 1D buffer
per sync event received from the peripheral.

In slave transfer, the peripheral is going to give a sync event to 
DMAC when the FIFO register is full|empty.

Now DMACs capable of 3D transfer, do transfer of the whole 1D
buffer per sync received or even whole 2D buffer per sync received
(based on the sync rate programmed in the DMAC).

So the DMAC has to be programmed for a 1D size (i.e. chunk size)
equal to that of the width of the FIFO register if the sync rate
programmed in DMAC is per chunk.  

Regards,
Sundaram
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Russell King - ARM Linux
On Fri, Jun 10, 2011 at 05:18:46PM +0530, Raju, Sundaram wrote:
 Russell,
 
  How do you handle the situation where a driver uses your new proposed
  API, but it doesn't support that in hardware.
 
 It should be handled the same way how a sg buffer is handled, 
 if LLI chaining feature is not available in the hardware. 

No, if the LLI chaining feature is available in hardware and your special
'skip' feature is not, then you have to generate a set of LLIs for the
hardware to walk through to 'emulate' the skip feature.

If you have no LLI support, then you need to program each transfer manually
into the DMA controller.

   Actually we can deduce the chunk_size from the
   dma_slave_config itself. It is either the src_addr_width or
   dst_addr_width based on the direction. Because at a stretch
   DMAC cannot transfer more than the slave register width.
  
  I think you're misinterpreting those fields.  (dst|src)_addr_width tells
  the DMA controller the width of each transaction - whether to issue a
  byte, half-word, word or double-word read or write to the peripheral.
  It doesn't say how many of those to issue, it just says what the
  peripheral access size is to be.
  
  In other words, they describe the width of the FIFO register.
 
 Yes correct, I was just giving an example for considering or
 understanding a buffer in 3D and how each dimension should be.
 
 chunk_size = (src|dst)_addr_width, for a special case,
 i.e, if DMAC is programmed to transfer the entire 1D buffer
 per sync event received from the peripheral.

Please, don't generate special cases.  Design proper APIs from the
outset rather than abusing what's already there.  So no, don't abuse
the address width stuff.

In any case, the address width stuff must still be used to describe the
peripherals FIFO register.

 In slave transfer, the peripheral is going to give a sync event to 
 DMAC when the FIFO register is full|empty.

'sync event' - peripherals give 'request' signals to the DMAC asking
them to transfer some more data only, and the DMAC gives an acknowledge
signal back so that the peripheral knows that the DMAC is giving them
data.  In the generic case, they typically have no further signalling.

When the DMAC sees an active request, it will attempt to transfer up
to the minimum of (burst size, number of transfers remaining) to the
peripheral in one go.

 Now DMACs capable of 3D transfer, do transfer of the whole 1D
 buffer per sync received or even whole 2D buffer per sync received
 (based on the sync rate programmed in the DMAC).

Ok, what I'm envisioning is that your term chunk size means register
width, and you view that as one dimension.  We already describe this.

A frame is a collection of chunks.  That's already described - the number
of chunks in a buffer is the buffer size divided by the chunk size.
Buffers must be a multiple of the chunk size.

Then we have a collection of frames.  These can be non-contiguous.
That's effectively described by our scatterlist.

 So the DMAC has to be programmed for a 1D size (i.e. chunk size)
 equal to that of the width of the FIFO register if the sync rate
 programmed in DMAC is per chunk.  

This appears to be a circular definition, and makes little sense to me.

The overall conclusion which I'm coming to is that we already support
what you're asking for, but the problem is that we're using different
(and I'd argue standard) terminology to describe what we have.

The only issue which I see that we don't cover is the case where you want
to describe a single buffer which is organised as N bytes to be transferred,
M following bytes to be skipped, N bytes to be transferred, M bytes to be
skipped.  I doubt there are many controllers which can be programmed with
both 'N' and 'M' parameters directly.

Let me finish with a summary of how memory-to-peripheral transfers are
expected to operate:
1. The DMA controller reads data from system RAM using whatever parameters
   are most suitable for it to use up to the current buffer size.

2. When the DMA request goes active, it transfers data that it read
   previously to the device in dest_addr_width chunks of data.  It
   transfers the minimum of the remaining data or the burst size.

3. The peripheral, receiving data and filling its FIFO, drops the DMA
   request.

4. The DMA controller then reads more data from system RAM in the same
   way in (1).

5. If at any point the DMA controller reaches the end of the buffer, and
   as a link to the next buffer, it immediately moves to the next buffer
   and starts fetching data from that new buffer.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-10 Thread Koul, Vinod
On Fri, 2011-06-10 at 16:43 +0530, Raju, Sundaram wrote:
 Vinod,

...
  
   Now coming to the buffer related attributes, sg list is a nice way to
   represent a disjoint buffer; all the offload engines in drivers/dma
   create a descriptor for each of the contiguous chunk in the sg list
   buffer and pass it to the controller.
  
   But many a times a client may want to transfer from a single buffer to
   the peripheral and most of the DMA controllers have the capability to
   transfer data in chunks/frames directly at a stretch.
   All the attributes of a buffer, which are required for the transfer
   can be programmed in single descriptor and submitted to the
   DMA controller.
  
   So I think the slave transfer API must also have a provision to pass
   the buffer configuration. The buffer attributes have to be passed
   directly as an argument in the prepare API, unlike dma_slave_config
   as they will be different for each buffer that is going to be
   transferred.
  Can you articulate what attributes you are talking about. The
  dma_slave_config parameters don't represent buffer attributes. They
  represent the dma attributes on how you want to transfer. These things
  like bus widths, burst lengths are usually constant for the slave
  transfers, not sure why they should change per buffer?
  
 
 I have tried to explain the attributes in the previous mail 
 I posted in this thread.
 
 Yes, buffer attributes should not be represented in 
 the dma_slave_config. It is for slave related data.
 That is why had mentioned that buffer configuration should be
 a separate structure and passed in the prepare API.
 See quoted below:
 
 quote
  struct dma_buffer_config {
  u32 chunk_size; /* number of bytes in a chunk */
  u32 frame_size; /* number of chunks in a frame */
  /* u32 n_frames; number of frames in the buffer */
  u32 inter_chunk_gap; /* number of bytes between end of a chunk 
  and the start of the next chunk */ 
  u32 inter_frame_gap; /* number of bytes between end of a frame 
  and the start of the next frame */ 
  bool sync_rate; /* 0 - a sync event is required from the 
  peripheral to transfer a chunk 
  1 - a sync event is required from the 
  peripheral to transfer a frame */ 
  };
IIUC, you have a buffer with gaps in between (given by above params).
Is your DMA h/w capable of parsing this buffer and directly do a
transfer based on above parameters (without any work for SW), or you are
doing this in your dma driver and then programming a list of buffers?
In former case (although i haven't seen such dma controller yet), can
you share the datasheet? It would make very little sense to change the
current API for your extra special case. This special case needs to be
handled differently rather than making rule out of it!!

And in latter case you are really going the wrong path as Russel pointed
out and trying to abuse the APIs


-- 
~Vinod

--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-09 Thread Russell King - ARM Linux
Can you please re-post with sensible wrapping at or before column 72.
I'm not manually reformatting your entire message just so I can find
the relevant bits to reply to.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-09 Thread Raju, Sundaram
. 

---
 | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 0 
---
 |  Inter Frame Gap  | 
---
 | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 1 
---
 |  Inter Frame Gap  | 
---
 |   | 
---
 |  Inter Frame Gap  | 
---
 | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame m 
---

Note: ICG = Inter Chunk Gap.

struct dma_buffer_config {
u32 chunk_size; /* number of bytes in a chunk */
u32 frame_size; /* number of chunks in a frame */
/* u32 n_frames; number of frames in the buffer */
u32 inter_chunk_gap; /* number of bytes between end of a chunk 
and the start of the next chunk */ 
u32 inter_frame_gap; /* number of bytes between end of a frame 
and the start of the next frame */ 
bool sync_rate; /* 0 - a sync event is required from the 
peripheral to transfer a chunk 
1 - a sync event is required from the 
peripheral to transfer a frame */ 
};

The patch to add a new API for single buffer transfers alone: 
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
@@ -491,6 +492,10 @@ struct dma_device {
struct dma_chan *chan, struct scatterlist *sgl,
unsigned int sg_len, enum dma_data_direction direction,
unsigned long flags);
+   struct dma_async_tx_descriptor *(*device_prep_slave)(
+   struct dma_chan *chan, dma_addr_t buf_addr,
+   unsigned int buf_len, void *buf_prop,
+   enum dma_data_direction direction, unsigned long flags);
struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
size_t period_len, enum dma_data_direction direction);

The number of frames (n_frames) can always be determined using all 
other values in the dma_buffer_config along with the 
buffer length (buf_len) passed in the API. 
n_frames = buf_len / (chunk_sixe * frame_size); 

Regards, 
Sundaram

-Original Message-
From: Russell King - ARM Linux [mailto:li...@arm.linux.org.uk] 
Sent: Thursday, June 09, 2011 6:17 PM
To: Raju, Sundaram
Cc: linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org; 
davinci-linux-open-sou...@linux.davincidsp.com; linux-omap@vger.kernel.org
Subject: Re: [RFC] dmaengine: add new api for preparing simple slave transfer

Can you please re-post with sensible wrapping at or before column 72.
I'm not manually reformatting your entire message just so I can find
the relevant bits to reply to.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-09 Thread Russell King - ARM Linux
On Thu, Jun 09, 2011 at 09:31:56PM +0530, Raju, Sundaram wrote:
 Here it is, with proper line wrapping;

Thanks.  This is much easier to reply to.

 I believe that even though the dmaengine framework addresses and 
 supports most of the required use cases of a client driver to a DMA 
 controller, some extensions are required in it to make it still more 
 generic.
 
 Current framework contains two APIs to prepare for slave transfers: 
 
 struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
   struct dma_chan *chan, struct scatterlist *sgl,
   unsigned int sg_len, enum dma_data_direction direction,
   unsigned long flags);
 
 struct dma_async_tx_descriptor *(*device_prep_dma_cyclic)(
   struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
   size_t period_len, enum dma_data_direction direction);
 
 and one control API. 
 int (*device_control)(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
   unsigned long arg);
 
 A simple single buffer transfer (i.e. non sg transfer) can be done only
 as a trivial case of the device_prep_slave_sg API. The client driver is
 expected to prepare a sg list and provide to the dmaengine API for that
 single buffer also.

We can avoid preparing a sg list in every driver which wants this by
having dmaengine.h provide a helper for this case:

static inline dma_async_tx_descriptor *dmaengine_prep_slave_single(
struct dma_chan *chan, void *buf, size_t len,
enum dma_data_direction dir, unsigned long flags)
{
struct scatterlist sg;

sg_init_one(sg, buf, len);

return chan-device-device_prep_slave_sg(chan, sg, 1, dir, flags);
}

I think also providing dmaengine_prep_slave_sg() and
dmaengine_prep_dma_cyclic() as wrappers to hide the chan-device-blah
bit would also be beneficial (and helps keep that detail out of the
users of DMA engine.)

 In a slave transfer, the client has to define all the buffer related 
 attributes and the peripheral related attributes. 

I'd argue that it is incorrect to have drivers specify the buffer
related attributes - that makes the API unnecessarily complicated
to use, requiring two calls (one to configure the channel, and one
to prepare the transfer) each time it needs to be used.

Not only that but the memory side of the transfer should be no business
of the driver - the driver itself should only specify the attributes
for the device being driven.  The attributes for the memory side of the
transfer should be a property of the DMA engine itself.

I would like to see in the long term the dma_slave_config structure
lose its 'direction' argument, and the rest of the parameters used to
define the device side parameters only.

This will allow the channel to be configured once when its requested,
and then the prepare call can configure the direction as required.

 Now coming to the buffer related attributes, sg list is a nice way to 
 represent a disjoint buffer; all the offload engines in drivers/dma 
 create a descriptor for each of the contiguous chunk in the sg list 
 buffer and pass it to the controller. 

The sg list is the standard Linux way to describe scattered buffers.

 But many a times a client may want to transfer from a single buffer to
 the peripheral and most of the DMA controllers have the capability to 
 transfer data in chunks/frames directly at a stretch. 

So far, I've only seen DMA controllers which operate on a linked list of
source, destination, length, attributes, and next entry pointer.

 All the attributes of a buffer, which are required for the transfer 
 can be programmed in single descriptor and submitted to the 
 DMA controller. 

I'm not sure that this is useful - in order to make use of the data, the
data would have to be copied in between the descriptors - and doesn't that
rather negate the point of DMA if you have to memcpy() the data around?

Isn't it far more efficient to have DMA place the data exactly where it's
required in memory right from the start without any memcpy() operations?

Can you explain where and how you would use something like this:

 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 0 
 ---
  |Inter Frame Gap  | 
 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n | Frame 1 
 ---
  |Inter Frame Gap  | 
 ---
  | | 
 ---
  |Inter Frame Gap  | 
 ---
  | Chunk 0 |ICG| Chunk 1 |ICG| 

Re: [RFC] dmaengine: add new api for preparing simple slave transfer

2011-06-09 Thread Jassi Brar
On Thu, Jun 9, 2011 at 6:09 PM, Raju, Sundaram sunda...@ti.com wrote:

 Generic buffer description:
 A generic buffer can be split into number of frames which contain number of 
 chunks inside them. The frames need not be contiguous, nor do the chunks 
 inside a frame.

        ---
        | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n |       Frame 0
        ---
        |                       Inter Frame Gap                     |
        ---
        | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n |       Frame 1
        ---
        |                       Inter Frame Gap                     |
        ---
        |                                                              
      |
        ---
        |                       Inter Frame Gap                     |
        ---
        | Chunk 0 |ICG| Chunk 1 |ICG| ... |ICG| Chunk n |       Frame m
        ---

IIUC the above figure, the work done by DMA controller remains the
same, either by
passing this as a transfer of the new type or as a normal sg-list  - unless
the DMAC driver attempts to reorder the transfers or the DMAC h/w
natively supports
some form of sg-list.
For DMACs, that have no special support, different representation
wouldn't make a
difference.
And if the DMAC does support the kind of fancy scatter-gather, it
should be possible for
the dma api driver to analyze the submitted 'normal' sg-list and
program the transfers at one go.
Besides, it should be possible to have a 'template' sequence of
requests prepared
already because usually, for above mentioned scenario, the parameters
don't change across
items in a list.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html