Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-22 Thread Vinod Koul
On Wed, 2012-08-22 at 12:04 -0400, Matt Porter wrote:
> for querying of these types of limitations. Right now, the
> mmc driver implicitly knows that EDMA needs this restriction
> but it's something that should be queried before calling
> prep_slave().
that's something we need to add; exporting channel capabilities. We only
tell that it slave or memcpy today, but we need to tell clients what are
channel supported parameter ranges.

-- 
~Vinod

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-22 Thread Matt Porter
On Wed, Aug 22, 2012 at 09:15:22AM +0530, Vinod Koul wrote:
> On Tue, 2012-08-21 at 14:43 -0400, Matt Porter wrote:
> > Removes use of the DaVinci EDMA private DMA API and replaces
> > it with use of the DMA engine API.
> > 
> > Signed-off-by: Matt Porter 
> > ---
> 
> > +   struct dma_slave_config dma_rx_conf = {
> > +   .direction = DMA_DEV_TO_MEM,
> > +   .src_addr = (unsigned long)dspi->pbase + SPIBUF,
> > +   .src_addr_width = data_type,
> > +   .src_maxburst = 1,
> what does 1 mean in this context? We define as number of units that
> needs to be transfered, so are you sure that you want only one unit to
> be dma'ed in a single burst. that seems like killing your dmac,
> shouldn't you be using larger bursts for a better dma performance?

This device can't handle bursts, it's a simple shift register based
SPI master that always asserts a DMA req for each SPI word transfer.

The other important thing to note is that the EDMA driver itself
is able to handle a maxburst of 1 as a special case. That is, the
EDMA has some limitations in transfer sizes it can handle if you
need burst support. So, on the EDMA end of things you'll see that
if maxburst if 1, it's able to handle setting up an A-synchronized
transfer to handle any sized segment coming in with a single transfer
slot. However, is maxburst is >1, EDMA requires up to set up an
AB-synchronized transfer. This type of transfer limits allows for
a DMA req per burst, but the maximum segment size we can handle is
SZ_64K-1. An annoying hardware design limitation, indeed.

It works out ok because in this spi driver conversion we always
map a SPI transfer into a single segment (similar to the
spi-omap2-mcspi conversion). Since the SPI master can't handle
bursts, the EDMA driver is able to handle any sized transfer
without any performance penalty. If this SPI master could
handle bursts, we'd be in trouble because we quickly run
into our AB-synced max segment limitation.

In the mmc driver, we have a device that can handle bursts for
performance reasons. It sets maxburst appropriately and the
EDMA driver does the required AB-synced transfer at the h/w
level. However, this is subject to our limitation of SZ_64K-1
per segment. Luckily we aren't the first to need to limit the
segment size coming into an mmc host driver. The mmc
subsystem already handles this case and the existing driver
using the private EDMA API was already advertising a maximum
number of segments and segments size to the mmc subsystem.
Ideally, we should have a dmaengine interface that allows
for querying of these types of limitations. Right now, the
mmc driver implicitly knows that EDMA needs this restriction
but it's something that should be queried before calling
prep_slave().

-Matt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-22 Thread Matt Porter
On Wed, Aug 22, 2012 at 09:15:22AM +0530, Vinod Koul wrote:
 On Tue, 2012-08-21 at 14:43 -0400, Matt Porter wrote:
  Removes use of the DaVinci EDMA private DMA API and replaces
  it with use of the DMA engine API.
  
  Signed-off-by: Matt Porter mpor...@ti.com
  ---
 
  +   struct dma_slave_config dma_rx_conf = {
  +   .direction = DMA_DEV_TO_MEM,
  +   .src_addr = (unsigned long)dspi-pbase + SPIBUF,
  +   .src_addr_width = data_type,
  +   .src_maxburst = 1,
 what does 1 mean in this context? We define as number of units that
 needs to be transfered, so are you sure that you want only one unit to
 be dma'ed in a single burst. that seems like killing your dmac,
 shouldn't you be using larger bursts for a better dma performance?

This device can't handle bursts, it's a simple shift register based
SPI master that always asserts a DMA req for each SPI word transfer.

The other important thing to note is that the EDMA driver itself
is able to handle a maxburst of 1 as a special case. That is, the
EDMA has some limitations in transfer sizes it can handle if you
need burst support. So, on the EDMA end of things you'll see that
if maxburst if 1, it's able to handle setting up an A-synchronized
transfer to handle any sized segment coming in with a single transfer
slot. However, is maxburst is 1, EDMA requires up to set up an
AB-synchronized transfer. This type of transfer limits allows for
a DMA req per burst, but the maximum segment size we can handle is
SZ_64K-1. An annoying hardware design limitation, indeed.

It works out ok because in this spi driver conversion we always
map a SPI transfer into a single segment (similar to the
spi-omap2-mcspi conversion). Since the SPI master can't handle
bursts, the EDMA driver is able to handle any sized transfer
without any performance penalty. If this SPI master could
handle bursts, we'd be in trouble because we quickly run
into our AB-synced max segment limitation.

In the mmc driver, we have a device that can handle bursts for
performance reasons. It sets maxburst appropriately and the
EDMA driver does the required AB-synced transfer at the h/w
level. However, this is subject to our limitation of SZ_64K-1
per segment. Luckily we aren't the first to need to limit the
segment size coming into an mmc host driver. The mmc
subsystem already handles this case and the existing driver
using the private EDMA API was already advertising a maximum
number of segments and segments size to the mmc subsystem.
Ideally, we should have a dmaengine interface that allows
for querying of these types of limitations. Right now, the
mmc driver implicitly knows that EDMA needs this restriction
but it's something that should be queried before calling
prep_slave().

-Matt
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-22 Thread Vinod Koul
On Wed, 2012-08-22 at 12:04 -0400, Matt Porter wrote:
 for querying of these types of limitations. Right now, the
 mmc driver implicitly knows that EDMA needs this restriction
 but it's something that should be queried before calling
 prep_slave().
that's something we need to add; exporting channel capabilities. We only
tell that it slave or memcpy today, but we need to tell clients what are
channel supported parameter ranges.

-- 
~Vinod

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-21 Thread Vinod Koul
On Tue, 2012-08-21 at 14:43 -0400, Matt Porter wrote:
> Removes use of the DaVinci EDMA private DMA API and replaces
> it with use of the DMA engine API.
> 
> Signed-off-by: Matt Porter 
> ---

> + struct dma_slave_config dma_rx_conf = {
> + .direction = DMA_DEV_TO_MEM,
> + .src_addr = (unsigned long)dspi->pbase + SPIBUF,
> + .src_addr_width = data_type,
> + .src_maxburst = 1,
what does 1 mean in this context? We define as number of units that
needs to be transfered, so are you sure that you want only one unit to
be dma'ed in a single burst. that seems like killing your dmac,
shouldn't you be using larger bursts for a better dma performance?


-- 
~Vinod

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-21 Thread Matt Porter
Removes use of the DaVinci EDMA private DMA API and replaces
it with use of the DMA engine API.

Signed-off-by: Matt Porter 
---
 drivers/spi/spi-davinci.c |  292 -
 1 file changed, 130 insertions(+), 162 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 9b2901f..c1ec52d 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -25,13 +25,14 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
 
 #include 
-#include 
 
 #define SPI_NO_RESOURCE((resource_size_t)-1)
 
@@ -113,14 +114,6 @@
 #define SPIDEF 0x4c
 #define SPIFMT00x50
 
-/* We have 2 DMA channels per CS, one for RX and one for TX */
-struct davinci_spi_dma {
-   int tx_channel;
-   int rx_channel;
-   int dummy_param_slot;
-   enum dma_event_qeventq;
-};
-
 /* SPI Controller driver's private data. */
 struct davinci_spi {
struct spi_bitbang  bitbang;
@@ -134,11 +127,14 @@ struct davinci_spi {
 
const void  *tx;
void*rx;
-#define SPI_TMP_BUFSZ  (SMP_CACHE_BYTES + 1)
-   u8  rx_tmp_buf[SPI_TMP_BUFSZ];
int rcount;
int wcount;
-   struct davinci_spi_dma  dma;
+
+   struct dma_chan *dma_rx;
+   struct dma_chan *dma_tx;
+   int dma_rx_chnum;
+   int dma_tx_chnum;
+
struct davinci_spi_platform_data *pdata;
 
void(*get_rx)(u32 rx_data, struct davinci_spi *);
@@ -496,21 +492,23 @@ out:
return errors;
 }
 
-static void davinci_spi_dma_callback(unsigned lch, u16 status, void *data)
+static void davinci_spi_dma_rx_callback(void *data)
 {
-   struct davinci_spi *dspi = data;
-   struct davinci_spi_dma *dma = >dma;
+   struct davinci_spi *dspi = (struct davinci_spi *)data;
 
-   edma_stop(lch);
+   dspi->rcount = 0;
 
-   if (status == DMA_COMPLETE) {
-   if (lch == dma->rx_channel)
-   dspi->rcount = 0;
-   if (lch == dma->tx_channel)
-   dspi->wcount = 0;
-   }
+   if (!dspi->wcount && !dspi->rcount)
+   complete(>done);
+}
 
-   if ((!dspi->wcount && !dspi->rcount) || (status != DMA_COMPLETE))
+static void davinci_spi_dma_tx_callback(void *data)
+{
+   struct davinci_spi *dspi = (struct davinci_spi *)data;
+
+   dspi->wcount = 0;
+
+   if (!dspi->wcount && !dspi->rcount)
complete(>done);
 }
 
@@ -526,20 +524,20 @@ static void davinci_spi_dma_callback(unsigned lch, u16 
status, void *data)
 static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 {
struct davinci_spi *dspi;
-   int data_type, ret;
+   int data_type, ret = -ENOMEM;
u32 tx_data, spidat1;
u32 errors = 0;
struct davinci_spi_config *spicfg;
struct davinci_spi_platform_data *pdata;
unsigned uninitialized_var(rx_buf_count);
-   struct device *sdev;
+   void *dummy_buf = NULL;
+   struct scatterlist sg_rx, sg_tx;
 
dspi = spi_master_get_devdata(spi->master);
pdata = dspi->pdata;
spicfg = (struct davinci_spi_config *)spi->controller_data;
if (!spicfg)
spicfg = _spi_default_cfg;
-   sdev = dspi->bitbang.master->dev.parent;
 
/* convert len to words based on bits_per_word */
data_type = dspi->bytes_per_word[spi->chip_select];
@@ -567,112 +565,83 @@ static int davinci_spi_bufs(struct spi_device *spi, 
struct spi_transfer *t)
spidat1 |= tx_data & 0x;
iowrite32(spidat1, dspi->base + SPIDAT1);
} else {
-   struct davinci_spi_dma *dma;
-   unsigned long tx_reg, rx_reg;
-   struct edmacc_param param;
-   void *rx_buf;
-   int b, c;
-
-   dma = >dma;
-
-   tx_reg = (unsigned long)dspi->pbase + SPIDAT1;
-   rx_reg = (unsigned long)dspi->pbase + SPIBUF;
-
-   /*
-* Transmit DMA setup
-*
-* If there is transmit data, map the transmit buffer, set it
-* as the source of data and set the source B index to data
-* size. If there is no transmit data, set the transmit register
-* as the source of data, and set the source B index to zero.
-*
-* The destination is always the transmit register itself. And
-* the destination never increments.
-*/
-
-   if (t->tx_buf) {
-   t->tx_dma = dma_map_single(>dev, (void *)t->tx_buf,
-   

Re: [PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-21 Thread Vinod Koul
On Tue, 2012-08-21 at 14:43 -0400, Matt Porter wrote:
 Removes use of the DaVinci EDMA private DMA API and replaces
 it with use of the DMA engine API.
 
 Signed-off-by: Matt Porter mpor...@ti.com
 ---

 + struct dma_slave_config dma_rx_conf = {
 + .direction = DMA_DEV_TO_MEM,
 + .src_addr = (unsigned long)dspi-pbase + SPIBUF,
 + .src_addr_width = data_type,
 + .src_maxburst = 1,
what does 1 mean in this context? We define as number of units that
needs to be transfered, so are you sure that you want only one unit to
be dma'ed in a single burst. that seems like killing your dmac,
shouldn't you be using larger bursts for a better dma performance?


-- 
~Vinod

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-21 Thread Matt Porter
Removes use of the DaVinci EDMA private DMA API and replaces
it with use of the DMA engine API.

Signed-off-by: Matt Porter mpor...@ti.com
---
 drivers/spi/spi-davinci.c |  292 -
 1 file changed, 130 insertions(+), 162 deletions(-)

diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 9b2901f..c1ec52d 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -25,13 +25,14 @@
 #include linux/platform_device.h
 #include linux/err.h
 #include linux/clk.h
+#include linux/dmaengine.h
 #include linux/dma-mapping.h
+#include linux/edma.h
 #include linux/spi/spi.h
 #include linux/spi/spi_bitbang.h
 #include linux/slab.h
 
 #include mach/spi.h
-#include mach/edma.h
 
 #define SPI_NO_RESOURCE((resource_size_t)-1)
 
@@ -113,14 +114,6 @@
 #define SPIDEF 0x4c
 #define SPIFMT00x50
 
-/* We have 2 DMA channels per CS, one for RX and one for TX */
-struct davinci_spi_dma {
-   int tx_channel;
-   int rx_channel;
-   int dummy_param_slot;
-   enum dma_event_qeventq;
-};
-
 /* SPI Controller driver's private data. */
 struct davinci_spi {
struct spi_bitbang  bitbang;
@@ -134,11 +127,14 @@ struct davinci_spi {
 
const void  *tx;
void*rx;
-#define SPI_TMP_BUFSZ  (SMP_CACHE_BYTES + 1)
-   u8  rx_tmp_buf[SPI_TMP_BUFSZ];
int rcount;
int wcount;
-   struct davinci_spi_dma  dma;
+
+   struct dma_chan *dma_rx;
+   struct dma_chan *dma_tx;
+   int dma_rx_chnum;
+   int dma_tx_chnum;
+
struct davinci_spi_platform_data *pdata;
 
void(*get_rx)(u32 rx_data, struct davinci_spi *);
@@ -496,21 +492,23 @@ out:
return errors;
 }
 
-static void davinci_spi_dma_callback(unsigned lch, u16 status, void *data)
+static void davinci_spi_dma_rx_callback(void *data)
 {
-   struct davinci_spi *dspi = data;
-   struct davinci_spi_dma *dma = dspi-dma;
+   struct davinci_spi *dspi = (struct davinci_spi *)data;
 
-   edma_stop(lch);
+   dspi-rcount = 0;
 
-   if (status == DMA_COMPLETE) {
-   if (lch == dma-rx_channel)
-   dspi-rcount = 0;
-   if (lch == dma-tx_channel)
-   dspi-wcount = 0;
-   }
+   if (!dspi-wcount  !dspi-rcount)
+   complete(dspi-done);
+}
 
-   if ((!dspi-wcount  !dspi-rcount) || (status != DMA_COMPLETE))
+static void davinci_spi_dma_tx_callback(void *data)
+{
+   struct davinci_spi *dspi = (struct davinci_spi *)data;
+
+   dspi-wcount = 0;
+
+   if (!dspi-wcount  !dspi-rcount)
complete(dspi-done);
 }
 
@@ -526,20 +524,20 @@ static void davinci_spi_dma_callback(unsigned lch, u16 
status, void *data)
 static int davinci_spi_bufs(struct spi_device *spi, struct spi_transfer *t)
 {
struct davinci_spi *dspi;
-   int data_type, ret;
+   int data_type, ret = -ENOMEM;
u32 tx_data, spidat1;
u32 errors = 0;
struct davinci_spi_config *spicfg;
struct davinci_spi_platform_data *pdata;
unsigned uninitialized_var(rx_buf_count);
-   struct device *sdev;
+   void *dummy_buf = NULL;
+   struct scatterlist sg_rx, sg_tx;
 
dspi = spi_master_get_devdata(spi-master);
pdata = dspi-pdata;
spicfg = (struct davinci_spi_config *)spi-controller_data;
if (!spicfg)
spicfg = davinci_spi_default_cfg;
-   sdev = dspi-bitbang.master-dev.parent;
 
/* convert len to words based on bits_per_word */
data_type = dspi-bytes_per_word[spi-chip_select];
@@ -567,112 +565,83 @@ static int davinci_spi_bufs(struct spi_device *spi, 
struct spi_transfer *t)
spidat1 |= tx_data  0x;
iowrite32(spidat1, dspi-base + SPIDAT1);
} else {
-   struct davinci_spi_dma *dma;
-   unsigned long tx_reg, rx_reg;
-   struct edmacc_param param;
-   void *rx_buf;
-   int b, c;
-
-   dma = dspi-dma;
-
-   tx_reg = (unsigned long)dspi-pbase + SPIDAT1;
-   rx_reg = (unsigned long)dspi-pbase + SPIBUF;
-
-   /*
-* Transmit DMA setup
-*
-* If there is transmit data, map the transmit buffer, set it
-* as the source of data and set the source B index to data
-* size. If there is no transmit data, set the transmit register
-* as the source of data, and set the source B index to zero.
-*
-* The destination is always the transmit register itself. And
-* the destination never increments.
-