Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers

2013-09-11 Thread Łukasz Czerwiński



On 09/10/2013 02:02 PM, Mark Brown wrote:

On Tue, Sep 10, 2013 at 01:23:45PM +0200, Łukasz Czerwiński wrote:

On 09/09/2013 04:45 PM, Mark Brown wrote:



This seems like a very low limit to have, consider things like firmware
downloads for example.  It seems reasonable to have a preallocated small
buffer but there should be some fallback for larger transfer sizes.



I have tested my modification with different buffer sizes with
S5C73M3 driver (355560 B upload). I obtained following upload times:



16kB buffer:
- 83.972 ms
- 79.196 ms
- 79.432 ms
128kB buffer:
- 74.449 ms
- 80.719 ms
- 75.599 ms



For 256kB I obtained similar results as in 128kB case. Normally
non-interrupted SPI transfer should take 56ms (50MHz). Performance
loss is approximately about 6% between 16kB and 128KB buffer. I my
opinion there are no need to use bigger buffers.


You're missing the point here.  This requires the client driver to know
the maximum transfer size that the controller supports and split the
transfer up for it.  This isn't something that the SPI API supports now
and while it would be useful to add for controllers that are genuinely
limited in transfer size it doesn't seem sane to just randomly impose a
limit to make life easier for software.  Not all applications will be
able to split their transfers up in the first place and it seems like
the wrong place to put that knowledge.

What I would expect to see the driver doing here is either allocating
a larger buffer on demand or (probably more reliably given the need for
contiguous physical memory) transparently splitting larger transfers.
A super smart implementation would overlap the copy of blocks with the
transfers of preceeding blocks, and there should be something we can do
to avoid having to copy data that's already in suitable memory.


Thanks for your comments.
I will try implement transparently splitting larger transfer.


Ideally this would all be factored out into the core based on limit
information but that's probably not required immediately.  Though now I
write that I'm wondering how many other devices should be doing this but
aren't.  The DMA mapping of transfer buffers doesn't seem terribly
device specific...


I propose add extra module parameter which allows to change buffer size.


No, module parameters are not sensible for modern code.  If you wanted
runtime configuration a sysfs tunable would be better though I do think
that should be tuning optimisation, not a hard limit on the transfer
size.


ok

Thanks
Lukasz
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers

2013-09-10 Thread Łukasz Czerwiński



On 09/09/2013 04:45 PM, Mark Brown wrote:

On Mon, Sep 09, 2013 at 04:09:23PM +0200, Lukasz Czerwinski wrote:

The spi-s3c64xx currently doesn't support transfers from non-contiguous
client buffers.

This patch adds two coherent buffers which allow transfers from
non-contiguous client buffers without extra coherent memory allocation
in the client driver.

Buffer size is hardcoded to 16kB for Tx/Rx. Client drivers shouldn't
exceed that value.


This seems like a very low limit to have, consider things like firmware
downloads for example.  It seems reasonable to have a preallocated small
buffer but there should be some fallback for larger transfer sizes.

I have tested my modification with different buffer sizes with S5C73M3 
driver (355560 B upload). I obtained following upload times:


16kB buffer:
- 83.972 ms
- 79.196 ms
- 79.432 ms
128kB buffer:
- 74.449 ms
- 80.719 ms
- 75.599 ms

For 256kB I obtained similar results as in 128kB case. Normally 
non-interrupted SPI transfer should take 56ms (50MHz). Performance loss 
is approximately about 6% between 16kB and 128KB buffer. I my opinion 
there are no need to use bigger buffers.


I propose add extra module parameter which allows to change buffer size.


I also didn't notice any checks in the code for the length of transfers
so this will corrupt memory if a client driver tries to transfer more
than the preallocated buffer as things stand.


Right, I will add that.

Thanks
Lukasz
--
To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers

2013-09-10 Thread Mark Brown
On Tue, Sep 10, 2013 at 01:23:45PM +0200, Łukasz Czerwiński wrote:
 On 09/09/2013 04:45 PM, Mark Brown wrote:

 This seems like a very low limit to have, consider things like firmware
 downloads for example.  It seems reasonable to have a preallocated small
 buffer but there should be some fallback for larger transfer sizes.

 I have tested my modification with different buffer sizes with
 S5C73M3 driver (355560 B upload). I obtained following upload times:

 16kB buffer:
   - 83.972 ms
   - 79.196 ms
   - 79.432 ms
 128kB buffer:
   - 74.449 ms
   - 80.719 ms
   - 75.599 ms

 For 256kB I obtained similar results as in 128kB case. Normally
 non-interrupted SPI transfer should take 56ms (50MHz). Performance
 loss is approximately about 6% between 16kB and 128KB buffer. I my
 opinion there are no need to use bigger buffers.

You're missing the point here.  This requires the client driver to know
the maximum transfer size that the controller supports and split the
transfer up for it.  This isn't something that the SPI API supports now
and while it would be useful to add for controllers that are genuinely
limited in transfer size it doesn't seem sane to just randomly impose a
limit to make life easier for software.  Not all applications will be
able to split their transfers up in the first place and it seems like
the wrong place to put that knowledge.

What I would expect to see the driver doing here is either allocating
a larger buffer on demand or (probably more reliably given the need for
contiguous physical memory) transparently splitting larger transfers.
A super smart implementation would overlap the copy of blocks with the
transfers of preceeding blocks, and there should be something we can do
to avoid having to copy data that's already in suitable memory.

Ideally this would all be factored out into the core based on limit
information but that's probably not required immediately.  Though now I
write that I'm wondering how many other devices should be doing this but
aren't.  The DMA mapping of transfer buffers doesn't seem terribly
device specific...

 I propose add extra module parameter which allows to change buffer size.

No, module parameters are not sensible for modern code.  If you wanted
runtime configuration a sysfs tunable would be better though I do think
that should be tuning optimisation, not a hard limit on the transfer
size.


signature.asc
Description: Digital signature


[RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers

2013-09-09 Thread Lukasz Czerwinski
The spi-s3c64xx currently doesn't support transfers from non-contiguous
client buffers.

This patch adds two coherent buffers which allow transfers from
non-contiguous client buffers without extra coherent memory allocation
in the client driver.

Buffer size is hardcoded to 16kB for Tx/Rx. Client drivers shouldn't
exceed that value.

Signed-off-by: Lukasz Czerwinski l.czerwin...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 drivers/spi/spi-s3c64xx.c |  107 ++---
 1 file changed, 90 insertions(+), 17 deletions(-)

diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
index 28e8c71..1ec1244 100644
--- a/drivers/spi/spi-s3c64xx.c
+++ b/drivers/spi/spi-s3c64xx.c
@@ -113,6 +113,7 @@
 #define S3C64XX_SPI_SWAP_TX_EN (10)
 
 #define S3C64XX_SPI_FBCLK_MSK  (30)
+#define S3C64XX_SPI_DMA_BUF_SIZE   (16 * SZ_1K)
 
 #define FIFO_LVL_MASK(i) ((i)-port_conf-fifo_lvl_mask[i-port_id])
 #define S3C64XX_SPI_ST_TX_DONE(v, i) (((v)  \
@@ -133,6 +134,8 @@
 #define TXBUSY(13)
 
 struct s3c64xx_spi_dma_data {
+   u32 *vbuf;
+   dma_addr_t dma_phys;
struct dma_chan *ch;
enum dma_transfer_direction direction;
unsigned int dmach;
@@ -178,6 +181,7 @@ struct s3c64xx_spi_port_config {
  * @xfer_completion: To indicate completion of xfer task.
  * @cur_mode: Stores the active configuration of the controller.
  * @cur_bpw: Stores the active bits per word settings.
+ * @dma_buf_size: Stores buffer size for Rx/Tx.
  * @cur_speed: Stores the active xfer clock speed.
  */
 struct s3c64xx_spi_driver_data {
@@ -194,6 +198,7 @@ struct s3c64xx_spi_driver_data {
unsignedstate;
unsignedcur_mode, cur_bpw;
unsignedcur_speed;
+   unsigneddma_buf_size;
struct s3c64xx_spi_dma_data rx_dma;
struct s3c64xx_spi_dma_data tx_dma;
 
@@ -279,8 +284,7 @@ static void s3c64xx_spi_dmacb(void *data)
spin_unlock_irqrestore(sdd-lock, flags);
 }
 
-static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data *dma,
-   unsigned len, dma_addr_t buf)
+static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data *dma, unsigned len)
 {
struct s3c64xx_spi_driver_data *sdd;
struct dma_slave_config config;
@@ -306,7 +310,7 @@ static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data 
*dma,
dmaengine_slave_config(dma-ch, config);
}
 
-   desc = dmaengine_prep_slave_single(dma-ch, buf, len,
+   desc = dmaengine_prep_slave_single(dma-ch, dma-dma_phys, len,
dma-direction, DMA_PREP_INTERRUPT);
 
desc-callback = s3c64xx_spi_dmacb;
@@ -382,6 +386,34 @@ static void s3c64xx_spi_dma_stop(struct 
s3c64xx_spi_driver_data *sdd,
dmaengine_terminate_all(dma-ch);
 }
 
+static void s3c64xx_copy_txbuf_to_spi(struct s3c64xx_spi_driver_data *sdd,
+   struct spi_transfer *xfer)
+{
+   struct device *dev = sdd-pdev-dev;
+
+   dma_sync_single_for_cpu(dev, sdd-tx_dma.dma_phys,
+   sdd-dma_buf_size, DMA_TO_DEVICE);
+
+   memcpy(sdd-tx_dma.vbuf, xfer-tx_buf, xfer-len);
+
+   dma_sync_single_for_device(dev, sdd-tx_dma.dma_phys,
+   sdd-dma_buf_size, DMA_TO_DEVICE);
+}
+
+static void s3c64xx_copy_spi_to_rxbuf(struct s3c64xx_spi_driver_data *sdd,
+   struct spi_transfer *xfer)
+{
+   struct device *dev = sdd-pdev-dev;
+
+   dma_sync_single_for_cpu(dev, sdd-rx_dma.dma_phys,
+   sdd-dma_buf_size, DMA_FROM_DEVICE);
+
+   memcpy(xfer-rx_buf, sdd-rx_dma.vbuf, xfer-len);
+
+   dma_sync_single_for_device(dev, sdd-rx_dma.dma_phys,
+   sdd-dma_buf_size, DMA_FROM_DEVICE);
+}
+
 static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd,
struct spi_device *spi,
struct spi_transfer *xfer, int dma_mode)
@@ -413,8 +445,8 @@ static void s3c64xx_enable_datapath(struct 
s3c64xx_spi_driver_data *sdd,
chcfg |= S3C64XX_SPI_CH_TXCH_ON;
if (dma_mode) {
modecfg |= S3C64XX_SPI_MODE_TXDMA_ON;
-   s3c64xx_prepare_dma(sdd-tx_dma,
-   xfer-len, xfer-tx_dma);
+   s3c64xx_copy_txbuf_to_spi(sdd, xfer);
+   s3c64xx_prepare_dma(sdd-tx_dma, xfer-len);
} else {
switch (sdd-cur_bpw) {
case 32:
@@ -446,8 +478,8 @@ static void s3c64xx_enable_datapath(struct 
s3c64xx_spi_driver_data *sdd,
writel(((xfer-len * 8 / sdd-cur_bpw)  0x)
| S3C64XX_SPI_PACKET_CNT_EN,