Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers
On 09/10/2013 02:02 PM, Mark Brown wrote: On Tue, Sep 10, 2013 at 01:23:45PM +0200, Łukasz Czerwiński wrote: On 09/09/2013 04:45 PM, Mark Brown wrote: This seems like a very low limit to have, consider things like firmware downloads for example. It seems reasonable to have a preallocated small buffer but there should be some fallback for larger transfer sizes. I have tested my modification with different buffer sizes with S5C73M3 driver (355560 B upload). I obtained following upload times: 16kB buffer: - 83.972 ms - 79.196 ms - 79.432 ms 128kB buffer: - 74.449 ms - 80.719 ms - 75.599 ms For 256kB I obtained similar results as in 128kB case. Normally non-interrupted SPI transfer should take 56ms (50MHz). Performance loss is approximately about 6% between 16kB and 128KB buffer. I my opinion there are no need to use bigger buffers. You're missing the point here. This requires the client driver to know the maximum transfer size that the controller supports and split the transfer up for it. This isn't something that the SPI API supports now and while it would be useful to add for controllers that are genuinely limited in transfer size it doesn't seem sane to just randomly impose a limit to make life easier for software. Not all applications will be able to split their transfers up in the first place and it seems like the wrong place to put that knowledge. What I would expect to see the driver doing here is either allocating a larger buffer on demand or (probably more reliably given the need for contiguous physical memory) transparently splitting larger transfers. A super smart implementation would overlap the copy of blocks with the transfers of preceeding blocks, and there should be something we can do to avoid having to copy data that's already in suitable memory. Thanks for your comments. I will try implement transparently splitting larger transfer. Ideally this would all be factored out into the core based on limit information but that's probably not required immediately. Though now I write that I'm wondering how many other devices should be doing this but aren't. The DMA mapping of transfer buffers doesn't seem terribly device specific... I propose add extra module parameter which allows to change buffer size. No, module parameters are not sensible for modern code. If you wanted runtime configuration a sysfs tunable would be better though I do think that should be tuning optimisation, not a hard limit on the transfer size. ok Thanks Lukasz -- To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers
On 09/09/2013 04:45 PM, Mark Brown wrote: On Mon, Sep 09, 2013 at 04:09:23PM +0200, Lukasz Czerwinski wrote: The spi-s3c64xx currently doesn't support transfers from non-contiguous client buffers. This patch adds two coherent buffers which allow transfers from non-contiguous client buffers without extra coherent memory allocation in the client driver. Buffer size is hardcoded to 16kB for Tx/Rx. Client drivers shouldn't exceed that value. This seems like a very low limit to have, consider things like firmware downloads for example. It seems reasonable to have a preallocated small buffer but there should be some fallback for larger transfer sizes. I have tested my modification with different buffer sizes with S5C73M3 driver (355560 B upload). I obtained following upload times: 16kB buffer: - 83.972 ms - 79.196 ms - 79.432 ms 128kB buffer: - 74.449 ms - 80.719 ms - 75.599 ms For 256kB I obtained similar results as in 128kB case. Normally non-interrupted SPI transfer should take 56ms (50MHz). Performance loss is approximately about 6% between 16kB and 128KB buffer. I my opinion there are no need to use bigger buffers. I propose add extra module parameter which allows to change buffer size. I also didn't notice any checks in the code for the length of transfers so this will corrupt memory if a client driver tries to transfer more than the preallocated buffer as things stand. Right, I will add that. Thanks Lukasz -- To unsubscribe from this list: send the line unsubscribe linux-samsung-soc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers
On Tue, Sep 10, 2013 at 01:23:45PM +0200, Łukasz Czerwiński wrote: On 09/09/2013 04:45 PM, Mark Brown wrote: This seems like a very low limit to have, consider things like firmware downloads for example. It seems reasonable to have a preallocated small buffer but there should be some fallback for larger transfer sizes. I have tested my modification with different buffer sizes with S5C73M3 driver (355560 B upload). I obtained following upload times: 16kB buffer: - 83.972 ms - 79.196 ms - 79.432 ms 128kB buffer: - 74.449 ms - 80.719 ms - 75.599 ms For 256kB I obtained similar results as in 128kB case. Normally non-interrupted SPI transfer should take 56ms (50MHz). Performance loss is approximately about 6% between 16kB and 128KB buffer. I my opinion there are no need to use bigger buffers. You're missing the point here. This requires the client driver to know the maximum transfer size that the controller supports and split the transfer up for it. This isn't something that the SPI API supports now and while it would be useful to add for controllers that are genuinely limited in transfer size it doesn't seem sane to just randomly impose a limit to make life easier for software. Not all applications will be able to split their transfers up in the first place and it seems like the wrong place to put that knowledge. What I would expect to see the driver doing here is either allocating a larger buffer on demand or (probably more reliably given the need for contiguous physical memory) transparently splitting larger transfers. A super smart implementation would overlap the copy of blocks with the transfers of preceeding blocks, and there should be something we can do to avoid having to copy data that's already in suitable memory. Ideally this would all be factored out into the core based on limit information but that's probably not required immediately. Though now I write that I'm wondering how many other devices should be doing this but aren't. The DMA mapping of transfer buffers doesn't seem terribly device specific... I propose add extra module parameter which allows to change buffer size. No, module parameters are not sensible for modern code. If you wanted runtime configuration a sysfs tunable would be better though I do think that should be tuning optimisation, not a hard limit on the transfer size. signature.asc Description: Digital signature
[RFC PATCH 3/6] spi: spi-s3c64xx: Add coherent buffers for DMA transfers
The spi-s3c64xx currently doesn't support transfers from non-contiguous client buffers. This patch adds two coherent buffers which allow transfers from non-contiguous client buffers without extra coherent memory allocation in the client driver. Buffer size is hardcoded to 16kB for Tx/Rx. Client drivers shouldn't exceed that value. Signed-off-by: Lukasz Czerwinski l.czerwin...@samsung.com Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com --- drivers/spi/spi-s3c64xx.c | 107 ++--- 1 file changed, 90 insertions(+), 17 deletions(-) diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c index 28e8c71..1ec1244 100644 --- a/drivers/spi/spi-s3c64xx.c +++ b/drivers/spi/spi-s3c64xx.c @@ -113,6 +113,7 @@ #define S3C64XX_SPI_SWAP_TX_EN (10) #define S3C64XX_SPI_FBCLK_MSK (30) +#define S3C64XX_SPI_DMA_BUF_SIZE (16 * SZ_1K) #define FIFO_LVL_MASK(i) ((i)-port_conf-fifo_lvl_mask[i-port_id]) #define S3C64XX_SPI_ST_TX_DONE(v, i) (((v) \ @@ -133,6 +134,8 @@ #define TXBUSY(13) struct s3c64xx_spi_dma_data { + u32 *vbuf; + dma_addr_t dma_phys; struct dma_chan *ch; enum dma_transfer_direction direction; unsigned int dmach; @@ -178,6 +181,7 @@ struct s3c64xx_spi_port_config { * @xfer_completion: To indicate completion of xfer task. * @cur_mode: Stores the active configuration of the controller. * @cur_bpw: Stores the active bits per word settings. + * @dma_buf_size: Stores buffer size for Rx/Tx. * @cur_speed: Stores the active xfer clock speed. */ struct s3c64xx_spi_driver_data { @@ -194,6 +198,7 @@ struct s3c64xx_spi_driver_data { unsignedstate; unsignedcur_mode, cur_bpw; unsignedcur_speed; + unsigneddma_buf_size; struct s3c64xx_spi_dma_data rx_dma; struct s3c64xx_spi_dma_data tx_dma; @@ -279,8 +284,7 @@ static void s3c64xx_spi_dmacb(void *data) spin_unlock_irqrestore(sdd-lock, flags); } -static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data *dma, - unsigned len, dma_addr_t buf) +static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data *dma, unsigned len) { struct s3c64xx_spi_driver_data *sdd; struct dma_slave_config config; @@ -306,7 +310,7 @@ static void s3c64xx_prepare_dma(struct s3c64xx_spi_dma_data *dma, dmaengine_slave_config(dma-ch, config); } - desc = dmaengine_prep_slave_single(dma-ch, buf, len, + desc = dmaengine_prep_slave_single(dma-ch, dma-dma_phys, len, dma-direction, DMA_PREP_INTERRUPT); desc-callback = s3c64xx_spi_dmacb; @@ -382,6 +386,34 @@ static void s3c64xx_spi_dma_stop(struct s3c64xx_spi_driver_data *sdd, dmaengine_terminate_all(dma-ch); } +static void s3c64xx_copy_txbuf_to_spi(struct s3c64xx_spi_driver_data *sdd, + struct spi_transfer *xfer) +{ + struct device *dev = sdd-pdev-dev; + + dma_sync_single_for_cpu(dev, sdd-tx_dma.dma_phys, + sdd-dma_buf_size, DMA_TO_DEVICE); + + memcpy(sdd-tx_dma.vbuf, xfer-tx_buf, xfer-len); + + dma_sync_single_for_device(dev, sdd-tx_dma.dma_phys, + sdd-dma_buf_size, DMA_TO_DEVICE); +} + +static void s3c64xx_copy_spi_to_rxbuf(struct s3c64xx_spi_driver_data *sdd, + struct spi_transfer *xfer) +{ + struct device *dev = sdd-pdev-dev; + + dma_sync_single_for_cpu(dev, sdd-rx_dma.dma_phys, + sdd-dma_buf_size, DMA_FROM_DEVICE); + + memcpy(xfer-rx_buf, sdd-rx_dma.vbuf, xfer-len); + + dma_sync_single_for_device(dev, sdd-rx_dma.dma_phys, + sdd-dma_buf_size, DMA_FROM_DEVICE); +} + static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, struct spi_device *spi, struct spi_transfer *xfer, int dma_mode) @@ -413,8 +445,8 @@ static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, chcfg |= S3C64XX_SPI_CH_TXCH_ON; if (dma_mode) { modecfg |= S3C64XX_SPI_MODE_TXDMA_ON; - s3c64xx_prepare_dma(sdd-tx_dma, - xfer-len, xfer-tx_dma); + s3c64xx_copy_txbuf_to_spi(sdd, xfer); + s3c64xx_prepare_dma(sdd-tx_dma, xfer-len); } else { switch (sdd-cur_bpw) { case 32: @@ -446,8 +478,8 @@ static void s3c64xx_enable_datapath(struct s3c64xx_spi_driver_data *sdd, writel(((xfer-len * 8 / sdd-cur_bpw) 0x) | S3C64XX_SPI_PACKET_CNT_EN,