[PATCH 08/30] spi: dw: Discard DW SSI chip type storages

2020-09-20 Thread Serge Semin
Keeping the SPI peripheral device type around is pointless: first, it has
never been functionally utilized by any of the client drivers/code, and
second, it wouldn't work for the Microwire type at the very least.
Moreover, there is no point in setting the type up by means of the
chip data in a modern kernel. Peripheral devices with a specific interface
type need to be detected in order to activate the corresponding frame
format, which will most likely require some peripheral-device-specific DT
property or the like to find out the interface protocol. So let's remove
the serial interface type fields from the DW APB SSI controller and the
SPI peripheral device private data.

Note we'll preserve the explicit SSI_MOTO_SPI interface type setup to
signify the only currently supported interface protocol.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 6 ++
 drivers/spi/spi-dw.h  | 1 -
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index be94ed5bb896..dd9574f9fafc 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -23,7 +23,6 @@
 /* Slave spi_dev related */
 struct chip_data {
u8 tmode;   /* TR/TO/RO/EEPROM */
-   u8 type;/* SPI/SSP/MicroWire */
 
u16 clk_div;/* baud rate divider */
u32 speed_hz;   /* baud rate */
@@ -238,7 +237,7 @@ u32 dw_spi_update_cr0(struct spi_controller *master, struct spi_device *spi,
 
/* Default SPI mode is SCPOL = 0, SCPH = 0 */
cr0 = (transfer->bits_per_word - 1)
-   | (chip->type << SPI_FRF_OFFSET)
+   | (SSI_MOTO_SPI << SPI_FRF_OFFSET)
| ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) |
   (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) |
   (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET))
@@ -260,7 +259,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
cr0 = (transfer->bits_per_word - 1);
 
/* CTRLR0[ 7: 6] Frame Format */
-   cr0 |= chip->type << DWC_SSI_CTRLR0_FRF_OFFSET;
+   cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
 
/*
 * SPI mode (SCPOL|SCPH)
@@ -453,7 +452,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
return -ENOMEM;
 
dws->master = master;
-   dws->type = SSI_MOTO_SPI;
dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR);
 
spi_controller_set_devdata(master, dws);
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 063fa1b1352d..4420b4d29bac 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -111,7 +111,6 @@ struct dw_spi_dma_ops {
 
 struct dw_spi {
struct spi_controller   *master;
-   enum dw_ssi_typetype;
 
void __iomem*regs;
unsigned long   paddr;
-- 
2.27.0



[PATCH 29/30] dt-bindings: spi: dw: Add Baikal-T1 SPI Controllers

2020-09-20 Thread Serge Semin
These controllers are based on the DW APB SSI IP-core and embedded into
the SoC. Two of them are equipped with IRQ, DMA, 64-word FIFOs and 4
native CS, while the third one, being utilized by the Baikal-T1 System
Boot Controller, has got very limited resources: no IRQ, no DMA, only a
single native chip-select and just 8-byte Tx/Rx FIFOs. That's why we have
to mark the IRQ as optional for the latter interface.

The SPI controller embedded into the Baikal-T1 System Boot Controller can
also be used to directly access an external SPI flash by means of a
dedicated FSM. The corresponding MMIO region availability is switchable by
the embedded multiplexer, whose phandle can be specified in the DTS node.

* We added a new example to test out the non-standard Baikal-T1 System
Boot SPI Controller DT binding.
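
For comparison, a regular (non-boot) Baikal-T1 SSI node would then look as
follows. This is a hypothetical sketch: the unit address, interrupt
specifier and clock phandle are made up for illustration, only the shape
of the required properties follows the binding.

    spi@1f04e000 {
      compatible = "baikal,bt1-ssi";
      reg = <0x1f04e000 0x1000>;
      #address-cells = <1>;
      #size-cells = <0>;
      interrupts = <40>;
      clocks = <&ccu_sys_clk>;
      clock-names = "ssi_clk";
    };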

Co-developed-by: Ramil Zaripov 
Signed-off-by: Ramil Zaripov 
Signed-off-by: Serge Semin 
---
 .../bindings/spi/snps,dw-apb-ssi.yaml | 33 +--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
index c62cbe79f00d..d6ae35777dac 100644
--- a/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
+++ b/Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
@@ -22,6 +22,21 @@ allOf:
       properties:
         reg:
           minItems: 2
+  - if:
+      properties:
+        compatible:
+          contains:
+            enum:
+              - baikal,bt1-sys-ssi
+    then:
+      properties:
+        mux-controls:
+          maxItems: 1
+      required:
+        - mux-controls
+    else:
+      required:
+        - interrupts
 
 properties:
   compatible:
@@ -44,12 +59,16 @@ properties:
       - const: snps,dw-apb-ssi
       - description: Intel Keem Bay SPI Controller
         const: intel,keembay-ssi
+      - description: Baikal-T1 SPI Controller
+        const: baikal,bt1-ssi
+      - description: Baikal-T1 System Boot SPI Controller
+        const: baikal,bt1-sys-ssi
 
   reg:
     minItems: 1
     items:
       - description: DW APB SSI controller memory mapped registers
-      - description: SPI MST region map
+      - description: SPI MST region map or directly mapped SPI ROM
 
   interrupts:
 maxItems: 1
@@ -114,7 +133,6 @@ required:
   - reg
   - "#address-cells"
   - "#size-cells"
-  - interrupts
   - clocks
 
 examples:
@@ -130,4 +148,15 @@ examples:
   cs-gpios = < 13 0>,
  < 14 0>;
 };
+  - |
+spi@1f040100 {
+  compatible = "baikal,bt1-sys-ssi";
+  reg = <0x1f040100 0x900>,
+<0x1c00 0x100>;
+  #address-cells = <1>;
+  #size-cells = <0>;
+  mux-controls = <_mux>;
+  clocks = <_sys>;
+  clock-names = "ssi_clk";
+};
 ...
-- 
2.27.0



[PATCH 23/30] spi: dw: Explicitly de-assert CS on SPI transfer completion

2020-09-20 Thread Serge Semin
By design of the currently available native set_cs callback, the CS
de-assertion is performed only if it's required by the corresponding
controller capability. But in order to pre-fill the Tx FIFO buffer with
data during the SPI memory ops execution, the SER register needs to be
left cleared before that. We'll also need a way to explicitly set and
clear the corresponding CS bit at a certain moment of the operation. Let's
alter the set_cs function then to also de-activate the CS when required.
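
For context, a rough sketch of the usage pattern this enables (simplified,
not the literal driver code): the SER register both selects the slave and
gates the transfer start, so keeping it zeroed is what permits the Tx FIFO
pre-filling.

    dw_writel(dws, DW_SPI_SER, 0);         /* CS idle: Tx FIFO can be pre-filled */
    /* ... push the outbound data into DW_SPI_DR ... */
    dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select)); /* CS active: transfer starts */
    /* ... and on completion the CS is now explicitly de-asserted again */
    dw_writel(dws, DW_SPI_SER, 0);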

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 65db4dd3ea8a..7a25ea6f4af6 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -100,7 +100,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable)
 */
if (cs_high == enable)
dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select));
-   else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
+   else
dw_writel(dws, DW_SPI_SER, 0);
 }
 EXPORT_SYMBOL_GPL(dw_spi_set_cs);
-- 
2.27.0



[PATCH 10/30] spi: dw: Add KeemBay Master capability

2020-09-20 Thread Serge Semin
In a further commit we'll have to get rid of the update_cr0() callback and
define a DW SSI capability instead. Since the Keem Bay master/slave
functionality is controlled by a CTRLR0 register bitfield, we first need
to move the master mode selection into the corresponding internal
update_cr0 method, which will be activated by means of the dedicated
DW_SPI_CAP_KEEMBAY_MST capability setup.

Note this will also be useful if the driver is ever altered to support
the DW SPI slave interface.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c |  4 
 drivers/spi/spi-dw-mmio.c | 20 +++-
 drivers/spi/spi-dw.h  |  8 
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 78e5af8ed173..8f9737640ec1 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -252,6 +252,7 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
 struct spi_device *spi,
 struct spi_transfer *transfer)
 {
+   struct dw_spi *dws = spi_controller_get_devdata(master);
struct chip_data *chip = spi_get_ctldata(spi);
u32 cr0;
 
@@ -275,6 +276,9 @@ u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
/* CTRLR0[13] Shift Register Loop */
cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
 
+   if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
+   cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
+
return cr0;
 }
 EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a);
diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
index 7111cb7ca23b..c0d351fde782 100644
--- a/drivers/spi/spi-dw-mmio.c
+++ b/drivers/spi/spi-dw-mmio.c
@@ -48,13 +48,6 @@ struct dw_spi_mmio {
 #define SPARX5_FORCE_ENA   0xa4
 #define SPARX5_FORCE_VAL   0xa8
 
-/*
- * For Keem Bay, CTRLR0[31] is used to select controller mode.
- * 0: SSI is slave
- * 1: SSI is master
- */
-#define KEEMBAY_CTRLR0_SSIC_IS_MST BIT(31)
-
 struct dw_spi_mscc {
struct regmap   *syscon;
void __iomem*spi_mst; /* Not sparx5 */
@@ -234,20 +227,13 @@ static int dw_spi_dwc_ssi_init(struct platform_device *pdev,
return 0;
 }
 
-static u32 dw_spi_update_cr0_keembay(struct spi_controller *master,
-struct spi_device *spi,
-struct spi_transfer *transfer)
-{
-   u32 cr0 = dw_spi_update_cr0_v1_01a(master, spi, transfer);
-
-   return cr0 | KEEMBAY_CTRLR0_SSIC_IS_MST;
-}
-
 static int dw_spi_keembay_init(struct platform_device *pdev,
   struct dw_spi_mmio *dwsmmio)
 {
+   dwsmmio->dws.caps = DW_SPI_CAP_KEEMBAY_MST;
+
/* Register hook to configure CTRLR0 */
-   dwsmmio->dws.update_cr0 = dw_spi_update_cr0_keembay;
+   dwsmmio->dws.update_cr0 = dw_spi_update_cr0_v1_01a;
 
return 0;
 }
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 4c748b2853a8..da9b543322c9 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -71,6 +71,13 @@
 #define DWC_SSI_CTRLR0_FRF_OFFSET  6
 #define DWC_SSI_CTRLR0_DFS_OFFSET  0
 
+/*
+ * For Keem Bay, CTRLR0[31] is used to select controller mode.
+ * 0: SSI is slave
+ * 1: SSI is master
+ */
+#define DWC_SSI_CTRLR0_KEEMBAY_MST BIT(31)
+
 /* Bit fields in SR, 7 bits */
 #define SR_MASK			0x7f	/* cover 7 bits */
 #define SR_BUSY			(1 << 0)
@@ -101,6 +108,7 @@ enum dw_ssi_type {
 
 /* DW SPI capabilities */
 #define DW_SPI_CAP_CS_OVERRIDE BIT(0)
+#define DW_SPI_CAP_KEEMBAY_MST BIT(1)
 
 struct dw_spi;
 struct dw_spi_dma_ops {
-- 
2.27.0



[PATCH 03/30] spi: dw: Initialize n_bytes before the memory barrier

2020-09-20 Thread Serge Semin
Since the n_bytes field of the DW SPI private data is also utilized by the
IRQ handler, we need to make sure its initialization is done before the
memory barrier.
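
A condensed sketch of the ordering this patch establishes in
dw_spi_transfer_one(); the barrier shown stands in for the one already
present in the driver:

    /* Publish everything the IRQ handler may dereference... */
    dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
    dws->tx = (void *)transfer->tx_buf;
    dws->rx = transfer->rx_buf;

    /* ...before the barrier makes the new transfer state visible */
    smp_mb();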

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 8b3ce5a0378a..1af74362461d 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -299,6 +299,7 @@ static int dw_spi_transfer_one(struct spi_controller *master,
 
dws->dma_mapped = 0;
spin_lock_irqsave(&dws->buf_lock, flags);
+   dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
dws->tx = (void *)transfer->tx_buf;
dws->tx_end = dws->tx + transfer->len;
dws->rx = transfer->rx_buf;
@@ -323,7 +324,6 @@ static int dw_spi_transfer_one(struct spi_controller *master,
}
 
transfer->effective_speed_hz = dws->max_freq / chip->clk_div;
-   dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
 
cr0 = dws->update_cr0(master, spi, transfer);
dw_writel(dws, DW_SPI_CTRLR0, cr0);
-- 
2.27.0



[PATCH 09/30] spi: dw: Convert CS-override to DW SPI capabilities

2020-09-20 Thread Serge Semin
There are several vendor-specific versions of the DW SPI controllers,
each of which may have some peculiarities with respect to the original
IP-core. Seeing that this has already caused flags and a callback to be
added to the DW SPI private data, let's introduce a generic capabilities
interface to tune the generic DW SPI controller driver up in accordance
with the particular controller specifics. It's done by converting the
simple Alpine-specific CS-override property into a DW SPI controller
capability activated by setting the DW_SPI_CAP_CS_OVERRIDE flag.
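
The resulting pattern is compact and scales to any number of vendor quirks
without growing the private data; condensed from the diff below:

    /* spi-dw.h: one bit per controller peculiarity */
    #define DW_SPI_CAP_CS_OVERRIDE	BIT(0)

    /* Platform glue (spi-dw-mmio.c): declare the quirk */
    dwsmmio->dws.caps = DW_SPI_CAP_CS_OVERRIDE;

    /* Core (spi-dw-core.c): act on it */
    if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
    	dw_writel(dws, DW_SPI_CS_OVERRIDE, 0xF);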

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 4 ++--
 drivers/spi/spi-dw-mmio.c | 2 +-
 drivers/spi/spi-dw.h  | 7 ++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index dd9574f9fafc..78e5af8ed173 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -104,7 +104,7 @@ void dw_spi_set_cs(struct spi_device *spi, bool enable)
 */
if (cs_high == enable)
dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select));
-   else if (dws->cs_override)
+   else if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
dw_writel(dws, DW_SPI_SER, 0);
 }
 EXPORT_SYMBOL_GPL(dw_spi_set_cs);
@@ -435,7 +435,7 @@ static void spi_hw_init(struct device *dev, struct dw_spi *dws)
}
 
/* enable HW fixup for explicit CS deselect for Amazon's alpine chip */
-   if (dws->cs_override)
+   if (dws->caps & DW_SPI_CAP_CS_OVERRIDE)
dw_writel(dws, DW_SPI_CS_OVERRIDE, 0xF);
 }
 
diff --git a/drivers/spi/spi-dw-mmio.c b/drivers/spi/spi-dw-mmio.c
index 18772c0c9220..7111cb7ca23b 100644
--- a/drivers/spi/spi-dw-mmio.c
+++ b/drivers/spi/spi-dw-mmio.c
@@ -204,7 +204,7 @@ static int dw_spi_mscc_sparx5_init(struct platform_device *pdev,
 static int dw_spi_alpine_init(struct platform_device *pdev,
  struct dw_spi_mmio *dwsmmio)
 {
-   dwsmmio->dws.cs_override = 1;
+   dwsmmio->dws.caps = DW_SPI_CAP_CS_OVERRIDE;
 
/* Register hook to configure CTRLR0 */
dwsmmio->dws.update_cr0 = dw_spi_update_cr0;
diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 4420b4d29bac..4c748b2853a8 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -2,6 +2,7 @@
 #ifndef DW_SPI_HEADER_H
 #define DW_SPI_HEADER_H
 
+#include 
 #include 
 #include 
 #include 
@@ -98,6 +99,9 @@ enum dw_ssi_type {
SSI_NS_MICROWIRE,
 };
 
+/* DW SPI capabilities */
+#define DW_SPI_CAP_CS_OVERRIDE BIT(0)
+
 struct dw_spi;
 struct dw_spi_dma_ops {
int (*dma_init)(struct device *dev, struct dw_spi *dws);
@@ -118,7 +122,8 @@ struct dw_spi {
u32 fifo_len;   /* depth of the FIFO buffer */
u32 max_freq;   /* max bus freq supported */
 
-   int cs_override;
+   u32 caps;   /* DW SPI capabilities */
+
u32 reg_io_width;   /* DR I/O width in bytes */
u16 bus_num;
u16 num_cs; /* supported slave numbers */
-- 
2.27.0



[PATCH 00/30] spi: dw: Add full Baikal-T1 SPI Controllers support

2020-09-20 Thread Serge Semin
 it is the data IO procedure and IRQ-based SPI-transfer
implementation refactoring. The former will look much simpler if the
buffers' initial pointers and lengths are utilized instead of the Tx/Rx
buffers' start and end pointers. The latter currently lacks valid
execution at the final stage of the SPI transfer: if there is no data left
to send, but there is still data which needs to be received, the Tx FIFO
Empty IRQ will constantly fire until all of the requested inbound data has
been received. So we suggest to fix that by taking the Rx FIFO Full IRQ
into account.

Ninthly, it's potentially error-prone to enable the DW APB SSI interrupts
before enabling the chip. It specifically concerns the case when, for some
reason, the DW APB SSI IRQ handler is executed before the controller is
enabled, which would cause the loss of part of the outbound data. So we
suggest to reverse the order.

Tenthly, in order to be able to pre-initialize the Tx FIFO with data and
only then start the SPI memory operations, we need all CSes de-activated.
We'll fulfil that requirement by explicitly clearing the CS on the SPI
transfer completion and on the explicit controller reset.

Then, seeing that all the currently available and potentially forthcoming
types of SPI transfers need to perform the DW APB SSI controller status
register check and the error-handling procedure, we've created a common
method for all of them.

Eleventhly, while before we've mostly had a series of fixups, cleanups and
refactorings, here we've finally come to the new functionality
implementation. It concerns the poll-based transfer (since the Baikal-T1
System Boot SPI controller lacks a dedicated IRQ lane) and the SPI memory
operations implementation. While the former feature is pretty much
straightforward (see the patch log for details), the latter is a bit
tricky. It's based on the EEPROM-read (write-then-read) and Tx-only modes
of the DW APB SSI controller, which, by performing the automatic data read
and write, let us implement a faster IO procedure than the
Tx-Rx-mode-based approach. Having the memory operations implemented that
way is the best thing we can currently do to provide error-free SPI
transfers to SPI devices with a native CS attached.
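
To illustrate the EEPROM-read mode mentioned above, here is a minimal
configuration sketch; the register and TMOD macros are the driver's own,
while the surrounding code and the ndf variable are illustrative only:

    /* Select the write-then-read (EEPROM) transfer mode in CTRLR0 */
    cr0 &= ~SPI_TMOD_MASK;
    cr0 |= SPI_TMOD_EPROMREAD << SPI_TMOD_OFFSET;
    dw_writel(dws, DW_SPI_CTRLR0, cr0);

    /* CTRLR1 takes the number of inbound data frames minus one */
    dw_writel(dws, DW_SPI_CTRLR1, ndf - 1);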

Note the approach utilized here to develop the SPI memory operations can
also be used to create "automatic CS toggle problem"-free(ish) SPI
transfers (combine the SPI-message transfers into two buffers, disable
interrupts, push-pull the combined data). But we don't provide such a
solution in the framework of this patchset. It is a matter of a dedicated
one, which we currently don't intend to spend our time on.

Finally, at the closure of this patchset you'll find patches which provide
the Baikal-T1-specific DW APB SSI controllers support. The SoC has got
three SPI controllers. Two of them are pretty much normal DW APB SSI
interfaces: with IRQ, DMA, FIFOs of 64 words depth, 4x CSs. But the third
one, being a part of the Baikal-T1 System Boot Controller, has got very
limited resources: no IRQ, no DMA, only a single native chip-select and
Tx/Rx FIFOs of just 8 words depth. In order to provide transparent initial
boot code execution, the System Boot SPI Controller is also utilized by a
vendor-specific IP-block, which exposes an SPI flash memory direct mapping
interface. Please see the corresponding patch for details.

Link: https://lore.kernel.org/linux-spi/20200508093621.31619-1-sergey.se...@baikalelectronics.ru/

[1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Ramil Zaripov 
Cc: Pavel Parkhomenko 
Cc: Andy Shevchenko 
Cc: Andy Shevchenko 
Cc: Lars Povlsen 
Cc: wuxu.wu 
Cc: Feng Tang 
Cc: Rob Herring 
Cc: linux-...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (30):
  spi: dw: Discard IRQ threshold macro
  spi: dw: Use ternary op to init set_cs callback
  spi: dw: Initialize n_bytes before the memory barrier
  Revert: spi: spi-dw: Add lock protect dw_spi rx/tx to prevent
concurrent calls
  spi: dw: Clear IRQ status on DW SPI controller reset
  spi: dw: Disable all IRQs when controller is unused
  spi: dw: Use relaxed IO-methods to access FIFOs
  spi: dw: Discard DW SSI chip type storages
  spi: dw: Convert CS-override to DW SPI capabilities
  spi: dw: Add KeemBay Master capability
  spi: dw: Add DWC SSI capability
  spi: dw: Detach SPI device specific CR0 config method
  spi: dw: Update SPI bus speed in a config function
  spi: dw: Simplify the SPI bus speed config procedure
  spi: dw: Update Rx sample delay in the config function
  spi: dw: Add DW SPI controller config structure
  spi: dw: Refactor data IO procedure
  spi: dw: Refactor IRQ-based SPI transfer procedure
  spi: dw: Perform IRQ setup in a dedicated function
  spi: dw: Unmask IRQs after enabling the chip
  

[PATCH 02/30] spi: dw: Use ternary op to init set_cs callback

2020-09-20 Thread Serge Semin
Simplify the dw_spi_add_host() method a bit by replacing the set_cs
callback overwrite procedure with directly setting the callback if a
custom version of it is specified.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 55afdcee7d2b..8b3ce5a0378a 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -482,7 +482,7 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
master->num_chipselect = dws->num_cs;
master->setup = dw_spi_setup;
master->cleanup = dw_spi_cleanup;
-   master->set_cs = dw_spi_set_cs;
+   master->set_cs = dws->set_cs ?: dw_spi_set_cs;
master->transfer_one = dw_spi_transfer_one;
master->handle_err = dw_spi_handle_err;
master->max_speed_hz = dws->max_freq;
@@ -491,9 +491,6 @@ int dw_spi_add_host(struct device *dev, struct dw_spi *dws)
master->flags = SPI_MASTER_GPIO_SS;
master->auto_runtime_pm = true;
 
-   if (dws->set_cs)
-   master->set_cs = dws->set_cs;
-
/* Get default rx sample delay */
device_property_read_u32(dev, "rx-sample-delay-ns",
 &dws->def_rx_sample_dly_ns);
-- 
2.27.0



[PATCH 01/30] spi: dw: Discard IRQ threshold macro

2020-09-20 Thread Serge Semin
The macro has been unused since half of the FIFO length was defined to be
the IRQ threshold marker. Let's remove its definition.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index 90dfd21622d6..51bab30b9f85 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -92,9 +92,6 @@
 #define SPI_DMA_RDMAE  (1 << 0)
 #define SPI_DMA_TDMAE  (1 << 1)
 
-/* TX RX interrupt level threshold, max can be 256 */
-#define SPI_INT_THRESHOLD  32
-
 enum dw_ssi_type {
SSI_MOTO_SPI = 0,
SSI_TI_SSP,
-- 
2.27.0



[PATCH 12/30] spi: dw: Detach SPI device specific CR0 config method

2020-09-20 Thread Serge Semin
Indeed there is no point in detecting the SPI peripheral device parameters
and initializing the CR0 register fields each time an SPI transfer is
executed. Instead, let's define a dedicated CR0 chip-data member, which
will be initialized in accordance with the SPI device settings at the
moment of its setup.

By doing so we'll finally make the SPI device chip_data serve as it's
supposed to: to preserve the SPI-device-specific DW SPI configuration.
See the spi-fsl-dspi.c, spi-pl022.c and spi-pxa2xx.c drivers for examples
of the way the chip data is utilized.
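
Condensed from the diff below, the resulting split keeps the per-transfer
path down to a few cheap bit-ORs (the DW APB SSI TMOD offset is shown; the
DWC SSI case only differs in the offset macro):

    /* dw_spi_setup(): executed once per SPI device (re)configuration */
    chip->cr0 = dw_spi_get_cr0(dws, spi);

    /* dw_spi_update_cr0(): executed per transfer */
    cr0 = chip->cr0 | (transfer->bits_per_word - 1)
    		    | (chip->tmode << SPI_TMOD_OFFSET);
    dw_writel(dws, DW_SPI_CTRLR0, cr0);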

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-core.c | 35 +++
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index c21641a485ce..a9644351d75f 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -27,6 +27,7 @@ struct chip_data {
u16 clk_div;/* baud rate divider */
u32 speed_hz;   /* baud rate */
 
+   u32 cr0;
u32 rx_sample_dly;  /* RX sample delay */
 };
 
@@ -228,31 +229,41 @@ static irqreturn_t dw_spi_irq(int irq, void *dev_id)
return dws->transfer_handler(dws);
 }
 
-static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
- struct spi_transfer *transfer)
+static u32 dw_spi_get_cr0(struct dw_spi *dws, struct spi_device *spi)
 {
-   struct chip_data *chip = spi_get_ctldata(spi);
-   u32 cr0;
-
-   cr0 = (transfer->bits_per_word - 1);
+   u32 cr0 = 0;
 
if (!(dws->caps & DW_SPI_CAP_DWC_SSI)) {
cr0 |= SSI_MOTO_SPI << SPI_FRF_OFFSET;
cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET;
cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET;
cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET;
-   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
} else {
cr0 |= SSI_MOTO_SPI << DWC_SSI_CTRLR0_FRF_OFFSET;
cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
-   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
 
if (dws->caps & DW_SPI_CAP_KEEMBAY_MST)
cr0 |= DWC_SSI_CTRLR0_KEEMBAY_MST;
}
 
+   return cr0;
+}
+
+static void dw_spi_update_cr0(struct dw_spi *dws, struct spi_device *spi,
+ struct spi_transfer *transfer)
+{
+   struct chip_data *chip = spi_get_ctldata(spi);
+   u32 cr0 = chip->cr0;
+
+   cr0 |= (transfer->bits_per_word - 1);
+
+   if (!(dws->caps & DW_SPI_CAP_DWC_SSI))
+   cr0 |= chip->tmode << SPI_TMOD_OFFSET;
+   else
+   cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
+
dw_writel(dws, DW_SPI_CTRLR0, cr0);
 }
 
@@ -350,6 +361,7 @@ static void dw_spi_handle_err(struct spi_controller *master,
 /* This may be called twice for each spi dev */
 static int dw_spi_setup(struct spi_device *spi)
 {
+   struct dw_spi *dws = spi_controller_get_devdata(spi->controller);
struct chip_data *chip;
 
/* Only alloc on first setup */
@@ -373,6 +385,13 @@ static int dw_spi_setup(struct spi_device *spi)
dws->max_freq);
}
 
+   /*
+* Update CR0 data each time the setup callback is invoked since
+* the device parameters could have been changed, for instance, by
+* the MMC SPI driver or something else.
+*/
+   chip->cr0 = dw_spi_get_cr0(dws, spi);
+
chip->tmode = SPI_TMOD_TR;
 
return 0;
-- 
2.27.0



[PATCH 07/30] spi: dw: Use relaxed IO-methods to access FIFOs

2020-09-20 Thread Serge Semin
In accordance with [1], the relaxed methods are guaranteed to be ordered
with respect to other accesses from the same CPU thread to the same
peripheral. This is what we need during the data read/write from/to the
controller FIFOs, which is executed within a single IRQ handler or a
kernel task.

Such an optimization shall significantly speed the data reader and writer
up. For instance, utilizing the relaxed IO accessors on Baikal-T1 lets the
driver support SPI memory operations with a bus frequency three times
higher than if normal IO accessors were used.

[1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"
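
For reference, the difference boils down to the following contrast (a
conceptual sketch, not the driver code; DW_SPI_DR is the FIFO data
register):

    /* Fully-ordered accessor: implies a costly barrier on every access */
    val = readl(dws->regs + DW_SPI_DR);

    /*
     * Relaxed accessor: no barrier, yet still ordered with respect to
     * other accesses to the same peripheral from the same CPU thread,
     * which is all a tight FIFO drain/fill loop needs.
     */
    val = readl_relaxed(dws->regs + DW_SPI_DR);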

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw.h | 18 --
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw.h b/drivers/spi/spi-dw.h
index ff77f39047ce..063fa1b1352d 100644
--- a/drivers/spi/spi-dw.h
+++ b/drivers/spi/spi-dw.h
@@ -161,29 +161,19 @@ static inline u32 dw_readl(struct dw_spi *dws, u32 offset)
return __raw_readl(dws->regs + offset);
 }
 
-static inline u16 dw_readw(struct dw_spi *dws, u32 offset)
-{
-   return __raw_readw(dws->regs + offset);
-}
-
 static inline void dw_writel(struct dw_spi *dws, u32 offset, u32 val)
 {
__raw_writel(val, dws->regs + offset);
 }
 
-static inline void dw_writew(struct dw_spi *dws, u32 offset, u16 val)
-{
-   __raw_writew(val, dws->regs + offset);
-}
-
 static inline u32 dw_read_io_reg(struct dw_spi *dws, u32 offset)
 {
switch (dws->reg_io_width) {
case 2:
-   return dw_readw(dws, offset);
+   return readw_relaxed(dws->regs + offset);
case 4:
default:
-   return dw_readl(dws, offset);
+   return readl_relaxed(dws->regs + offset);
}
 }
 
@@ -191,11 +181,11 @@ static inline void dw_write_io_reg(struct dw_spi *dws, u32 offset, u32 val)
 {
switch (dws->reg_io_width) {
case 2:
-   dw_writew(dws, offset, val);
+   writew_relaxed(val, dws->regs + offset);
break;
case 4:
default:
-   dw_writel(dws, offset, val);
+   writel_relaxed(val, dws->regs + offset);
break;
}
 }
-- 
2.27.0



[PATCH v2 05/11] spi: dw-dma: Move DMA transfers submission to the channels prep methods

2020-09-20 Thread Serge Semin
Indeed we can freely move the dmaengine_submit() method invocation and the
Tx and Rx busy flag setting into the DMA Tx/Rx prepare methods. Since the
Tx/Rx preparation methods are now mainly used for the DMA transfers
submission, here we suggest to rename them to have the _submit_{r,t}x
suffix instead.

This alteration serves three purposes: first, it is another code
preparation before adding the one-by-one DMA SG entries transmission;
second, the dma_async_tx_descriptor descriptor is now used locally only
in the new DMA transfers submission methods (this will be cleaned up a bit
later); third, the generic transfer method becomes more readable, with the
submission, execution and wait procedures now transparently split up
instead of having a preparation stage, intermixed submission/execution and
a wait procedure.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index d2a67dee1a66..769d10ca74b4 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -272,7 +272,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
 }
 
 static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
 
@@ -287,6 +287,9 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
 
+   dmaengine_submit(txdesc);
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+
return txdesc;
 }
 
@@ -364,7 +367,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
+static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
@@ -380,6 +383,9 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
 
+   dmaengine_submit(rxdesc);
+   set_bit(RX_BUSY, &dws->dma_chan_busy);
+
return rxdesc;
 }
 
@@ -426,25 +432,21 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
struct dma_async_tx_descriptor *txdesc, *rxdesc;
int ret;
 
-   /* Prepare the TX dma transfer */
-   txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+   /* Submit the DMA Tx transfer */
+   txdesc = dw_spi_dma_submit_tx(dws, xfer);
if (!txdesc)
return -EINVAL;
 
-   /* Prepare the RX dma transfer */
+   /* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   rxdesc = dw_spi_dma_submit_rx(dws, xfer);
if (!rxdesc)
return -EINVAL;
 
/* rx must be started before tx due to spi instinct */
-   set_bit(RX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(rxdesc);
dma_async_issue_pending(dws->rxchan);
}
 
-   set_bit(TX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(txdesc);
dma_async_issue_pending(dws->txchan);
 
ret = dw_spi_dma_wait(dws, xfer);
-- 
2.27.0



[PATCH v2 01/11] spi: dw-dma: Set DMA Level registers on init

2020-09-20 Thread Serge Semin
Indeed the registers' content doesn't get cleared when the SPI controller
is disabled and enabled. The max burst lengths aren't changed since the Rx
and Tx DMA channels are requested at the init stage and are kept acquired
until the device is removed. Obviously the SPI controller FIFO depth can't
be changed either. Due to all of that we can safely move the DMA Transmit
and Receive data level registers initialization to the SPI controller DMA
init stage (when the SPI controller is being probed) instead of doing it
for each SPI transfer when dma_setup is called. This shall speed the
DMA-based SPI transfer initialization up a bit, particularly if the APB
bus is relatively slow.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index bb390ff67d1d..a7ff1e357f8b 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -49,6 +49,7 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
max_burst = RX_BURST_LEVEL;
 
dws->rxburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
 
ret = dma_get_slave_caps(dws->txchan, &caps);
if (!ret && caps.max_burst)
@@ -56,7 +57,19 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
else
max_burst = TX_BURST_LEVEL;
 
+   /*
+* Having a Rx DMA channel serviced with higher priority than a Tx DMA
+* channel might not be enough to provide a well balanced DMA-based
+* SPI transfer interface. There might still be moments when the Tx DMA
+* channel is occasionally handled faster than the Rx DMA channel.
+* That in its turn will eventually cause the SPI Rx FIFO overflow if
+* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
+* cleared by the Rx DMA channel. In order to fix the problem the Tx
+* DMA activity is intentionally slowed down by limiting the SPI Tx
+* FIFO depth with a value twice bigger than the Tx burst length.
+*/
dws->txburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
@@ -372,21 +385,6 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr = 0, dma_ctrl = 0;
 
-   /*
-* Having a Rx DMA channel serviced with higher priority than a Tx DMA
-* channel might not be enough to provide a well balanced DMA-based
-* SPI transfer interface. There might still be moments when the Tx DMA
-* channel is occasionally handled faster than the Rx DMA channel.
-* That in its turn will eventually cause the SPI Rx FIFO overflow if
-* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
-* cleared by the Rx DMA channel. In order to fix the problem the Tx
-* DMA activity is intentionally slowed down by limiting the SPI Tx
-* FIFO depth with a value twice bigger than the Tx burst length
-* calculated earlier by the dw_spi_dma_maxburst_init() method.
-*/
-   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
-   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
-
if (xfer->tx_buf)
dma_ctrl |= SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



[PATCH v2 07/11] spi: dw-dma: Remove DMA Tx-desc passing around

2020-09-20 Thread Serge Semin
It's pointless to pass the Rx and Tx transfers' DMA Tx-descriptors around,
since they are used in the Tx/Rx submit methods only. Instead, just return
the submission status from these methods. This alteration will make the
code less complex.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 31 ++-
 1 file changed, 14 insertions(+), 17 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index aa3900809126..9f70818acce6 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -271,8 +271,7 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
return dmaengine_slave_config(dws->txchan, );
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
dma_cookie_t cookie;
@@ -284,7 +283,7 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
-   return NULL;
+   return -ENOMEM;
 
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
@@ -293,12 +292,12 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
ret = dma_submit_error(cookie);
if (ret) {
dmaengine_terminate_sync(dws->txchan);
-   return NULL;
+   return ret;
}
 
set_bit(TX_BUSY, &dws->dma_chan_busy);
 
-   return txdesc;
+   return 0;
 }
 
 static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
@@ -375,8 +374,7 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, );
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
dma_cookie_t cookie;
@@ -388,7 +386,7 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
-   return NULL;
+   return -ENOMEM;
 
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
@@ -397,12 +395,12 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
ret = dma_submit_error(cookie);
if (ret) {
dmaengine_terminate_sync(dws->rxchan);
-   return NULL;
+   return ret;
}
 
set_bit(RX_BUSY, &dws->dma_chan_busy);
 
-   return rxdesc;
+   return 0;
 }
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -445,19 +443,18 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   struct dma_async_tx_descriptor *txdesc, *rxdesc;
int ret;
 
/* Submit the DMA Tx transfer */
-   txdesc = dw_spi_dma_submit_tx(dws, xfer);
-   if (!txdesc)
-   return -EINVAL;
+   ret = dw_spi_dma_submit_tx(dws, xfer);
+   if (ret)
+   return ret;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   rxdesc = dw_spi_dma_submit_rx(dws, xfer);
-   if (!rxdesc)
-   return -EINVAL;
+   ret = dw_spi_dma_submit_rx(dws, xfer);
+   if (ret)
+   return ret;
 
/* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
-- 
2.27.0



[PATCH v2 06/11] spi: dw-dma: Check DMA Tx-desc submission status

2020-09-20 Thread Serge Semin
We suggest to add the dmaengine_submit() return value test for errors. It
has been unnecessary while the driver was expected to be utilized only in
pair with the DW DMAC. But since the driver can now be used with any DMA
engine, it might be useful to track the errors on DMA submissions so as
not to miss them and end up with unpredictable driver behaviour.

Signed-off-by: Serge Semin 

---

Changelog v2:
- Replace negative conditional statements with the positive ones.
- Terminate the prepared descriptors on submission errors.
---
 drivers/spi/spi-dw-dma.c | 20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 769d10ca74b4..aa3900809126 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -275,6 +275,8 @@ static struct dma_async_tx_descriptor *
 dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
+   dma_cookie_t cookie;
+   int ret;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -287,7 +289,13 @@ dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
 
-   dmaengine_submit(txdesc);
+   cookie = dmaengine_submit(txdesc);
+   ret = dma_submit_error(cookie);
+   if (ret) {
+   dmaengine_terminate_sync(dws->txchan);
+   return NULL;
+   }
+
set_bit(TX_BUSY, &dws->dma_chan_busy);
 
return txdesc;
@@ -371,6 +379,8 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
+   dma_cookie_t cookie;
+   int ret;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -383,7 +393,13 @@ static struct dma_async_tx_descriptor *dw_spi_dma_submit_rx(struct dw_spi *dws,
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
 
-   dmaengine_submit(rxdesc);
+   cookie = dmaengine_submit(rxdesc);
+   ret = dma_submit_error(cookie);
+   if (ret) {
+   dmaengine_terminate_sync(dws->rxchan);
+   return NULL;
+   }
+
set_bit(RX_BUSY, &dws->dma_chan_busy);
 
return rxdesc;
-- 
2.27.0



[PATCH v2 08/11] spi: dw-dma: Detach DMA transfer into a dedicated method

2020-09-20 Thread Serge Semin
In order to add an alternative method of DMA-based SPI transfer, first we
need to detach the currently available one from the common code. Here we
move the normal DMA-based SPI transfer execution functionality into a
dedicated method. It will be utilized if either the DMA engine supports
an unlimited number of SG entries or a Tx-only SPI transfer is requested.
But for now just use it for any SPI transfer.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 9f70818acce6..f2baefcae9ae 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -441,7 +441,8 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
return 0;
 }
 
-static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_transfer_all(struct dw_spi *dws,
+  struct spi_transfer *xfer)
 {
int ret;
 
@@ -462,7 +463,14 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   return dw_spi_dma_wait(dws, xfer);
+}
+
+static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   int ret;
+
+   ret = dw_spi_dma_transfer_all(dws, xfer);
if (ret)
return ret;
 
-- 
2.27.0



[PATCH v2 09/11] spi: dw-dma: Move DMAC register cleanup to DMA transfer method

2020-09-20 Thread Serge Semin
DW APB SSI DMA driver doesn't use the native SPI core wait API since
commit bdbdf0f06337 ("spi: dw: Locally wait for the DMA transfers
completion"). Due to that the driver can now clear the DMAC register
in a single place synchronously with the DMA transactions completion
or failure. After that all the possible code paths are still covered:
1) The DMA completion callbacks are executed once the corresponding DMA
transactions are finished. When they are, one of them will eventually wake
the SPI messages pump kernel thread, and the dw_spi_dma_transfer_all()
method will clean the DMAC register as implied by this patch.
2) dma_stop is called when the SPI core detects an error either returned
from the transfer_one() callback or set in the SPI message status field.
Both types of errors will be noticed by the dw_spi_dma_transfer_all()
method.
3) dma_exit is called when either SPI controller driver or the
corresponding device is removed. In any case the SPI core will first
flush the SPI messages pump kernel thread, so any pending or in-fly
SPI transfers will be finished before that.

Due to all of that let's simplify the DW APB SSI DMA driver a bit and
move the DMAC register cleanup to a single place in the
dw_spi_dma_transfer_all() method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index f2baefcae9ae..935f073a3523 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -152,8 +152,6 @@ static void dw_spi_dma_exit(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
dma_release_channel(dws->rxchan);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws)
@@ -252,7 +250,6 @@ static void dw_spi_dma_tx_done(void *arg)
if (test_bit(RX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
complete(&dws->dma_completion);
 }
 
@@ -355,7 +352,6 @@ static void dw_spi_dma_rx_done(void *arg)
if (test_bit(TX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
complete(&dws->dma_completion);
 }
 
@@ -449,13 +445,13 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
/* Submit the DMA Tx transfer */
ret = dw_spi_dma_submit_tx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
ret = dw_spi_dma_submit_rx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
@@ -463,7 +459,12 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   return dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer);
+
+err_clear_dmac:
+   dw_writel(dws, DW_SPI_DMACR, 0);
+
+   return ret;
 }
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -496,8 +497,6 @@ static void dw_spi_dma_stop(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
clear_bit(RX_BUSY, &dws->dma_chan_busy);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static const struct dw_spi_dma_ops dw_spi_dma_mfld_ops = {
-- 
2.27.0



[PATCH v2 00/11] spi: dw-dma: Add max SG entries burst capability support

2020-09-20 Thread Serge Semin
 Tx and Rx SPI
transfer SG-lists in a single DMA transaction without software
intervention, and for the full-duplex SPI-transfers.

Link: https://lore.kernel.org/linux-spi/20200731075953.14416-1-sergey.se...@baikalelectronics.ru
Changelog v2:
- Replace negative conditional statements with the positive ones in the
  dw_spi_dma_submit_{tx,rx}() methods.
- Terminate the prepared DMA Tx-descriptors on submission errors.
- Split the patch "spi: dw-dma: Move DMA transfers submission to the
  channels prep methods" up into a series of more simple commits.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Georgy Vlasov 
Cc: Ramil Zaripov 
Cc: Pavel Parkhomenko 
Cc: Peter Ujfalusi 
Cc: Andy Shevchenko 
Cc: Andy Shevchenko 
Cc: Feng Tang 
Cc: Vinod Koul 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (11):
  spi: dw-dma: Set DMA Level registers on init
  spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified
  spi: dw-dma: Configure the DMA channels in dma_setup
  spi: dw-dma: Check rx_buf availability in the xfer method
  spi: dw-dma: Move DMA transfers submission to the channels prep
methods
  spi: dw-dma: Check DMA Tx-desc submission status
  spi: dw-dma: Remove DMA Tx-desc passing around
  spi: dw-dma: Detach DMA transfer into a dedicated method
  spi: dw-dma: Move DMAC register cleanup to DMA transfer method
  spi: dw-dma: Pass exact data to the DMA submit and wait methods
  spi: dw-dma: Add one-by-one SG list entries transfer

 drivers/spi/spi-dw-dma.c | 316 ++-
 drivers/spi/spi-dw.h |   1 +
 2 files changed, 245 insertions(+), 72 deletions(-)

-- 
2.27.0



[PATCH v2 02/11] spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified

2020-09-20 Thread Serge Semin
Since commit 46164fde6b78 ("spi: dw: Fix Rx-only DMA transfers"), if the
DMA interface is enabled, a Tx-buffer must be available in each SPI
transfer. It's required because, in order to activate the incoming data
reception, either DMA or the CPU must be pushing data out to the SPI bus.
But the DW APB SSI DMA driver code is still left in a state as if the
Tx-buffer might be optional, which is no longer true. Let's fix it so an
error is returned if no Tx-buffer is detected and DMA Tx is always
enabled.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index a7ff1e357f8b..1b013ac94a3f 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -262,9 +262,6 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
struct dma_slave_config txconf;
struct dma_async_tx_descriptor *txdesc;
 
-   if (!xfer->tx_buf)
-   return NULL;
-
memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
txconf.dst_addr = dws->dma_addr;
@@ -383,17 +380,19 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   u16 imr = 0, dma_ctrl = 0;
+   u16 imr, dma_ctrl;
 
-   if (xfer->tx_buf)
-   dma_ctrl |= SPI_DMA_TDMAE;
+   if (!xfer->tx_buf)
+   return -EINVAL;
+
+   /* Set the DMA handshaking interface */
+   dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
dma_ctrl |= SPI_DMA_RDMAE;
dw_writel(dws, DW_SPI_DMACR, dma_ctrl);
 
/* Set the interrupt mask */
-   if (xfer->tx_buf)
-   imr |= SPI_INT_TXOI;
+   imr = SPI_INT_TXOI;
if (xfer->rx_buf)
imr |= SPI_INT_RXUI | SPI_INT_RXOI;
spi_umask_intr(dws, imr);
@@ -412,6 +411,8 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 
/* Prepare the TX dma transfer */
txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+   if (!txdesc)
+   return -EINVAL;
 
/* Prepare the RX dma transfer */
rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
@@ -423,17 +424,15 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
dma_async_issue_pending(dws->rxchan);
}
 
-   if (txdesc) {
-   set_bit(TX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(txdesc);
-   dma_async_issue_pending(dws->txchan);
-   }
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+   dmaengine_submit(txdesc);
+   dma_async_issue_pending(dws->txchan);
 
ret = dw_spi_dma_wait(dws, xfer);
if (ret)
return ret;
 
-   if (txdesc && dws->master->cur_msg->status == -EINPROGRESS) {
+   if (dws->master->cur_msg->status == -EINPROGRESS) {
ret = dw_spi_dma_wait_tx_done(dws, xfer);
if (ret)
return ret;
-- 
2.27.0



[PATCH v2 03/11] spi: dw-dma: Configure the DMA channels in dma_setup

2020-09-20 Thread Serge Semin
Mainly this is a preparation patch before adding the one-by-one DMA SG
entries transmission. But logically the Tx and Rx DMA channels setup
should be performed in the dma_setup() callback anyway. So we'll move the
DMA slave channels' src/dst burst lengths, address and address width
configuration from the Tx/Rx channels preparation methods to dedicated
functions, and then make sure they're called at the DMA setup stage.

Note we now make sure the return value of the dmaengine_slave_config()
method doesn't indicate an error. That has been unnecessary while the DW
DMAC was the only utilized DMA engine, since its device_config() callback
always returns zero (though that might change in the future). But since
the DW APB SSI driver now supports any DMA back-end, we must make sure the
DMA device configuration has been successful before proceeding with
further setups.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 42 +---
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 1b013ac94a3f..da17897b8acb 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -256,11 +256,9 @@ static void dw_spi_dma_tx_done(void *arg)
complete(>dma_completion);
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_config_tx(struct dw_spi *dws)
 {
struct dma_slave_config txconf;
-   struct dma_async_tx_descriptor *txdesc;
 
memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
@@ -270,7 +268,13 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txconf.dst_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
txconf.device_fc = false;
 
-   dmaengine_slave_config(dws->txchan, &txconf);
+   return dmaengine_slave_config(dws->txchan, &txconf);
+}
+
+static struct dma_async_tx_descriptor *
+dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *txdesc;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -345,14 +349,9 @@ static void dw_spi_dma_rx_done(void *arg)
complete(>dma_completion);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_config_rx(struct dw_spi *dws)
 {
struct dma_slave_config rxconf;
-   struct dma_async_tx_descriptor *rxdesc;
-
-   if (!xfer->rx_buf)
-   return NULL;
 
memset(&rxconf, 0, sizeof(rxconf));
rxconf.direction = DMA_DEV_TO_MEM;
@@ -362,7 +361,16 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
rxconf.src_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
rxconf.device_fc = false;
 
-   dmaengine_slave_config(dws->rxchan, &rxconf);
+   return dmaengine_slave_config(dws->rxchan, &rxconf);
+}
+
+static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
+   struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *rxdesc;
+
+   if (!xfer->rx_buf)
+   return NULL;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -381,10 +389,22 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr, dma_ctrl;
+   int ret;
 
if (!xfer->tx_buf)
return -EINVAL;
 
+   /* Setup DMA channels */
+   ret = dw_spi_dma_config_tx(dws);
+   if (ret)
+   return ret;
+
+   if (xfer->rx_buf) {
+   ret = dw_spi_dma_config_rx(dws);
+   if (ret)
+   return ret;
+   }
+
/* Set the DMA handshaking interface */
dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



[PATCH v2 10/11] spi: dw-dma: Pass exact data to the DMA submit and wait methods

2020-09-20 Thread Serge Semin
In order to use the DMA submission and waiting methods in both the generic
DMA-based SPI transfer and the one-by-one DMA SG entries transmission
functions, we need to modify the dw_spi_dma_wait() and
dw_spi_dma_submit_tx()/dw_spi_dma_submit_rx() prototypes. So instead of
getting the SPI transfer object as the second argument, they must accept
the exact data structure instances they actually use. Those are the
current transfer length and the SPI bus frequency in the case of
dw_spi_dma_wait(), and the SG list together with the number of list
entries in the case of the DMA Tx/Rx submission methods.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 935f073a3523..f333c2e23bf6 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -188,12 +188,12 @@ static enum dma_slave_buswidth dw_spi_dma_convert_width(u8 n_bytes)
return DMA_SLAVE_BUSWIDTH_UNDEFINED;
 }
 
-static int dw_spi_dma_wait(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_wait(struct dw_spi *dws, unsigned int len, u32 speed)
 {
unsigned long long ms;
 
-   ms = xfer->len * MSEC_PER_SEC * BITS_PER_BYTE;
-   do_div(ms, xfer->effective_speed_hz);
+   ms = len * MSEC_PER_SEC * BITS_PER_BYTE;
+   do_div(ms, speed);
ms += ms + 200;
 
if (ms > UINT_MAX)
@@ -268,17 +268,16 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
return dmaengine_slave_config(dws->txchan, &txconf);
 }
 
-static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *txdesc;
dma_cookie_t cookie;
int ret;
 
-   txdesc = dmaengine_prep_slave_sg(dws->txchan,
-   xfer->tx_sg.sgl,
-   xfer->tx_sg.nents,
-   DMA_MEM_TO_DEV,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   txdesc = dmaengine_prep_slave_sg(dws->txchan, sgl, nents,
+DMA_MEM_TO_DEV,
+DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
return -ENOMEM;
 
@@ -370,17 +369,16 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *rxdesc;
dma_cookie_t cookie;
int ret;
 
-   rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
-   xfer->rx_sg.sgl,
-   xfer->rx_sg.nents,
-   DMA_DEV_TO_MEM,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   rxdesc = dmaengine_prep_slave_sg(dws->rxchan, sgl, nents,
+DMA_DEV_TO_MEM,
+DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
return -ENOMEM;
 
@@ -443,13 +441,14 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
int ret;
 
/* Submit the DMA Tx transfer */
-   ret = dw_spi_dma_submit_tx(dws, xfer);
+   ret = dw_spi_dma_submit_tx(dws, xfer->tx_sg.sgl, xfer->tx_sg.nents);
if (ret)
goto err_clear_dmac;
 
/* Submit the DMA Rx transfer if required */
if (xfer->rx_buf) {
-   ret = dw_spi_dma_submit_rx(dws, xfer);
+   ret = dw_spi_dma_submit_rx(dws, xfer->rx_sg.sgl,
+  xfer->rx_sg.nents);
if (ret)
goto err_clear_dmac;
 
@@ -459,7 +458,7 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer->len, xfer->effective_speed_hz);
 
 err_clear_dmac:
dw_writel(dws, DW_SPI_DMACR, 0);
-- 
2.27.0



[PATCH v2 11/11] spi: dw-dma: Add one-by-one SG list entries transfer

2020-09-20 Thread Serge Semin
In case at least one of the requested DMA engine channels doesn't
support the hardware-accelerated SG list entries traverse, the DMA driver
will most likely work that around by performing the IRQ-based SG list
entries resubmission. That might and will cause a problem if the DMA Tx
channel is recharged and re-executed before the Rx DMA channel. Due to
the non-deterministic IRQ-handler execution latency, the DMA Tx channel
will start pushing data to the SPI bus before the Rx DMA channel is even
reinitialized with the next inbound SG list entry. By doing so the DMA
Tx channel will implicitly start filling the DW APB SSI Rx FIFO up, which,
while the DMA Rx channel is being recharged and re-executed, will
eventually be overflown.

In order to solve the problem we have to feed the DMA engine with the SG
list entries one-by-one. That shall keep the DW APB SSI Tx and Rx FIFOs
synchronized and prevent the Rx FIFO overflow. Since in general the SPI
tx_sg and rx_sg lists may have different numbers of entries of different
lengths (though the total lengths should match), we virtually split the
SG lists into a set of DMA transfers, each of which has a length equal to
the minimum of the ordered SG entries' lengths.

The solution described above is only executed if a full-duplex SPI
transfer is requested and the DMA engine hasn't provided channels with
hardware accelerated SG list traverse capability to handle both SG
lists at once.
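
For illustration, here is a stand-alone user-space model of that
splitting (not the driver code; the segment lengths are made up, and as
in a real full-duplex SPI transfer the total lengths are assumed to
match):

#include <stdio.h>

/* Emit sub-transfers whose length is the minimum of the currently
 * ordered Tx/Rx segment remainders. */
static void split(const unsigned int tx[], int ntx,
		  const unsigned int rx[], int nrx)
{
	unsigned int tx_rem = 0, rx_rem = 0, len;
	int i = 0, j = 0;

	while ((i < ntx || tx_rem) && (j < nrx || rx_rem)) {
		if (!tx_rem)
			tx_rem = tx[i++];
		if (!rx_rem)
			rx_rem = rx[j++];

		len = tx_rem < rx_rem ? tx_rem : rx_rem;
		printf("DMA transfer of %u bytes\n", len);

		tx_rem -= len;
		rx_rem -= len;
	}
}

int main(void)
{
	const unsigned int tx[] = { 24, 16 }, rx[] = { 10, 10, 20 };

	split(tx, 2, rx, 3);	/* prints 10, 10, 4 and 16 */
	return 0;
}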

Signed-off-by: Serge Semin 
Suggested-by: Andy Shevchenko 
---
 drivers/spi/spi-dw-dma.c | 137 ++-
 drivers/spi/spi-dw.h |   1 +
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index f333c2e23bf6..1cbb5a9efbba 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -72,6 +72,23 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
+static void dw_spi_dma_sg_burst_init(struct dw_spi *dws)
+{
+   struct dma_slave_caps tx = {0}, rx = {0};
+
+   dma_get_slave_caps(dws->txchan, &tx);
+   dma_get_slave_caps(dws->rxchan, &rx);
+
+   if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0)
+   dws->dma_sg_burst = min(tx.max_sg_burst, rx.max_sg_burst);
+   else if (tx.max_sg_burst > 0)
+   dws->dma_sg_burst = tx.max_sg_burst;
+   else if (rx.max_sg_burst > 0)
+   dws->dma_sg_burst = rx.max_sg_burst;
+   else
+   dws->dma_sg_burst = 0;
+}
+
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 {
	struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx;
@@ -109,6 +126,8 @@ static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 
 free_rxchan:
@@ -138,6 +157,8 @@ static int dw_spi_dma_init_generic(struct device *dev, struct dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 }
 
@@ -466,11 +487,125 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
return ret;
 }
 
+/*
+ * If at least one of the requested DMA channels doesn't support the
+ * hardware-accelerated SG list entries traversal, the DMA driver will most
+ * likely work that around by performing the IRQ-based SG list entries
+ * resubmission. That will cause a problem if the DMA Tx channel is
+ * recharged and re-executed before the Rx DMA channel. Due to the
+ * non-deterministic IRQ-handler execution latency the DMA Tx channel will
+ * start pushing data to the SPI bus before the Rx DMA channel is even
+ * reinitialized with the next inbound SG list entry. By doing so the DMA Tx
+ * channel will implicitly start filling the DW APB SSI Rx FIFO up, which
+ * will eventually overflow while the DMA Rx channel is still being
+ * recharged and re-executed.
+ *
+ * In order to solve the problem we have to feed the DMA engine with SG list
+ * entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs
+ * synchronized and prevent the Rx FIFO overflow. Since in general the tx_sg
+ * and rx_sg lists may have different numbers of entries of different lengths
+ * (though the total lengths must match), let's virtually split the SG lists
+ * into a set of DMA transfers whose lengths are the minimums of the ordered
+ * SG-entries' lengths. An ASCII sketch of the implemented algo follows:
+ *  xfer->len
+ *|___|
+ * tx_sg list:|___||__|
+ * rx_sg list:|_|||
+ * DMA transfers: |_|_|__|_|__|
+ *
+ * Note in order to have this workaround solve the denoted problem, the DMA
+ * engine driver should properly initialize the max_sg_burst capability and set
+ * the DMA device max segment size parameter with maximum data block size the
+ * DMA engine supports.
+ */
+
+static int dw_spi_dma_transfer_one(struct dw_spi *dw

[PATCH v2 04/11] spi: dw-dma: Check rx_buf availability in the xfer method

2020-09-20 Thread Serge Semin
Checking rx_buf for being NULL and returning NULL from the Rx-channel
preparation method doesn't let us distinguish that situation from errors
happening during the Rx SG-list preparation. So it's better to make sure
that rx_buf is not NULL and full-duplex communication is requested prior
to calling the Rx preparation method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index da17897b8acb..d2a67dee1a66 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -369,9 +369,6 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 {
struct dma_async_tx_descriptor *rxdesc;
 
-   if (!xfer->rx_buf)
-   return NULL;
-
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
xfer->rx_sg.nents,
@@ -435,10 +432,12 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
return -EINVAL;
 
/* Prepare the RX dma transfer */
-   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   if (xfer->rx_buf) {
+   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   if (!rxdesc)
+   return -EINVAL;
 
-   /* rx must be started before tx due to spi instinct */
-   if (rxdesc) {
+   /* rx must be started before tx due to spi instinct */
		set_bit(RX_BUSY, &dws->dma_chan_busy);
dmaengine_submit(rxdesc);
dma_async_issue_pending(dws->rxchan);
@@ -458,7 +457,7 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
return ret;
}
 
-   if (rxdesc && dws->master->cur_msg->status == -EINPROGRESS)
+   if (xfer->rx_buf && dws->master->cur_msg->status == -EINPROGRESS)
ret = dw_spi_dma_wait_rx_done(dws);
 
return ret;
-- 
2.27.0



[PATCH v3] mtd: physmap: Add Baikal-T1 physically mapped ROM support

2020-09-20 Thread Serge Semin
Baikal-T1 Boot Controller provides access to read-only storages, which are
physically mapped into the SoC MMIO space. In particular there is an
Internal ROM embedded into the SoC with pre-installed firmware, an
externally attached SPI flash (also accessed in read-only mode) and a
memory region, which mirrors one of them in accordance with the currently
enabled system boot mode (also called Boot ROM).

This commit adds the Internal ROM support to the physmap driver of the MTD
kernel subsystem. The driver will create the Internal ROM MTD as long as
it is defined in the system dts file. The physically mapped SPI flash
region will be used to implement the SPI-mem interface. The mirroring
memory region won't be accessible directly since it's redundant due to
both bootable regions being exposed anyway.

Note we had to create dedicated code for the ROMs since reads from the
corresponding memory regions must be done with dword-aligned addresses.
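
To make the access rule concrete, here is a tiny stand-alone model of
reading one byte from a memory that only tolerates aligned 32-bit loads
(illustrative only, not the driver code; like the driver it assumes a
little-endian layout):

#include <stdint.h>
#include <stdio.h>

static uint8_t rom_read_byte(const uint32_t *rom, size_t ofs)
{
	uint32_t word = rom[ofs / 4];	/* always a dword-aligned load */

	return (uint8_t)(word >> ((ofs % 4) * 8));	/* LE byte extract */
}

int main(void)
{
	const uint32_t rom[] = { 0x03020100, 0x07060504 };

	printf("0x%02x\n", rom_read_byte(rom, 5));	/* prints 0x05 */
	return 0;
}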

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Lee Jones 
Cc: linux-m...@vger.kernel.org

---

Link: 
https://lore.kernel.org/linux-mtd/20200508100905.5854-1-sergey.se...@baikalelectronics.ru/
Changelog v2:
- Resend.

Link: 
https://lore.kernel.org/linux-mtd/20200526225849.20985-1-sergey.se...@baikalelectronics.ru
Changelog v3:
- Alphabetically order the include statements.
- Discard dummy IOs.
- Discard Baikal-T1 Boot ROM support. The directly mapped SPI flash
  memory will be used in the DW APB SSI driver instead.
---
 drivers/mtd/maps/Kconfig   |  11 +++
 drivers/mtd/maps/Makefile  |   1 +
 drivers/mtd/maps/physmap-bt1-rom.c | 126 +
 drivers/mtd/maps/physmap-bt1-rom.h |  17 
 drivers/mtd/maps/physmap-core.c|   5 ++
 5 files changed, 160 insertions(+)
 create mode 100644 drivers/mtd/maps/physmap-bt1-rom.c
 create mode 100644 drivers/mtd/maps/physmap-bt1-rom.h

diff --git a/drivers/mtd/maps/Kconfig b/drivers/mtd/maps/Kconfig
index fd37553f1b07..6650acbc961e 100644
--- a/drivers/mtd/maps/Kconfig
+++ b/drivers/mtd/maps/Kconfig
@@ -75,6 +75,17 @@ config MTD_PHYSMAP_OF
  physically into the CPU's memory. The mapping description here is
  taken from OF device tree.
 
+config MTD_PHYSMAP_BT1_ROM
+   bool "Baikal-T1 Boot ROMs OF-based physical memory map handling"
+   depends on MTD_PHYSMAP_OF
+   depends on MIPS_BAIKAL_T1 || COMPILE_TEST
+   select MTD_COMPLEX_MAPPINGS
+   select MULTIPLEXER
+   select MUX_MMIO
+   help
+ This provides some extra DT physmap parsing for the Baikal-T1
+ platforms: ROM detection and setting up ROM-specific accessors.
+
 config MTD_PHYSMAP_VERSATILE
bool "ARM Versatile OF-based physical memory map handling"
depends on MTD_PHYSMAP_OF
diff --git a/drivers/mtd/maps/Makefile b/drivers/mtd/maps/Makefile
index c0da86a5d26f..79f018cf412f 100644
--- a/drivers/mtd/maps/Makefile
+++ b/drivers/mtd/maps/Makefile
@@ -18,6 +18,7 @@ obj-$(CONFIG_MTD_CK804XROM)   += ck804xrom.o
 obj-$(CONFIG_MTD_TSUNAMI)  += tsunami_flash.o
 obj-$(CONFIG_MTD_PXA2XX)   += pxa2xx-flash.o
 physmap-objs-y += physmap-core.o
+physmap-objs-$(CONFIG_MTD_PHYSMAP_BT1_ROM) += physmap-bt1-rom.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_VERSATILE) += physmap-versatile.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_GEMINI) += physmap-gemini.o
 physmap-objs-$(CONFIG_MTD_PHYSMAP_IXP4XX) += physmap-ixp4xx.o
diff --git a/drivers/mtd/maps/physmap-bt1-rom.c b/drivers/mtd/maps/physmap-bt1-rom.c
new file mode 100644
index ..27cfe1c63a2e
--- /dev/null
+++ b/drivers/mtd/maps/physmap-bt1-rom.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+ *
+ * Authors:
+ *   Serge Semin 
+ *
+ * Baikal-T1 Physically Mapped Internal ROM driver
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "physmap-bt1-rom.h"
+
+/*
+ * Baikal-T1 SoC ROMs are only accessible with dword-aligned instructions.
+ * We have to take this into account when implementing the data read-methods.
+ * Note there is no need to bother with endianness, since both the Baikal-T1
+ * CPU and MMIO are LE.
+ */
+static map_word __xipram bt1_rom_map_read(struct map_info *map,
+ unsigned long ofs)
+{
+   void __iomem *src = map->virt + ofs;
+   unsigned long shift;
+   map_word ret;
+   u32 data;
+
+   /* Read data within offset dword. */
+   shift = (unsigned long)src & 0x3;
+   data = readl_relaxed(src - shift);
+   if (!shift) {
+   ret.x[0] = data;
+   return ret;
+   }
+   ret.x[0] = data >> (shift * BITS_PER_BYTE);
+
+   /* Read data from the next dword. */
+   shift = 4 - shift;
+   if (ofs + shift >= map->size)
+   return ret;
+
+   d

[PATCH] clk: baikal-t1: Mark Ethernet PLL as critical

2020-09-20 Thread Serge Semin
We've discovered that disabling the so-called Ethernet PLL causes reset of
the devices consuming its outgoing clock. The resets happen automatically
even if each underlying clock gate is turned off. Due to that we can't
disable the Ethernet PLL until the kernel is prepared for the corresponding
resets. So for now just mark the PLL clock provider as critical.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: linux-m...@vger.kernel.org
---
 drivers/clk/baikal-t1/clk-ccu-pll.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/clk/baikal-t1/clk-ccu-pll.c b/drivers/clk/baikal-t1/clk-ccu-pll.c
index 1eec8c0b8f50..2445d4b12baf 100644
--- a/drivers/clk/baikal-t1/clk-ccu-pll.c
+++ b/drivers/clk/baikal-t1/clk-ccu-pll.c
@@ -51,11 +51,13 @@ struct ccu_pll_info {
 };
 
 /*
- * Mark as critical all PLLs except Ethernet one. CPU and DDR PLLs are sources
- * of CPU cores and DDR controller reference clocks, due to which they
- * obviously shouldn't be ever gated. SATA and PCIe PLLs are the parents of
- * APB-bus and DDR controller AXI-bus clocks. If they are gated the system will
- * be unusable.
+ * Alas we have to mark all PLLs as critical. CPU and DDR PLLs are sources of
+ * CPU cores and DDR controller reference clocks, due to which they obviously
+ * shouldn't ever be gated. SATA and PCIe PLLs are the parents of APB-bus and
+ * DDR controller AXI-bus clocks. If they are gated the system will be
+ * unusable. Moreover disabling SATA and Ethernet PLLs causes automatic reset
+ * of the corresponding subsystems. So until we are ready to re-initialize
+ * all the devices consuming those PLLs, they will be marked as critical too.
  */
 static const struct ccu_pll_info pll_info[] = {
CCU_PLL_INFO(CCU_CPU_PLL, "cpu_pll", "ref_clk", CCU_CPU_PLL_BASE,
@@ -67,7 +69,7 @@ static const struct ccu_pll_info pll_info[] = {
CCU_PLL_INFO(CCU_PCIE_PLL, "pcie_pll", "ref_clk", CCU_PCIE_PLL_BASE,
 CLK_IS_CRITICAL),
CCU_PLL_INFO(CCU_ETH_PLL, "eth_pll", "ref_clk", CCU_ETH_PLL_BASE,
-CLK_SET_RATE_GATE)
+CLK_IS_CRITICAL | CLK_SET_RATE_GATE)
 };
 
 struct ccu_pll_data {
-- 
2.27.0



[PATCH 2/3] hwmon: bt1-pvt: Cache current update timeout

2020-09-20 Thread Serge Semin
Instead of converting the update timeout data to milliseconds each time
in the read procedure, let's preserve the currently set timeout in a
dedicated driver private data cache. The cached value will then be used
in the timeout read method and in the alarm-less data conversion to
prevent the caller task from hanging in case the PVT sensor is suddenly
powered down.
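
The cached value is the ref-clock based conversion delay
Ttotal = N / Fclk + Tmin described in the in-code comment below. A quick
back-of-the-envelope check with made-up numbers (these are not the real
Baikal-T1 constants):

#include <stdio.h>

int main(void)
{
	unsigned long long n = 24999;		/* timer ticks */
	unsigned long long fclk = 25000000;	/* ref clock rate, Hz */
	unsigned long long tmin_ns = 52000;	/* conversion latency, ns */
	unsigned long long total_ns = n * 1000000000ULL / fclk + tmin_ns;

	printf("Ttotal = %llu us\n", total_ns / 1000);	/* ~1051 us */
	return 0;
}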

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 85 ++---
 drivers/hwmon/bt1-pvt.h |  3 ++
 2 files changed, 49 insertions(+), 39 deletions(-)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index f4b7353c078a..2600426a3b21 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -655,44 +655,16 @@ static int pvt_write_trim(struct pvt_hwmon *pvt, long val)
 
 static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val)
 {
-   unsigned long rate;
-   ktime_t kt;
-   u32 data;
-
-   rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
-   if (!rate)
-   return -ENODEV;
-
-   /*
-* Don't bother with mutex here, since we just read data from MMIO.
-* We also have to scale the ticks timeout up to compensate the
-* ms-ns-data translations.
-*/
-   data = readl(pvt->regs + PVT_TTIMEOUT) + 1;
+   int ret;
 
-   /*
-* Calculate ref-clock based delay (Ttotal) between two consecutive
-* data samples of the same sensor. So we first must calculate the
-* delay introduced by the internal ref-clock timer (Tref * Fclk).
-* Then add the constant timeout cuased by each conversion latency
-* (Tmin). The basic formulae for each conversion is following:
-*   Ttotal = Tref * Fclk + Tmin
-* Note if alarms are enabled the sensors are polled one after
-* another, so in order to have the delay being applicable for each
-* sensor the requested value must be equally redistirbuted.
-*/
-#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
-   kt = ktime_set(PVT_SENSORS_NUM * (u64)data, 0);
-   kt = ktime_divns(kt, rate);
-   kt = ktime_add_ns(kt, PVT_SENSORS_NUM * PVT_TOUT_MIN);
-#else
-   kt = ktime_set(data, 0);
-   kt = ktime_divns(kt, rate);
-   kt = ktime_add_ns(kt, PVT_TOUT_MIN);
-#endif
+   ret = mutex_lock_interruptible(&pvt->iface_mtx);
+   if (ret)
+   return ret;
 
/* Return the result in msec as hwmon sysfs interface requires. */
-   *val = ktime_to_ms(kt);
+   *val = ktime_to_ms(pvt->timeout);
+
+   mutex_unlock(&pvt->iface_mtx);
 
return 0;
 }
@@ -700,7 +672,7 @@ static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val)
 static int pvt_write_timeout(struct pvt_hwmon *pvt, long val)
 {
unsigned long rate;
-   ktime_t kt;
+   ktime_t kt, cache;
u32 data;
int ret;
 
@@ -713,7 +685,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long val)
 * between all available sensors to have the requested delay
 * applicable to each individual sensor.
 */
-   kt = ms_to_ktime(val);
+   cache = kt = ms_to_ktime(val);
 #if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
kt = ktime_divns(kt, PVT_SENSORS_NUM);
 #endif
@@ -742,6 +714,7 @@ static int pvt_write_timeout(struct pvt_hwmon *pvt, long val)
return ret;
 
pvt_set_tout(pvt, data);
+   pvt->timeout = cache;
 
mutex_unlock(>iface_mtx);
 
@@ -1018,10 +991,17 @@ static int pvt_check_pwr(struct pvt_hwmon *pvt)
return ret;
 }
 
-static void pvt_init_iface(struct pvt_hwmon *pvt)
+static int pvt_init_iface(struct pvt_hwmon *pvt)
 {
+   unsigned long rate;
u32 trim, temp;
 
+   rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
+   if (!rate) {
+   dev_err(pvt->dev, "Invalid reference clock rate\n");
+   return -ENODEV;
+   }
+
/*
 * Make sure all interrupts and controller are disabled so not to
 * accidentally have ISR executed before the driver data is fully
@@ -1036,12 +1016,37 @@ static void pvt_init_iface(struct pvt_hwmon *pvt)
pvt_set_mode(pvt, pvt_info[pvt->sensor].mode);
pvt_set_tout(pvt, PVT_TOUT_DEF);
 
+   /*
+* Preserve the current ref-clock based delay (Ttotal) between the
+* sensors data samples in the driver data so as not to recalculate it
+* each time on the data requests and timeout reads. It consists of the
+* delay introduced by the internal ref-clock timer (N / Fclk) and the
+* constant timeout caused by each conversion latency (Tmin):
+*   Ttotal = N / Fclk + Tmin
+* If alarms are enabled the sensors are polled one after another and
+* in order to get the next measurement of a particular sensor the
+* caller will have to wait for 

[PATCH 1/3] hwmon: bt1-pvt: Test sensor power supply on probe

2020-09-20 Thread Serge Semin
Baikal-T1 PVT sensor has got a dedicated power supply domain (fed by
the external GPVT/VPVT_18 pins). If it isn't powered up, the registers
will be accessible, but the sensor conversion just won't happen. Due to
that an attempt to read data from any PVT sensor will cause the reading
task to hang. For instance that will happen if the XP11 jumper isn't
installed on the Baikal-T1-based BFK3.1 board. Let's at least test whether
the conversion works during the device probe procedure. By doing so we'll
make sure that the PVT sensor is powered up at least at boot time.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index 94698cae0497..f4b7353c078a 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -982,6 +983,41 @@ static int pvt_request_clks(struct pvt_hwmon *pvt)
return 0;
 }
 
+static int pvt_check_pwr(struct pvt_hwmon *pvt)
+{
+   unsigned long tout;
+   int ret = 0;
+   u32 data;
+
+   /*
+* Test out the sensor conversion functionality. If it is not done on
+* time then the domain must have been unpowered and we won't be able
+* to use the device later in this driver.
+* Note if the power source is lost during the normal driver work the
+* data read procedure will either return -ETIMEDOUT (for the
+* alarm-less driver configuration) or just stop the repeated
+* conversion. In the latter case alas we won't be able to detect the
+* problem.
+*/
+   pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_ALL, PVT_INTR_ALL);
+   pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
+   pvt_set_tout(pvt, 0);
+   readl(pvt->regs + PVT_DATA);
+
+   tout = PVT_TOUT_MIN / NSEC_PER_USEC;
+   usleep_range(tout, 2 * tout);
+
+   data = readl(pvt->regs + PVT_DATA);
+   if (!(data & PVT_DATA_VALID)) {
+   ret = -ENODEV;
+   dev_err(pvt->dev, "Sensor is powered down\n");
+   }
+
+   pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+
+   return ret;
+}
+
 static void pvt_init_iface(struct pvt_hwmon *pvt)
 {
u32 trim, temp;
@@ -1109,6 +1145,10 @@ static int pvt_probe(struct platform_device *pdev)
if (ret)
return ret;
 
+   ret = pvt_check_pwr(pvt);
+   if (ret)
+   return ret;
+
pvt_init_iface(pvt);
 
ret = pvt_request_irq(pvt);
-- 
2.27.0



[PATCH 0/3] hwmon: bt1-pvt: Fix PVT sensor being unpowered

2020-09-20 Thread Serge Semin
Baikal-T1 PVT sensor has got a dedicated power domain with 1.8V intended
to be supplied via the external GPVT and VPVT_18 pins. If the power isn't
supplied, the sensor IO registers will be accessible, but the data
conversion just won't happen. Currently such a power loss will cause the
temperature and voltage read procedures to hang. This patchset fixes the
problem firstly by checking whether the conversion works at the sensor
probe procedure, and secondly by keeping the current conversion update
timeout in the private driver data cache and using it to set the
wait-completion timeout.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
Cc: Maxim Kaurkin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: linux-m...@vger.kernel.org
Cc: linux-hw...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (3):
  hwmon: bt1-pvt: Test sensor power supply on probe
  hwmon: bt1-pvt: Cache current update timeout
  hwmon: bt1-pvt: Wait for the completion with timeout

 drivers/hwmon/bt1-pvt.c | 138 
 drivers/hwmon/bt1-pvt.h |   3 +
 2 files changed, 101 insertions(+), 40 deletions(-)

-- 
2.27.0



[PATCH 3/3] hwmon: bt1-pvt: Wait for the completion with timeout

2020-09-20 Thread Serge Semin
If the PVT sensor is suddenly powered down while a caller is waiting for
the conversion completion, the request won't be finished and the task will
hang in this procedure until the power is back up again. Let's call the
wait_for_completion_timeout() method instead to prevent that. The cached
timeout is exactly what we need to predict how long a conversion could
normally last.

Fixes: 87976ce2825d ("hwmon: Add Baikal-T1 PVT sensor driver")
Signed-off-by: Serge Semin 
---
 drivers/hwmon/bt1-pvt.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/hwmon/bt1-pvt.c b/drivers/hwmon/bt1-pvt.c
index 2600426a3b21..3e1d56585b91 100644
--- a/drivers/hwmon/bt1-pvt.c
+++ b/drivers/hwmon/bt1-pvt.c
@@ -477,6 +477,7 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
 long *val)
 {
	struct pvt_cache *cache = &pvt->cache[type];
+   unsigned long timeout;
u32 data;
int ret;
 
@@ -500,7 +501,14 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, 0);
pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
 
-   wait_for_completion(&pvt->conversion);
+   /*
+* Wait with a timeout since if the sensor is suddenly powered down
+* the request won't be completed and the caller will hang in this
+* procedure until the power is back up again. Multiply the timeout
+* by a factor of two to prevent a false timeout.
+*/
+   timeout = 2 * usecs_to_jiffies(ktime_to_us(pvt->timeout));
+   ret = wait_for_completion_timeout(&pvt->conversion, timeout);
 
pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
@@ -510,6 +518,9 @@ static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
 
	mutex_unlock(&pvt->iface_mtx);
 
+   if (!ret)
+   return -ETIMEDOUT;
+
if (type == PVT_TEMP)
		*val = pvt_calc_poly(&poly_N_to_temp, data);
else
-- 
2.27.0



[PATCH 1/2] mips: Add strong UC ordering config

2020-09-20 Thread Serge Semin
In accordance with [1, 2] memory transactions using CCA=2 (Uncached
Cacheability and Coherency Attribute) are always strongly ordered. This
means the younger memory accesses using CCA=2 are never allowed to be
executed before older memory accesses using CCA=2 (no bypassing is
allowed), and Loads and Stores using CCA=2 are never speculative. It is
expected by the specification that the rest of the system maintains these
properties for processor initiated uncached accesses. So the system IO
interconnect doesn't reorder uncached transactions once they have left the
processor subsystem. Taking into account these properties and what [3]
says about the relaxed IO-accessors, we can infer that normal Loads and
Stores from/to CCA=2 memory, even without any additional execution barriers,
will fully comply with the {read,write}X_relaxed() methods' requirements.

Let's then convert the currently generated relaxed IO-accessors into pure
Loads and Stores. Seeing that commit 3d474dacae72 ("MIPS: Enforce strong
ordering for MMIO accessors") and commit 8b656253a7a4 ("MIPS: Provide
actually relaxed MMIO accessors") have already made the preparation in the
corresponding macro, we can do that just by replacing the "barrier"
parameter utilization with the "relax" one. Note the "barrier" macro
argument can then be removed, since it isn't really used anyway other than
being always assigned to 1.

Of course it would be foolish to believe that all the available MIPS-based
CPUs completely follow the denoted specification, especially considering
how old the architecture is. Instead we introduce a dedicated kernel
config, which when enabled will convert the relaxed IO-accessors into
pure Loads and Stores without any additional barriers around them. So if
some CPU supports the strongly ordered UC memory access, its platform can
enable that config and use the fully optimized relaxed IO methods. For
instance, the Baikal-T1 architecture support code will do that.

[1] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior,
p. 33

[2] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58

[3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
    Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
Cc: Maciej W. Rozycki 
---
 arch/mips/Kconfig  |  8 
 arch/mips/include/asm/io.h | 20 ++--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index c95fa3a2484c..2c82d927347d 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2066,6 +2066,14 @@ config WEAK_ORDERING
 #
 config WEAK_REORDERING_BEYOND_LLSC
bool
+
+#
+# CPU may not reorder reads and writes R->R, R->W, W->R, W->W within Uncached
+# Cacheability and Coherency Attribute (CCA=2)
+#
+config STRONG_UC_ORDERING
+   bool
+
 endmenu
 
 #
diff --git a/arch/mips/include/asm/io.h b/arch/mips/include/asm/io.h
index 78537aa23500..130c4b6458fc 100644
--- a/arch/mips/include/asm/io.h
+++ b/arch/mips/include/asm/io.h
@@ -213,7 +213,7 @@ void iounmap(const volatile void __iomem *addr);
 #define war_io_reorder_wmb()   barrier()
 #endif
 
-#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, barrier, relax, irq)\
+#define __BUILD_MEMORY_SINGLE(pfx, bwlq, type, relax, irq) \
\
 static inline void pfx##write##bwlq(type val,  \
volatile void __iomem *mem) \
@@ -221,7 +221,7 @@ static inline void pfx##write##bwlq(type val,   \
volatile type *__mem;   \
type __val; \
\
-   if (barrier)\
+   if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))  \
iobarrier_rw(); \
else\
war_io_reorder_wmb();   \
@@ -262,7 +262,7 @@ static inline type pfx##read##bwlq(const volatile void __iomem *mem)\
\
__mem = (void *)__swizzle_addr_##bwlq((unsigned long)(mem));\
\
-   if (barrier)\
+   if (!(relax && IS_ENABLED(CONFIG_STRONG_UC_ORDERING)))  \
iobarrier_rw(); \
  

[PATCH 2/2] mips: Introduce MIPS CM2 GCR Control register accessors

2020-09-20 Thread Serge Semin
For some reason these accessors have been absent from the MIPS kernel,
while some of them can be used to tune the MIPS code execution up (the
default values are fully acceptable though). For instance, in the framework
of MIPS P5600/P6600 (see [1] for details) if we are sure the IO
interconnect doesn't reorder the requests we can freely set
GCR_CONTROL.SYNCDIS, which will make the CM2 respond to SYNCs just
after a request is accepted on the L2/Memory interface instead of
executing the legacy SYNC and waiting for the full response from
L2/Memory. Needless to say this will significantly speed up the
{read,write}X() IO-accessors due to having more lightweight barriers
around the IO Loads and Stores. There are other MIPS Coherency Manager
optimizations available in the framework of that register, like cache ops
serialization limits, speculative read enable, etc, which can be useful
for the various MIPS platforms.

[1] MIPS32 P5600 Multiprocessing System Software User's Manual,
Document Number: MD01025, Revision 01.60, April 19, 2016, p. 400
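
A platform that trusts its IO interconnect could then apply the
optimization from its early setup code, e.g. prom_init(), roughly like
this (a hypothetical usage sketch, not part of this patch):

static void plat_cm2_optimize(void)
{
	/* Respond to SYNCs early if the interconnect never reorders IO. */
	if (mips_cm_revision() >= CM_REV_CM2)
		write_gcr_control(read_gcr_control() | CM_GCR_CONTROL_SYNCDIS);
}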

Signed-off-by: Serge Semin 

---

Folks, do you think it would be better to implement a dedicated config for
arch/mips/kernel/mips-cm.c code, which would disable the SI_SyncTxEn
acceptance by setting the GCR_CONTROL.SYNCDIS bit? Currently I intend to
set it in the out platform-specific prom_init() method.
---
 arch/mips/include/asm/mips-cm.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/arch/mips/include/asm/mips-cm.h b/arch/mips/include/asm/mips-cm.h
index aeae2effa123..17b2adf57e0c 100644
--- a/arch/mips/include/asm/mips-cm.h
+++ b/arch/mips/include/asm/mips-cm.h
@@ -143,6 +143,21 @@ GCR_ACCESSOR_RW(64, 0x008, base)
 #define  CM_GCR_BASE_CMDEFTGT_IOCU0	2
 #define  CM_GCR_BASE_CMDEFTGT_IOCU1	3
 
+/* GCR_CONTROL - Global CM2 Settings */
+GCR_ACCESSOR_RW(64, 0x010, control)
+#define CM_GCR_CONTROL_SYNCCTL BIT(16)
+#define CM_GCR_CONTROL_SYNCDIS BIT(5)
+#define CM_GCR_CONTROL_IVU_EN  BIT(4)
+#define CM_GCR_CONTROL_SHST_EN BIT(3)
+#define CM_GCR_CONTROL_PARK_EN BIT(2)
+#define CM_GCR_CONTROL_MMIO_LIMIT_DIS  BIT(1)
+#define CM_GCR_CONTROL_SPEC_READ_ENBIT(0)
+
+/* GCR_CONTROL2 - Global CM2 Settings (continue) */
+GCR_ACCESSOR_RW(64, 0x018, control2)
+#define CM_GCR_CONTROL2_L2_CACHEOP_LIMIT   GENMASK(19, 16)
+#define CM_GCR_CONTROL2_L1_CACHEOP_LIMIT   GENMASK(3, 0)
+
 /* GCR_ACCESS - Controls core/IOCU access to GCRs */
 GCR_ACCESSOR_RW(32, 0x020, access)
 #define CM_GCR_ACCESS_ACCESSEN GENMASK(7, 0)
-- 
2.27.0



[PATCH 0/2] mips: Introduce some IO-accessors optimizations

2020-09-20 Thread Serge Semin
It has been discovered that on our MIPS P5600-based CPU the IO accessors
aren't as fast as they could be, even taking into account a relatively
slow AXI2APB bridge embedded into the system interconnect. It turned out
we can introduce two types of optimizations. First we can remove the
execution barriers from the relaxed IO-accessors, as our CPU conforms to
the MIPS Coherency Protocol Specification [1, 2]. Of course it also
concerns the IO interconnect implementation. So in accordance with [3] we
suggest removing the barriers at least for the platforms which conform to
the specification the same way as ours. Second there is a dedicated
Coherency Manager control register, which can also be used to tune the IO
methods up. For some reason it hasn't been added to the MIPS arch code so
far, while it provides flags for instance to speed up the SYNC barrier on
the platforms with a non-reordering IO interconnect, to set the cache ops
serialization limits, to enable the speculative reads, etc. For now we
suggest adding just the macros with the CM2 GCR_CONTROL register accessors
and field descriptions, so any platform could use them to activate the
corresponding optimizations. For our platform we'll do this in the
framework of the Baikal-T1 platform code in the prom_init() method.

[1] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.2 Execution Order Behavior,
p. 33

[2] MIPS Coherence Protocol Specification, Document Number: MD00605,
Revision 01.01. September 14, 2015, 4.8.1 IO Device Access, p. 58

[3] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt,
Section "KERNEL I/O BARRIER EFFECTS"

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Vadim Vlasov 
Cc: Maciej W. Rozycki 
Cc: linux-m...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (2):
  mips: Add strong UC ordering config
  mips: Introduce MIPS CM2 GCR Control register accessors

 arch/mips/Kconfig   |  8 
 arch/mips/include/asm/io.h  | 20 ++--
 arch/mips/include/asm/mips-cm.h | 15 +++
 3 files changed, 33 insertions(+), 10 deletions(-)

-- 
2.27.0



Re: [PATCH] net: phy: realtek: fix rtl8211e rx/tx delay config

2020-09-18 Thread Serge Semin
Hello Andrew.

On Fri, Sep 18, 2020 at 03:54:03PM +0200, Andrew Lunn wrote:
> On Fri, Sep 18, 2020 at 06:55:16AM +, 劉偉權 wrote:
> > Hi Serge,
> 
> > Thanks for your reply. There is a confidential issue that realtek
> > doesn't offer the detail of a full register layout for configuration
> > register.
> 
> ...
> 
> > >* 0xa4 extension page (0x7) layout. It can be used to disable/enable
> > >* the RX/TX delays otherwise controlled by RXDLY/TXDLY pins. It can
> > >* also be used to customize the whole configuration register:
> > 
> > > -  * 8:6 = PHY Address, 5:4 = Auto-Negotiation, 3 = Interface Mode Select,
> > > -  * 2 = RX Delay, 1 = TX Delay, 0 = SELRGV (see original PHY datasheet
> > > -  * for details).
> > > +  * 13 = Force Tx RX Delay controlled by bit12 bit11,
> > > +  * 12 = RX Delay, 11 = TX Delay
> > 
> 
> > Here you've removed the register layout description and replaced it
> > with just three bits of info. So from now on the text above doesn't
> > really correspond to what follows.
> 
> > I might have forgotten something, but AFAIR that register bits
> > state mapped well to what was available on the corresponding
> > external pins.
> 
> Hi Willy
> 

> So it appears bits 3 to 8 have been reverse engineered. Unless you
> know from your confidential datasheet that these are wrong, please
> leave the comment alone.
> 
> If you confidential datasheet says that the usage of bits 0-2 is
> wrong, then please do correct that part.

I've got that info from Kyle's post here:
https://reviews.freebsd.org/D13591

My work on that problem was done more than a year ago, so I don't
remember all the details. But IIRC the very first nine bits [8:0] must be a
copy of what is set on the external configuration pins, as I described in
the comment. AFAIR I tried to manually change these pins [1] and detected
that change right there in the register. That fully fitted what Kyle wrote
in his post. Alas I can't repeat it to be completely sure at the moment due
to the lack of the hardware. So I might have missed something, and Willy's
confirmation about that would be more than welcome. What we can say for
sure now is that I didn't use a magic number in my patch. That could have
been a mistake judging by what Willy says in the commit-log...

Anyway, aside from the magic bits settings (which per Willy's patch appear
to be the Tx/Rx delays + Force Tx/Rx delay setting), Kyle also clears
bits 1-2 with a comment "Ensure both internal delays are turned off".
Willy, what can you say about that? Can we really leave them out of the
change? Kyle, could you give us a comment about your experience with that?
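
Meanwhile the current value of that register can at least be dumped with
the same paged-access sequence the driver already uses. An untested
sketch (error checks trimmed):

static int rtl8211e_read_cfg(struct phy_device *phydev)
{
	int oldpage, ret = 0;

	/* Select page 7, point 0x1e at the 0xa4 extension page, then
	 * read the configuration register 0x1c. */
	oldpage = phy_select_page(phydev, 0x7);
	if (oldpage >= 0) {
		__phy_write(phydev, 0x1e, 0xa4);
		ret = __phy_read(phydev, 0x1c);
	}

	return phy_restore_page(phydev, oldpage, ret);
}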

[1] RTL8211E-VB-CG, RTL8211E-VL-CG, RTL8211E-VL-CG, "INTEGRATED
    10/100/1000M ETHERNET TRANSCEIVER" Datasheet, Table 13. Configuration
    Register Definition, Rev. 1.6, 03 April 2012, Track ID: JATR-3375-16,
    p.16

-Sergey

> 
>Andrew


Re: [PATCH] net: phy: realtek: fix rtl8211e rx/tx delay config

2020-09-17 Thread Serge Semin
Hello Willy,
Thanks for the patch. My comments are below.

I've Cc'ed the U-boot/FreeBSD, who might be also interested in the solution
you've provided.

On Thu, Sep 17, 2020 at 09:47:33AM +0800, Willy Liu wrote:
> RGMII RX Delay and TX Delay settings will not applied if Force TX RX Delay
> Control bit is not set.
> Register bit for configuration pins:
> 13 = force Tx RX Delay controlled by bit12 bit11
> 12 = Tx Delay
> 11 = Rx Delay

This is very useful information, but it contradicts a bit the knowledge
we've currently got about that magical register. The current code in
U-boot configures the delays by means of other bits:
https://elixir.bootlin.com/u-boot/v2020.10-rc4/source/drivers/net/phy/realtek.c

Could you provide a full register layout, so we'd know for sure what that
register really does and finally close the question for good?

> 
> Fixes: f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx delays config")
> Signed-off-by: Willy Liu 
> ---
>  drivers/net/phy/realtek.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>  mode change 100644 => 100755 drivers/net/phy/realtek.c
> 
> diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
> old mode 100644
> new mode 100755
> index 95dbe5e..3fddd57
> --- a/drivers/net/phy/realtek.c
> +++ b/drivers/net/phy/realtek.c
> @@ -32,9 +32,9 @@
>  #define RTL8211F_TX_DELAY	BIT(8)
>  #define RTL8211F_RX_DELAY	BIT(3)
>  

> -#define RTL8211E_TX_DELAY	BIT(1)
> -#define RTL8211E_RX_DELAY	BIT(2)
> -#define RTL8211E_MODE_MII_GMII   BIT(3)
> +#define RTL8211E_CTRL_DELAY  BIT(13)
> +#define RTL8211E_TX_DELAYBIT(12)
> +#define RTL8211E_RX_DELAYBIT(11)

So, what do BIT(1) and BIT(2) control then? Could you explain?

>  
>  #define RTL8201F_ISR 0x1e
>  #define RTL8201F_IER 0x13
> @@ -249,13 +249,13 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>   val = 0;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_ID:
> - val = RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY | RTL8211E_RX_DELAY;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_RXID:
> - val = RTL8211E_RX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_RX_DELAY;
>   break;
>   case PHY_INTERFACE_MODE_RGMII_TXID:
> - val = RTL8211E_TX_DELAY;
> + val = RTL8211E_CTRL_DELAY | RTL8211E_TX_DELAY;
>   break;
>   default: /* the rest of the modes imply leaving delays as is. */
>   return 0;
> @@ -265,9 +265,8 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>* 0xa4 extension page (0x7) layout. It can be used to disable/enable
>* the RX/TX delays otherwise controlled by RXDLY/TXDLY pins. It can
>* also be used to customize the whole configuration register:

> -  * 8:6 = PHY Address, 5:4 = Auto-Negotiation, 3 = Interface Mode Select,
> -  * 2 = RX Delay, 1 = TX Delay, 0 = SELRGV (see original PHY datasheet
> -  * for details).
> +  * 13 = Force Tx RX Delay controlled by bit12 bit11,
> +  * 12 = RX Delay, 11 = TX Delay

Here you've removed the register layout description and replaced it with
just three bits of info. So from now on the text above doesn't really
correspond to what follows.

I might have forgotten something, but AFAIR that register's bits state
mapped well to what was available on the corresponding external pins. So
if you've got sacred knowledge of what configs are really hidden behind
that register, please open it up. This in-code comment would be a good
place to provide the full register description.

-Sergey

>*/
>   oldpage = phy_select_page(phydev, 0x7);
>   if (oldpage < 0)
> @@ -277,7 +276,8 @@ static int rtl8211e_config_init(struct phy_device *phydev)
>   if (ret)
>   goto err_restore_page;
>  
> - ret = __phy_modify(phydev, 0x1c, RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
> + ret = __phy_modify(phydev, 0x1c, RTL8211E_CTRL_DELAY
> +| RTL8211E_TX_DELAY | RTL8211E_RX_DELAY,
>  val);
>  
>  err_restore_page:
> -- 
> 1.9.1
> 


Re: [PATCH] gpio: dwapb: add support for new hisilicon ascend soc

2020-09-12 Thread Serge Semin
On Sun, Sep 06, 2020 at 06:18:07AM +, Dingtianhong wrote:

[...]

> > On Sat, Aug 22, 2020 at 12:27:53PM +0800, Ding Tianhong wrote:
> >> The hisilicon ascend soc's gpio is based on the synopsys DW gpio,
> >> and expand the register to support for INTCOMB_MASK, the new
> >> register is used to enable/disable the interrupt combine features.
> > 
> > I am wondering what the "Interrupt Combine" feature does. Is it the same
> > as the GPIO_INTR_IO pre-synthesize parameter of the DW_apb_gpio IP-core?
> > Is it possible to tune the DW APB GPIO controller at runtime to send
> > either combined or individual interrupts up to the IRQ controller?
> > 
> 
> looks like the same.
> 
> > If the latter is true, then probably having the "Interrupt Combine"
> > feature enabled must depend on whether individual or combined interrupts
> > are supplied in the dts, etc...
> > 
> 
> needed.
> 
> > Could you explain the way the feature works and the corresponding layout
> > register in more details?
> > 
> 
> Ok
> The hisilicon chip uses the register called GPIO_INTCOMB_MASK (offset is
> 0xffc) to enable the combine interrupt.
> It is very simple: if GPIO_INTCOMB_MASK.bit0 is 0, then the combined
> interrupt can't be used (default setting); otherwise if 1, then the 32
> ports can use the same irq line, that is all.

The main question is whether your hardware is capable of working with both
combined and individual interrupts. Is your version of the DW APB GPIO
controller able to generate both types of them? How is it connected to the
parent interrupt controller?

So if the GPIO and IRQ controllers are attached to each other with a single
gpio_intr_flag lane, then indeed it's a pure combined-IRQ design and we'll
have to make sure that GPIO_INTCOMB_MASK.bit0 is set to one before using
the DW GPIO block. If they are also connected to each other by 32
individual GPIO-IRQ gpio_intr{_n}N lanes, then setting or clearing the
GPIO_INTCOMB_MASK.bit0 flag will for example depend on the number of IRQ
lanes specified in a dts file. In the latter case the patch needs to be
altered, but it would provide better support of the hisilicon ascend soc's
GPIO capabilities.

-Sergey


Re: [PATCH] gpio: dwapb: add support for new hisilicon ascend soc

2020-08-25 Thread Serge Semin
Hello Ding,
Thanks for the patch. My comments are below.

On Sat, Aug 22, 2020 at 12:27:53PM +0800, Ding Tianhong wrote:
> The hisilicon ascend soc's gpio is based on the synopsys DW gpio,
> and expand the register to support for INTCOMB_MASK, the new
> register is used to enable/disable the interrupt combine features.

I am wondering what the "Interrupt Combine" feature does. Is it the same
as the GPIO_INTR_IO pre-synthesize parameter of the DW_apb_gpio IP-core?
Is it possible to tune the DW APB GPIO controller at runtime to send
either combined or individual interrupts up to the IRQ controller?

If the latter is true, then probably having the "Interrupt Combine"
feature enabled must depend on whether individual or combined interrupts
are supplied in the dts, etc...

Could you explain the way the feature works and the corresponding layout
register in more details?

> 
> Both support for ACPI and Device Tree.
> 
> Signed-off-by: Ding Tianhong 
> ---
>  drivers/gpio/gpio-dwapb.c | 28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
> index 1d8d55b..923b381 100644
> --- a/drivers/gpio/gpio-dwapb.c
> +++ b/drivers/gpio/gpio-dwapb.c
> @@ -49,6 +49,8 @@
>  #define GPIO_EXT_PORTC   0x58
>  #define GPIO_EXT_PORTD   0x5c

>  
> +#define GPIO_INTCOMB_MASK	0xffc

If the register is HiSilicon Ascend SoC-specific, then I'd suggest adding
a vendor-specific prefix like GPIO_HS_INTCOMB_MASK.

Also pay attention to the register naming. Here you define it as "INTCOMB",
later as "INT_COMB". It's better to have the same references everywhere: either
with underscore or without it.

Also please move the new register definition to be after the corresponding
feature macro definition (see the next comment for details).

> +
>  #define DWAPB_DRIVER_NAME"gpio-dwapb"
>  #define DWAPB_MAX_PORTS  4
>  
> @@ -58,6 +60,10 @@
>  
>  #define GPIO_REG_OFFSET_V2   1

>  
> +#define GPIO_REG_INT_COMB	2

Please move this macro to be define after the "GPIO_PORTA_EOI_V2" one, so to
make the blocked-like macro definitions order. Like this:
#define GPIO_REG_OFFSET_V2  BIT(0)  // Reg V2 Feature macro

#define GPIO_INTMASK_V2 0x44// Reg V2-specific macro
...
#define GPIO_PORTA_EOI_V2   0x40// Reg V2-specific macro

+ #define GPIO_REG_HS_INTCOMB   BIT(1)  // HiSilicon Feature macro
+ 
+ #define GPIO_HS_INTCOMB_MASK  0xffc   // HiSilicon-specific macro
+ 

I missed that in my series, but having the flags defined as decimals isn't good.
Could you convert GPIO_REG_HS_INTCOMB and GPIO_REG_OFFSET_V2 to be defined
as BIT(x)?

The same comment about HS_-prefixing. Perhaps GPIO_REG_HS_INTCOMB?

> +#define ENABLE_INT_COMB  1
> +#define DISABLE_INT_COMB 0

I don't really see a point in having these two macros defined. They are
basically just boolean flags. Please see my next comment.

> +
>  #define GPIO_INTMASK_V2  0x44
>  #define GPIO_INTTYPE_LEVEL_V20x34
>  #define GPIO_INT_POLARITY_V2 0x38
> @@ -354,6 +360,20 @@ static irqreturn_t dwapb_irq_handler_mfd(int irq, void *dev_id)
>   return IRQ_RETVAL(dwapb_do_irq(dev_id));
>  }
>  

> +static void dwapb_enable_inq_combine(struct dwapb_gpio *gpio, unsigned int enable)

inq_combine or int_combine?

"enable" is used here as boolean, then it should be declared as one.

> +{
> + u32 val;
> +
> + if (gpio->flags & GPIO_REG_INT_COMB) {
> + val = dwapb_read(gpio, GPIO_INTCOMB_MASK);
> + if (enable)
> + val |= BIT(0);
> + else
> + val &= BIT(0);
> + dwapb_write(gpio, GPIO_INTCOMB_MASK, val);
> + }
> +}
> +

Do you need to preserve the register content by using the read-modify-write
procedure here? If not, then inlining something like this should be fine:
if (gpio->flags & GPIO_REG_HS_INTCOMB)
dwapb_write(gpio, GPIO_HS_INTCOMB_MASK, 1);

>  static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
>struct dwapb_gpio_port *port,
>struct dwapb_port_property *pp)
> @@ -446,6 +466,8 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
>   irq_create_mapping(gpio->domain, hwirq);
>  
>   port->gc.to_irq = dwapb_gpio_to_irq;
> +
> + dwapb_enable_inq_combine(gpio, ENABLE_INT_COMB);
>  }
>  
>  static void dwapb_irq_teardown(struct dwapb_gpio *gpio)
> @@ -618,6 +640,7 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
>  static const struct of_device_id dwapb_of_match[] = {
>   { .compatible = "snps,dw-apb-gpio", .data = (void *)0},
>   { .compatible = "apm,xgene-gpio-v2", .data = (void 
> *)GPIO_REG_OFFSET_V2},
> + { .compatible = "hisi,ascend-gpio", .data = (void *)GPIO_REG_INT_COMB},
>  

Re: [PATCH -next] bus: bt1-apb: remove duplicate include

2020-08-18 Thread Serge Semin
On Wed, Aug 19, 2020 at 10:43:51AM +0800, Wang Hai wrote:
> Remove linux/clk.h which is included more than once
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Wang Hai 

Thanks!

Acked-by: Serge Semin 

> ---
>  drivers/bus/bt1-apb.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/bus/bt1-apb.c b/drivers/bus/bt1-apb.c
> index b25ff941e7c7..74b1b712ef3a 100644
> --- a/drivers/bus/bt1-apb.c
> +++ b/drivers/bus/bt1-apb.c
> @@ -22,7 +22,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #define APB_EHB_ISR  0x00
> -- 
> 2.17.1
> 


[PATCH v2 2/5] dmaengine: dw: Activate FIFO-mode for memory peripherals only

2020-07-31 Thread Serge Semin
CFGx.FIFO_MODE field controls a DMA-controller "FIFO readiness" criterion.
In other words it determines when to start pushing data out of a DW
DMAC channel FIFO to a destination peripheral or from a source
peripheral to the DW DMAC channel FIFO. Currently FIFO-mode is set to one
for all DW DMAC channels. It means they are tuned to flush data out of
FIFO (to a memory peripheral or by accepting the burst transaction
requests) when FIFO is at least half-full (except at the end of the block
transfer, when FIFO-flush mode is activated) and are configured to get
data to the FIFO when it's at least half-empty.

Such configuration is a good choice when there is no slave device involved
in the DMA transfers. In that case the number of bursts per block is less
than when CFGx.FIFO_MODE = 0 and, hence, the bus utilization will improve.
But the latency of DMA transfers may increase when CFGx.FIFO_MODE = 1,
since the DW DMAC will wait for the channel FIFO contents to be either
half-full or half-empty depending on having the destination or the source
transfers. Such latencies might be dangerous in case the DMA transfers
are expected to be performed from/to a slave device. Since normally
peripheral devices keep data in internal FIFOs, any latency at some
critical moment may cause one to overflow and consequently lose
data. This especially concerns a case when either a peripheral device is
relatively fast or the DW DMAC engine is relatively slow with respect to
the incoming data pace.

In order to solve problems, which might be caused by the latencies
described above, let's enable the FIFO half-full/half-empty "FIFO
readiness" criterion only for DMA transfers with no slave device involved.
Thanks to the commit 99ba8b9b0d97 ("dmaengine: dw: Initialize channel
before each transfer") we can freely do that in the generic
dw_dma_initialize_chan() method.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/dma/dw/dw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index 7a085b3c1854..d9810980920a 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -14,7 +14,7 @@
 static void dw_dma_initialize_chan(struct dw_dma_chan *dwc)
 {
struct dw_dma *dw = to_dw_dma(dwc->chan.device);
-   u32 cfghi = DWC_CFGH_FIFO_MODE;
+   u32 cfghi = is_slave_direction(dwc->direction) ? 0 : DWC_CFGH_FIFO_MODE;
u32 cfglo = DWC_CFGL_CH_PRIOR(dwc->priority);
bool hs_polarity = dwc->dws.hs_polarity;
 
-- 
2.27.0



[PATCH v2 0/5] dmaengine: dw: Introduce non-mem peripherals optimizations

2020-07-31 Thread Serge Semin
After a lot of tests and thorough DW DMAC databook studying we've
discovered that the driver can be optimized especially when it comes to
working with non-memory peripherals.

First of all we've found out that since each DW DMAC channel can
be synthesized with different parameters, even two channels that are
configured to perform the same DMA transactions may execute them
with different performance. Since some DMA client devices might be
sensitive to such an important parameter as performance, it is a good
idea to let them request only suitable DMA channels. In this patchset we
introduce functionality which makes it possible by passing the DMA
channels mask either over the "dmas" DT property or in the dw_dma_slave
platform data descriptor.

Secondly FIFO-mode of the "FIFO readiness" criterion is more suitable for
the pure memory DMA transfers, since it minimizes the system bus
utilization, but causes some performance drop. When it comes to working with
non-memory peripherals the DMA engine performance comes to the first
place. Since normally DMA client devices keep data in internal FIFOs, any
latency at some critical moment may cause a FIFO to overflow and
consequently lose data. So in order to minimize the chance of the DW DMAC
internal FIFO being a bottleneck during the DMA transfers to and from
non-memory peripherals we propose not to use FIFO-mode for them.

Thirdly it has been discovered that using the DMA transaction length is
redundant when calculating the destination transfer width for the
dev-to-mem DMA communications. That shall increase performance of the DMA
transfers with unaligned data length.

Finally there is a small optimization in the burst length setting. In
particular we've found out that according to the DW DMAC databook it's
pointless to set one for the memory peripherals since they don't have a
handshaking interface connected to the DMA controller. So we suggest
just ignoring the burst length config when it comes to setting the memory
peripherals up.

Link: 
https://lore.kernel.org/dmaengine/20200730154545.3965-1-sergey.se...@baikalelectronics.ru
Changelog v2:
- Add Databook version to the commits log.
- Use the statement "slave.channels >= BIT(dw->pdata->nr_channels)" to
  make sure the permitted DMA-channels pool is valid.
- Describe new DW DMAC "channels" mask in a single line even though it
  gets out of 80 columns limit.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Peter Ujfalusi 
Cc: Andy Shevchenko 
Cc: Rob Herring 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (5):
  dt-bindings: dma: dw: Add optional DMA-channels mask cell support
  dmaengine: dw: Activate FIFO-mode for memory peripherals only
  dmaengine: dw: Discard dlen from the dev-to-mem xfer width calculation
  dmaengine: dw: Ignore burst setting for memory peripherals
  dmaengine: dw: Add DMA-channels mask cell support

 .../devicetree/bindings/dma/snps,dma-spear1340.yaml| 7 +--
 drivers/dma/dw/core.c  | 6 +-
 drivers/dma/dw/dw.c| 7 +++
 drivers/dma/dw/idma32.c| 5 ++---
 drivers/dma/dw/of.c| 7 +--
 include/linux/platform_data/dma-dw.h   | 2 ++
 6 files changed, 22 insertions(+), 12 deletions(-)

-- 
2.27.0



[PATCH v2 3/5] dmaengine: dw: Discard dlen from the dev-to-mem xfer width calculation

2020-07-31 Thread Serge Semin
Indeed in case of the DMA_DEV_TO_MEM DMA transfers it's enough to take the
destination memory address and the destination master data width into
account to calculate the CTLx.DST_TR_WIDTH setting of the memory
peripheral. According to the DW DMAC IP-core Databook 2.18b (page 66,
Example 5) at the end of a DMA transfer, when the DMA-channel internal FIFO
is left with data less than for a single destination burst transaction,
the destination peripheral will enter the Single Transaction Region where
the DW DMA controller can complete a block transfer to the destination
using single transactions (non-burst transactions of CTLx.DST_TR_WIDTH
bytes). If there is not enough data in the DMA-channel internal FIFO for
even a single non-burst transaction of CTLx.DST_TR_WIDTH bytes, then the
channel enters "FIFO flush mode". That mode is activated to empty the FIFO
and flush the leftovers out to the memory peripheral. The flushing
procedure is simple. The data is sent to the memory by means of a set of
single transactions of CTLx.SRC_TR_WIDTH bytes. To sum up, it's redundant to
use the LLPs length to find out the CTLx.DST_TR_WIDTH parameter value,
since each DMA transfer will be completed with the CTLx.SRC_TR_WIDTH bytes
transaction if it is required.

We suggest removing the LLP entry length from the statement which
calculates the memory peripheral DMA transaction width since it's
redundant due to the feature described above. By doing so we'll improve
the memory bus utilization and speed up the DMA-channel performance for
DMA_DEV_TO_MEM DMA-transfers.
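
To illustrate what the remaining statement computes, here is a
stand-alone model of the CTLx.DST_TR_WIDTH calculation (user-space C
with made-up values; ffs() - 1 stands in for the kernel's __ffs()):

#include <stdio.h>
#include <strings.h>

int main(void)
{
	unsigned int data_width = 4;	/* master data width, bytes */
	unsigned long mem = 0x40001002;	/* 2-byte aligned DMA address */
	unsigned int mem_width = ffs((int)(data_width | mem)) - 1;

	/* mem_width is the power-of-two exponent of the item size */
	printf("DST_TR_WIDTH = %u => %u-byte items\n",
	       mem_width, 1u << mem_width);	/* 1 => 2-byte items */
	return 0;
}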

Signed-off-by: Serge Semin 
Acked-by: Andy Shevchenko 

---

Changelog v2:
- Add Databook version to the commit log.
---
 drivers/dma/dw/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 4700f2e87a62..3da0aea9fe25 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -723,7 +723,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
lli_write(desc, sar, reg);
lli_write(desc, dar, mem);
lli_write(desc, ctlhi, ctlhi);
-   mem_width = __ffs(data_width | mem | dlen);
+   mem_width = __ffs(data_width | mem);
		lli_write(desc, ctllo, ctllo | DWC_CTLL_DST_WIDTH(mem_width));
desc->len = dlen;
 
-- 
2.27.0



[PATCH v2 4/5] dmaengine: dw: Ignore burst setting for memory peripherals

2020-07-31 Thread Serge Semin
According to the DW DMA controller Databook 2.18b (page 40 "3.5 Memory
Peripherals") memory peripherals don't have handshaking interface
connected to the controller, therefore they can never be a flow
controller. Since the CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties
valid only for peripherals with a handshaking interface, we can freely
zero these fields out if the memory peripheral is selected to be the
source or the destination of the DMA transfers.

Note according to the databook, length of burst transfers to memory is
always equal to the number of data items available in a channel FIFO or
data items required to complete the block transfer, whichever is smaller;
length of burst transfers from memory is always equal to the space
available in a channel FIFO or number of data items required to complete
the block transfer, whichever is smaller.

Signed-off-by: Serge Semin 

---

Changelog v2:
- Add Databook version to the commit log.
---
 drivers/dma/dw/dw.c | 5 ++---
 drivers/dma/dw/idma32.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index d9810980920a..a4862263ff14 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -67,9 +67,8 @@ static size_t dw_dma_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
 static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
 {
	struct dma_slave_config *sconfig = &dwc->dma_sconfig;
-   bool is_slave = is_slave_direction(dwc->direction);
-   u8 smsize = is_slave ? sconfig->src_maxburst : DW_DMA_MSIZE_16;
-   u8 dmsize = is_slave ? sconfig->dst_maxburst : DW_DMA_MSIZE_16;
+   u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
+   u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
u8 p_master = dwc->dws.p_master;
u8 m_master = dwc->dws.m_master;
u8 dms = (dwc->direction == DMA_MEM_TO_DEV) ? p_master : m_master;
diff --git a/drivers/dma/dw/idma32.c b/drivers/dma/dw/idma32.c
index f00657308811..3ce44de25d33 100644
--- a/drivers/dma/dw/idma32.c
+++ b/drivers/dma/dw/idma32.c
@@ -73,9 +73,8 @@ static size_t idma32_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
 static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
 {
	struct dma_slave_config *sconfig = &dwc->dma_sconfig;
-   bool is_slave = is_slave_direction(dwc->direction);
-   u8 smsize = is_slave ? sconfig->src_maxburst : IDMA32_MSIZE_8;
-   u8 dmsize = is_slave ? sconfig->dst_maxburst : IDMA32_MSIZE_8;
+   u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
+   u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
 
return DWC_CTLL_LLP_D_EN | DWC_CTLL_LLP_S_EN |
   DWC_CTLL_DST_MSIZE(dmsize) | DWC_CTLL_SRC_MSIZE(smsize);
-- 
2.27.0



[PATCH v2 5/5] dmaengine: dw: Add DMA-channels mask cell support

2020-07-31 Thread Serge Semin
DW DMA IP-core provides a way to synthesize the DMA controller with
channels having different parameters like maximum burst-length,
multi-block support, maximum data width, etc. Those parameters both
explicitly and implicitly affect the channels performance. Since DMA slave
devices might be very demanding of the DMA performance, let's provide a
way for the slaves to be assigned the DW DMA channels whose performance,
according to the platform engineer, fulfills their requirements.
After this patch is applied it can be done by passing the mask of suitable
DMA-channels either directly in the dw_dma_slave structure instance or as
a fifth cell of the DMA DT-property. If mask is zero or not provided, then
there is no limitation on the channels allocation.

For instance Baikal-T1 SoC is equipped with a DW DMAC engine, which first
two channels are synthesized with max burst length of 16, while the rest
of the channels have been created with max-burst-len=4. It would seem that
the first two channels must be faster than the others and should be more
preferable for the time-critical DMA slave devices. In practice it turned
out that the situation is quite the opposite. The channels with
max-burst-len=4 demonstrated a better performance than the channels with
max-burst-len=16 even when they both had been initialized with the same
settings. The performance drop of the first two DMA-channels made them
unsuitable for the DW APB SSI slave device. No matter what settings they
are configured with, full-duplex SPI transfers occasionally experience the
Rx FIFO overflow. It means that the DMA-engine doesn't keep up with
incoming data pace even though the SPI-bus is enabled with speed of 25MHz
while the DW DMA controller is clocked with a 50MHz signal. No such
problem has been noticed for the channels synthesized with
max-burst-len=4.
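
For illustration, a client on a non-OF platform could restrict the allocation
to, say, channels 2..7 (mask 0xfc) with something like the following sketch
(the dws_param name is made up here, and the mandatory .dma_dev handle is
omitted as platform-specific):

#include <linux/bits.h>
#include <linux/platform_data/dma-dw.h>

/*
 * Sketch only: .dma_dev must additionally be pointed at the DW DMAC device
 * for dw_dma_filter() to match the controller.
 */
static struct dw_dma_slave dws_param = {
	.src_id   = 0,
	.dst_id   = 1,
	.m_master = 0,
	.p_master = 1,
	.channels = GENMASK(7, 2),	/* permit channels 2..7 only */
};

In DT the same mask would be passed in the new fifth cell, e.g.
dmas = <&dmac 0 0 1 0xfc>; a zero or omitted mask keeps the unrestricted
allocation behavior.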

Signed-off-by: Serge Semin 

---

Changelog v2:
- Use the statement "slave.channels >= BIT(dw->pdata->nr_channels)" to
  make sure the permitted DMA-channels pool is valid.
- Describe new DW DMAC "channels" mask in a single line even though it
  gets out of 80 columns limit.
---
 drivers/dma/dw/core.c| 4 
 drivers/dma/dw/of.c  | 7 +--
 include/linux/platform_data/dma-dw.h | 2 ++
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 3da0aea9fe25..5f7b9badb965 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -772,6 +772,10 @@ bool dw_dma_filter(struct dma_chan *chan, void *param)
if (dws->dma_dev != chan->device->dev)
return false;
 
+   /* permit channels in accordance with the channels mask */
+   if (dws->channels && !(dws->channels & dwc->mask))
+   return false;
+
/* We have to copy data since dws can be temporary storage */
	memcpy(&dwc->dws, dws, sizeof(struct dw_dma_slave));
 
diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
index 1474b3817ef4..c1cf7675b9d1 100644
--- a/drivers/dma/dw/of.c
+++ b/drivers/dma/dw/of.c
@@ -22,18 +22,21 @@ static struct dma_chan *dw_dma_of_xlate(struct of_phandle_args *dma_spec,
};
dma_cap_mask_t cap;
 
-   if (dma_spec->args_count != 3)
+   if (dma_spec->args_count < 3 || dma_spec->args_count > 4)
return NULL;
 
slave.src_id = dma_spec->args[0];
slave.dst_id = dma_spec->args[0];
slave.m_master = dma_spec->args[1];
slave.p_master = dma_spec->args[2];
+   if (dma_spec->args_count >= 4)
+   slave.channels = dma_spec->args[3];
 
if (WARN_ON(slave.src_id >= DW_DMA_MAX_NR_REQUESTS ||
slave.dst_id >= DW_DMA_MAX_NR_REQUESTS ||
slave.m_master >= dw->pdata->nr_masters ||
-   slave.p_master >= dw->pdata->nr_masters))
+   slave.p_master >= dw->pdata->nr_masters ||
+   slave.channels >= BIT(dw->pdata->nr_channels)))
return NULL;
 
dma_cap_zero(cap);
diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
index 4f681df85c27..4ca18e29c392 100644
--- a/include/linux/platform_data/dma-dw.h
+++ b/include/linux/platform_data/dma-dw.h
@@ -23,6 +23,7 @@
  * @dst_id:dst request line
  * @m_master:  memory master for transfers on allocated channel
  * @p_master:  peripheral master for transfers on allocated channel
+ * @channels:	mask of the channels permitted for allocation (zero value means any)
  * @hs_polarity:set active low polarity of handshake interface
  */
 struct dw_dma_slave {
@@ -31,6 +32,7 @@ struct dw_dma_slave {
u8  dst_id;
u8  m_master;
u8  p_master;
+   u8  channels;
boolhs_polarity;
 };
 
-- 
2.27.0



[PATCH v2 1/5] dt-bindings: dma: dw: Add optional DMA-channels mask cell support

2020-07-31 Thread Serge Semin
Each DW DMA controller channel can be synthesized with different
parameters like maximum burst-length, multi-block support, maximum data
width, etc. Most of these parameters determine the DW DMAC channels
performance in its own aspect. On the other hand these parameters can
be implicitly responsible for the channels performance degradation
(for instance multi-block support is a very useful feature, but having
it disabled during the DW DMAC synthesis will provide a more optimized
core). Since DMA slave devices may have a critical dependency on the DMA
engine performance, let's provide a way for the slave devices to have
the DMA-channels allocated from a pool of channels which, according
to the system engineer, fulfill their performance requirements.

The pool is determined by a mask optionally specified in the fifth
DMA-cell of the DMA DT-property. If the fifth cell is omitted from the
phandle arguments or the mask is zero, then the allocation will be
performed from a set of all channels provided by the DMA controller.

Signed-off-by: Serge Semin 
---
 .../devicetree/bindings/dma/snps,dma-spear1340.yaml| 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
index 20870f5c14dd..ef1d6879c158 100644
--- a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
+++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
@@ -18,12 +18,15 @@ properties:
 const: snps,dma-spear1340
 
   "#dma-cells":
-const: 3
+minimum: 3
+maximum: 4
 description: |
   First cell is a phandle pointing to the DMA controller. Second one is
   the DMA request line number. Third cell is the memory master identifier
   for transfers on dynamically allocated channel. Fourth cell is the
-  peripheral master identifier for transfers on an allocated channel.
+  peripheral master identifier for transfers on an allocated channel. Fifth
+  cell is an optional mask of the DMA channels permitted to be allocated
+  for the corresponding client device.
 
   reg:
 maxItems: 1
-- 
2.27.0



Re: [PATCH 2/5] dmaengine: dw: Activate FIFO-mode for memory peripherals only

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 10:22:54PM +0530, Vinod Koul wrote:
> On 30-07-20, 20:13, Serge Semin wrote:
> > On Thu, Jul 30, 2020 at 07:47:03PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 30, 2020 at 07:31:54PM +0300, Serge Semin wrote:
> > > > On Thu, Jul 30, 2020 at 07:24:28PM +0300, Andy Shevchenko wrote:
> > > > > On Thu, Jul 30, 2020 at 06:45:42PM +0300, Serge Semin wrote:
> > > 
> > > ...
> > > 
> > > > > > Thanks to the commit  ("dmaengine: dw: Initialize 
> > > > > > channel
> > > 
> > > ...
> > > 
> > > > > > Note the DMA-engine repository 
> > > > > > git.infradead.org/users/vkoul/slave-dma.git
> > > > > > isn't accessible. So I couldn't find out the Andy' commit hash to 
> > > > > > use it in
> > > > > > the log.
> 
> Yeah I moved tree to k.org after disk issue with infradead, change patch
> was on dmaengine ML
> 
> > > > > It's dmaengine.git on git.kernel.org.
> > > > 
> > > > Ah, thanks! I've just found out that the repo address has been changed. 
> > > > But I've
> > > > also scanned the "next" branch of the repo:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git
> > > > 
> > > > Your commit isn't there. Am I missing something?
> > > 
> > 
> > > It's a fix. It went to upstream branch (don't remember its name by heart 
> > > in
> > > Vinod's repo).
> 
> Yes it is Linus's tree now and in dmaengine you can find in fixes branch
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git/commit/?h=fixes=99ba8b9b0d9780e9937eb1d488d120e9e5c2533d

Thanks for pointing out the commit. I'll add the hash to the patch log and
resend the series shortly.

-Sergey

> 
> > 
> > Right. Found it. Thanks.
> > 
> > -Sergey
> > 
> > > 
> > > -- 
> > > With Best Regards,
> > > Andy Shevchenko
> > > 
> > > 
> 
> -- 
> ~Vinod


Re: clocksource: dw_apb_timer: commit 6d2e16a3181b broke Arria10 platform

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 10:56:37AM -0500, Dinh Nguyen wrote:
> Hi Serge,
> 
> On 7/31/20 1:48 AM, Serge Semin wrote:
> > Hello Dinh,
> > There must be something wrong with your timer2 and timer3 declared in the
> > Arria10 dts, because the patch didn't change anything for the first two
> > timers (timer0 and timer1). It just permits registering all DW APB Timers
> > found in the dts.
> > 
> > If those timers are broken, then you should have disabled them in the dts 
> > in the
> > first place. If they are normal, then you need to investigate further why
> > they cause the kernel panic.
> > 
> 

> Indeed, the dts reg entry for timer3 was broken. Thanks a lot for
> exposing the issue. Apologies for the noise.

No worries. Glad I could help.

-Sergey

> 
> Dinh


Re: [PATCH 0/8] spi: dw-dma: Add max SG entries burst capability support

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 12:26:12PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 31, 2020 at 10:59:45AM +0300, Serge Semin wrote:
> 
> ...
> 
> > Note since the DMA-engine subsystem in kernel 5.8-rcX doesn't have the
> > max_sg_burst capability supported, this series is intended to be applied
> > only after the "next"-branch of the DMA-engine repository is merged into
> > the mainline repo. Alternatively the series could be merged in through the
> > DMA-engine repo.
> 

> This needs to be thought through...

There is nothing to think about: either Mark completes the review/accepts the
series and Vinod agrees to merge it in through the DMA-engine repo, or we'll
have to wait until the next merge window has closed and then merge the series
in the traditional way through the SPI repo.

-Sergey

> 
> I gave some simple comments (and on top just try not to modify the same lines
> inside one series two or more times, e.g. ret = func() -> return func() -> ret
> = func() in one case).
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH 4/8] spi: dw-dma: Move DMA transfers submission to the channels prep methods

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 12:15:28PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 31, 2020 at 10:59:49AM +0300, Serge Semin wrote:
> > Indeed we can freely move the dmaengine_submit() method invocation and the
> > Tx and Rx busy flag setting into the DMA Tx/Rx prepare methods. By doing
> > so first we implement another preparation before adding the one-by-one DMA
> > SG entries transmission, second we now have the dma_async_tx_descriptor
> > descriptor used locally only in the new DMA transfers submission methods,
> > which makes the code less complex with no passing around the DMA Tx
> > descriptors, third we make the generic transfer method more readable, where
> > now the functionality of submission, execution and wait procedures is
> > transparently split up instead of having a preparation, intermixed
> > submission/execution and wait procedures. While at it we also add the
> > dmaengine_submit() return value test. It has been unnecessary for
> > DW DMAC, but should be done to support the generic DMA interface.
> > 
> > Note since the DMA channels preparation methods are now responsible for
> > the DMA transactions submission, we also rename them to
> > dw_spi_dma_submit_{tx,rx}().
> 
> ...
> 
> > +   cookie = dmaengine_submit(txdesc);
> > +   ret = dma_submit_error(cookie);
> > +   if (!ret)
> 

> Use traditional pattern
>   if (ret)
>   return ret;
> 
> Same for below.

Ok.

> 
> > +   set_bit(TX_BUSY, >dma_chan_busy);
> > +
> > +   return ret;
> 
> ...
> 

> > -   if (!xfer->rx_buf)
> > -   return NULL;
> 
> This seems not related.

I moved it to the upper level for better maintainability of the methods.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH 3/8] spi: dw-dma: Configure the DMA channels in dma_setup

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 12:16:38PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 31, 2020 at 10:59:48AM +0300, Serge Semin wrote:
> > Mainly this is a preparation patch before adding one-by-one DMA SG entries
> > transmission. But logically the Tx and Rx DMA channels setup should be
> > performed in the dma_setup() callback anyway. So let's move the DMA slave
> > channels src/dst burst lengths, address and address width configuration to
> > the DMA setup stage. While at it make sure the return value of the
> > dmaengine_slave_config() method is checked. It has been unnecessary in
> > case the DW DMAC is utilized as a DMA engine, since its device_config()
> > callback always returns zero (though it might change in future). But since
> > DW APB SSI driver now supports any DMA back-end we must make sure the
> > DMA device configuration has been successful before proceeding with
> > further setups.
> 
> ...
> 

Part 1:
> > +   if (!xfer->rx_buf)
> > +   return NULL;
> 
> ...
> 

Part 2:
> > +   if (xfer->rx_buf) {
> 
> > +   }
> 
> This looks like a separate change to drop one of them and not hide in the 
> next patch.

Both of these changes are part of a single alteration introduced to detach two
methods from each other: dw_spi_dma_{config,prepare}_{rx,tx}(). Part 1 is a
statement which belongs to the dw_spi_dma_prepare_rx() method and is left there
after dw_spi_dma_config_rx() has been detached from it. Part 2 is a logical
part which must be present in dw_spi_dma_setup(), since we don't need to
configure the Rx DMA channel if rx_buf isn't specified.

Please read the commit log more carefully. I didn't introduce anything other
than the changes described there.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


[PATCH 2/8] spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified

2020-07-31 Thread Serge Semin
Since commit 46164fde6b78 ("spi: dw: Fix Rx-only DMA transfers") if DMA
interface is enabled, then Tx-buffer must be available in each SPI
transfer. It's required since in order to activate the incoming data
reception either DMA or CPU must be pushing data out to the SPI bus.
But the DW APB SSI DMA driver code is still left in a state as if the
Tx-buffer might be optional, which is no longer true. Let's fix it so that an
error is returned if no Tx-buffer is detected and DMA Tx is always enabled.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 440679fa0764..ec721af61663 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -263,9 +263,6 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
struct dma_slave_config txconf;
struct dma_async_tx_descriptor *txdesc;
 
-   if (!xfer->tx_buf)
-   return NULL;
-
	memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
txconf.dst_addr = dws->dma_addr;
@@ -384,17 +381,19 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   u16 imr = 0, dma_ctrl = 0;
+   u16 imr, dma_ctrl;
 
-   if (xfer->tx_buf)
-   dma_ctrl |= SPI_DMA_TDMAE;
+   if (!xfer->tx_buf)
+   return -EINVAL;
+
+   /* Set the DMA handshaking interface */
+   dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
dma_ctrl |= SPI_DMA_RDMAE;
dw_writel(dws, DW_SPI_DMACR, dma_ctrl);
 
/* Set the interrupt mask */
-   if (xfer->tx_buf)
-   imr |= SPI_INT_TXOI;
+   imr = SPI_INT_TXOI;
if (xfer->rx_buf)
imr |= SPI_INT_RXUI | SPI_INT_RXOI;
spi_umask_intr(dws, imr);
@@ -413,6 +412,8 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 
/* Prepare the TX dma transfer */
txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+   if (!txdesc)
+   return -EINVAL;
 
/* Prepare the RX dma transfer */
rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
@@ -424,17 +425,15 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
dma_async_issue_pending(dws->rxchan);
}
 
-   if (txdesc) {
-   set_bit(TX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(txdesc);
-   dma_async_issue_pending(dws->txchan);
-   }
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+   dmaengine_submit(txdesc);
+   dma_async_issue_pending(dws->txchan);
 
ret = dw_spi_dma_wait(dws, xfer);
if (ret)
return ret;
 
-   if (txdesc && dws->master->cur_msg->status == -EINPROGRESS) {
+   if (dws->master->cur_msg->status == -EINPROGRESS) {
ret = dw_spi_dma_wait_tx_done(dws, xfer);
if (ret)
return ret;
-- 
2.27.0



[PATCH 4/8] spi: dw-dma: Move DMA transfers submission to the channels prep methods

2020-07-31 Thread Serge Semin
Indeed we can freely move the dmaengine_submit() method invocation and the
Tx and Rx busy flag setting into the DMA Tx/Rx prepare methods. By doing
so first we implement another preparation before adding the one-by-one DMA
SG entries transmission, second we now have the dma_async_tx_descriptor
descriptor used locally only in the new DMA transfers submission methods,
which makes the code less complex with no passing around the DMA Tx
descriptors, third we make the generic transfer method more readable, where
now the functionality of submission, execution and wait procedures is
transparently split up instead of having a preparation, intermixed
submission/execution and wait procedures. While at it we also add the
dmaengine_submit() return value test. It has been unnecessary for
DW DMAC, but should be done to support the generic DMA interface.

Note since the DMA channels preparation methods are now responsible for
the DMA transactions submission, we also rename them to
dw_spi_dma_submit_{tx,rx}().

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 56 ++--
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 56496b659d62..366978086781 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -272,10 +272,11 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
	return dmaengine_slave_config(dws->txchan, &txconf);
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *txdesc;
+   dma_cookie_t cookie;
+   int ret;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -283,12 +284,17 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
DMA_MEM_TO_DEV,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
-   return NULL;
+   return -ENOMEM;
 
txdesc->callback = dw_spi_dma_tx_done;
txdesc->callback_param = dws;
 
-   return txdesc;
+   cookie = dmaengine_submit(txdesc);
+   ret = dma_submit_error(cookie);
+   if (!ret)
+   set_bit(TX_BUSY, &dws->dma_chan_busy);
+
+   return ret;
 }
 
 static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
@@ -365,13 +371,11 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
	return dmaengine_slave_config(dws->rxchan, &rxconf);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
 {
struct dma_async_tx_descriptor *rxdesc;
-
-   if (!xfer->rx_buf)
-   return NULL;
+   dma_cookie_t cookie;
+   int ret;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -379,12 +383,17 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
DMA_DEV_TO_MEM,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
-   return NULL;
+   return -ENOMEM;
 
rxdesc->callback = dw_spi_dma_rx_done;
rxdesc->callback_param = dws;
 
-   return rxdesc;
+   cookie = dmaengine_submit(rxdesc);
+   ret = dma_submit_error(cookie);
+   if (!ret)
+   set_bit(RX_BUSY, &dws->dma_chan_busy);
+
+   return ret;
 }
 
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -427,26 +436,23 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 {
-   struct dma_async_tx_descriptor *txdesc, *rxdesc;
int ret;
 
-   /* Prepare the TX dma transfer */
-   txdesc = dw_spi_dma_prepare_tx(dws, xfer);
-   if (!txdesc)
-   return -EINVAL;
+   /* Submit DMA Tx transfer */
+   ret = dw_spi_dma_submit_tx(dws, xfer);
+   if (ret)
+   return ret;
 
-   /* Prepare the RX dma transfer */
-   rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+   /* Submit DMA Rx transfer if required */
+   if (xfer->rx_buf) {
+   ret = dw_spi_dma_submit_rx(dws, xfer);
+   if (ret)
+   return ret;
 
-   /* rx must be started before tx due to spi instinct */
-   if (rxdesc) {
-   set_bit(RX_BUSY, &dws->dma_chan_busy);
-   dmaengine_submit(rxdesc);
+   /* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
}
 
-   set_bit(TX_BUSY, 

[PATCH 7/8] spi: dw-dma: Pass exact data to the DMA submit and wait methods

2020-07-31 Thread Serge Semin
In order to use the DMA submission and waiting methods in both generic
DMA-based SPI transfer and one-by-one DMA SG entries transmission
functions, we need to modify the dw_spi_dma_wait() and
dw_spi_dma_submit_tx()/dw_spi_dma_submit_rx() prototypes. So instead of
getting the SPI transfer object as the second argument, they must accept the
exact data structure instances they are intended to use. Those are the
current transfer length and the SPI bus frequency in the case of
dw_spi_dma_wait(), and the SG list together with the number of list entries
in the case of the DMA Tx/Rx submission methods.
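
As a side note, the timeout computed in dw_spi_dma_wait() stays the same:
twice the ideal transfer time plus a 200 ms margin. A standalone model of
that arithmetic (a sketch, not part of the patch):

#include <assert.h>

/* Model of the dw_spi_dma_wait() timeout: len in bytes, hz is the bus speed */
static unsigned long long dma_wait_ms(unsigned long long len, unsigned int hz)
{
	unsigned long long ms = len * 1000ULL * 8;	/* MSEC_PER_SEC * BITS_PER_BYTE */

	ms /= hz;		/* stands in for do_div(ms, speed) */
	ms += ms + 200;		/* i.e. 2 * ms + 200 */

	return ms;
}

int main(void)
{
	/* 64 KiB at 25 MHz takes ~20 ms, so the timeout is 240 ms */
	assert(dma_wait_ms(65536, 25000000) == 240);
	return 0;
}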

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 35 +--
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index f8508c0c7978..2b42b42b6cf2 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -189,12 +189,12 @@ static enum dma_slave_buswidth dw_spi_dma_convert_width(u8 n_bytes)
return DMA_SLAVE_BUSWIDTH_UNDEFINED;
 }
 
-static int dw_spi_dma_wait(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_wait(struct dw_spi *dws, unsigned int len, u32 speed)
 {
unsigned long long ms;
 
-   ms = xfer->len * MSEC_PER_SEC * BITS_PER_BYTE;
-   do_div(ms, xfer->effective_speed_hz);
+   ms = len * MSEC_PER_SEC * BITS_PER_BYTE;
+   do_div(ms, speed);
ms += ms + 200;
 
if (ms > UINT_MAX)
@@ -269,17 +269,16 @@ static int dw_spi_dma_config_tx(struct dw_spi *dws)
return dmaengine_slave_config(dws->txchan, );
 }
 
-static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_tx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *txdesc;
dma_cookie_t cookie;
int ret;
 
-   txdesc = dmaengine_prep_slave_sg(dws->txchan,
-   xfer->tx_sg.sgl,
-   xfer->tx_sg.nents,
-   DMA_MEM_TO_DEV,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   txdesc = dmaengine_prep_slave_sg(dws->txchan, sgl, nents,
+				    DMA_MEM_TO_DEV,
+				    DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!txdesc)
return -ENOMEM;
 
@@ -367,17 +366,16 @@ static int dw_spi_dma_config_rx(struct dw_spi *dws)
return dmaengine_slave_config(dws->rxchan, );
 }
 
-static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_submit_rx(struct dw_spi *dws, struct scatterlist *sgl,
+   unsigned int nents)
 {
struct dma_async_tx_descriptor *rxdesc;
dma_cookie_t cookie;
int ret;
 
-   rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
-   xfer->rx_sg.sgl,
-   xfer->rx_sg.nents,
-   DMA_DEV_TO_MEM,
-   DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+   rxdesc = dmaengine_prep_slave_sg(dws->rxchan, sgl, nents,
+				    DMA_DEV_TO_MEM,
+				    DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!rxdesc)
return -ENOMEM;
 
@@ -436,13 +434,14 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
int ret;
 
/* Submit DMA Tx transfer */
-   ret = dw_spi_dma_submit_tx(dws, xfer);
+   ret = dw_spi_dma_submit_tx(dws, xfer->tx_sg.sgl, xfer->tx_sg.nents);
if (ret)
goto err_clear_dmac;
 
/* Submit DMA Rx transfer if required */
if (xfer->rx_buf) {
-   ret = dw_spi_dma_submit_rx(dws, xfer);
+   ret = dw_spi_dma_submit_rx(dws, xfer->rx_sg.sgl,
+  xfer->rx_sg.nents);
if (ret)
goto err_clear_dmac;
 
@@ -452,7 +451,7 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer->len, xfer->effective_speed_hz);
 
 err_clear_dmac:
dw_writel(dws, DW_SPI_DMACR, 0);
-- 
2.27.0



[PATCH 6/8] spi: dw-dma: Move DMAC register cleanup to DMA transfer method

2020-07-31 Thread Serge Semin
DW APB SSI DMA driver doesn't use the native SPI core wait API since
commit bdbdf0f06337 ("spi: dw: Locally wait for the DMA transfers
completion"). Due to that the driver can now clear the DMAC register
in a single place synchronously with the DMA transactions completion
or failure. After that all the possible code paths are still covered:
1) DMA completion callbacks are executed when the corresponding DMA
transactions are finished. When they are, one of them will eventually wake
the SPI messages pump kernel thread and dw_spi_dma_transfer_all() method
will clean the DMAC register as implied by this patch.
2) dma_stop is called when the SPI core detects an error either returned
from the transfer_one() callback or set in the SPI message status field.
Both types of errors will be noticed by the dw_spi_dma_transfer_all()
method.
3) dma_exit is called when either SPI controller driver or the
corresponding device is removed. In any case the SPI core will first
flush the SPI messages pump kernel thread, so any pending or in-flight
SPI transfers will be finished before that.

Due to all of that let's simplify the DW APB SSI DMA driver a bit and
move the DMAC register cleanup to a single place in the
dw_spi_dma_transfer_all() method.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 32ef7913a73d..f8508c0c7978 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -153,8 +153,6 @@ static void dw_spi_dma_exit(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
dma_release_channel(dws->rxchan);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws)
@@ -253,7 +251,6 @@ static void dw_spi_dma_tx_done(void *arg)
	if (test_bit(RX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
	complete(&dws->dma_completion);
 }
 
@@ -352,7 +349,6 @@ static void dw_spi_dma_rx_done(void *arg)
	if (test_bit(TX_BUSY, &dws->dma_chan_busy))
return;
 
-   dw_writel(dws, DW_SPI_DMACR, 0);
	complete(&dws->dma_completion);
 }
 
@@ -442,13 +438,13 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
/* Submit DMA Tx transfer */
ret = dw_spi_dma_submit_tx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* Submit DMA Rx transfer if required */
if (xfer->rx_buf) {
ret = dw_spi_dma_submit_rx(dws, xfer);
if (ret)
-   return ret;
+   goto err_clear_dmac;
 
/* rx must be started before tx due to spi instinct */
dma_async_issue_pending(dws->rxchan);
@@ -456,7 +452,12 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
 
dma_async_issue_pending(dws->txchan);
 
-   return dw_spi_dma_wait(dws, xfer);
+   ret = dw_spi_dma_wait(dws, xfer);
+
+err_clear_dmac:
+   dw_writel(dws, DW_SPI_DMACR, 0);
+
+   return ret;
 }
 
 static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
@@ -489,8 +490,6 @@ static void dw_spi_dma_stop(struct dw_spi *dws)
dmaengine_terminate_sync(dws->rxchan);
clear_bit(RX_BUSY, >dma_chan_busy);
}
-
-   dw_writel(dws, DW_SPI_DMACR, 0);
 }
 
 static const struct dw_spi_dma_ops dw_spi_dma_mfld_ops = {
-- 
2.27.0



[PATCH 8/8] spi: dw-dma: Add one-by-one SG list entries transfer

2020-07-31 Thread Serge Semin
In case if at least one of the requested DMA engine channels doesn't
support the hardware accelerated SG list entries traverse, the DMA driver
will most likely work that around by performing the IRQ-based SG list
entries resubmission. That might and will cause a problem if the DMA Tx
channel is recharged and re-executed before the Rx DMA channel. Due to
non-deterministic IRQ-handler execution latency the DMA Tx channel will
start pushing data to the SPI bus before the Rx DMA channel is even
reinitialized with the next inbound SG list entry. By doing so the DMA
Tx channel will implicitly start filling the DW APB SSI Rx FIFO up, which,
while the DMA Rx channel is being recharged and re-executed, will eventually
overflow.

In order to solve the problem we have to feed the DMA engine with SG
list entries one-by-one. It shall keep the DW APB SSI Tx and Rx FIFOs
synchronized and prevent the Rx FIFO overflow. Since in general the SPI
tx_sg and rx_sg lists may have different numbers of entries of different
lengths (though the total lengths should match), we virtually split the
SG-lists into a set of DMA transfers, the length of each being the minimum
of the corresponding SG-entry lengths.

The solution described above is only executed if a full-duplex SPI
transfer is requested and the DMA engine hasn't provided channels with
hardware accelerated SG list traverse capability to handle both SG
lists at once.
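
The splitting rule itself is simple: walk both lists in lockstep, each round
covering the minimum of the two current entry remainders. A condensed
userspace sketch of the idea (not the driver code):

#include <stdio.h>

struct ent { unsigned int len; };

/* Sketch: virtually split two SG lists into min-length DMA rounds */
static void split(const struct ent *tx, int ntx, const struct ent *rx, int nrx)
{
	unsigned int tx_len = 0, rx_len = 0, len;
	int i = 0, j = 0;

	while (i < ntx && j < nrx) {
		if (!tx_len)
			tx_len = tx[i].len;
		if (!rx_len)
			rx_len = rx[j].len;

		len = tx_len < rx_len ? tx_len : rx_len;
		printf("DMA round: %u bytes\n", len);

		tx_len -= len;
		rx_len -= len;
		if (!tx_len)
			i++;
		if (!rx_len)
			j++;
	}
}

int main(void)
{
	const struct ent tx[] = { {8}, {8} }, rx[] = { {4}, {12} };

	split(tx, 2, rx, 2);	/* prints rounds of 4, 4 and 8 bytes */
	return 0;
}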

Signed-off-by: Serge Semin 
Suggested-by: Andy Shevchenko 
---
 drivers/spi/spi-dw-dma.c | 137 ++-
 drivers/spi/spi-dw.h |   1 +
 2 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 2b42b42b6cf2..b3eeef3d5cf7 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -73,6 +73,23 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
+static void dw_spi_dma_sg_burst_init(struct dw_spi *dws)
+{
+   struct dma_slave_caps tx = {0}, rx = {0};
+
+   dma_get_slave_caps(dws->txchan, &tx);
+   dma_get_slave_caps(dws->rxchan, &rx);
+
+   if (tx.max_sg_burst > 0 && rx.max_sg_burst > 0)
+   dws->dma_sg_burst = min(tx.max_sg_burst, rx.max_sg_burst);
+   else if (tx.max_sg_burst > 0)
+   dws->dma_sg_burst = tx.max_sg_burst;
+   else if (rx.max_sg_burst > 0)
+   dws->dma_sg_burst = rx.max_sg_burst;
+   else
+   dws->dma_sg_burst = 0;
+}
+
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 {
	struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx;
@@ -110,6 +127,8 @@ static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 
 free_rxchan:
@@ -139,6 +158,8 @@ static int dw_spi_dma_init_generic(struct device *dev, struct dw_spi *dws)
 
dw_spi_dma_maxburst_init(dws);
 
+   dw_spi_dma_sg_burst_init(dws);
+
return 0;
 }
 
@@ -459,11 +480,125 @@ static int dw_spi_dma_transfer_all(struct dw_spi *dws,
return ret;
 }
 
+static int dw_spi_dma_transfer_one(struct dw_spi *dws,
+  struct spi_transfer *xfer)
+{
+   struct scatterlist *tx_sg = NULL, *rx_sg = NULL, tx_tmp, rx_tmp;
+   unsigned int tx_len = 0, rx_len = 0;
+   unsigned int base, len;
+   int ret;
+
+   sg_init_table(&tx_tmp, 1);
+   sg_init_table(&rx_tmp, 1);
+
+   /*
+* In case if at least one of the requested DMA channels doesn't
+* support the hardware accelerated SG list entries traverse, the DMA
+* driver will most likely work that around by performing the IRQ-based
+* SG list entries resubmission. That might and will cause a problem
+* if the DMA Tx channel is recharged and re-executed before the Rx DMA
+* channel. Due to non-deterministic IRQ-handler execution latency the
+* DMA Tx channel will start pushing data to the SPI bus before the
+* Rx DMA channel is even reinitialized with the next inbound SG list
+* entry. By doing so the DMA Tx channel will implicitly start filling
+* the DW APB SSI Rx FIFO up, which, while the DMA Rx channel is being
+* recharged and re-executed, will eventually overflow.
+*
+* In order to solve the problem we have to feed the DMA engine with SG
+* list entries one-by-one. It shall keep the DW APB SSI Tx and Rx
+* FIFOs synchronized and prevent the Rx FIFO overflow. Since in
+* general the tx_sg and rx_sg lists may have different number of
+* entries of different lengths (though total length should match)
+* let's virtually split the SG-lists into a set of DMA transfers,
+* the length of each being the minimum of the corresponding SG-entry lengths.
+* An ASCII-sketch of the implemented algo is following:

[PATCH 5/8] spi: dw-dma: Detach DMA transfer into a dedicated method

2020-07-31 Thread Serge Semin
In order to add an alternative method of DMA-based SPI transfer first we
need to detach the currently available one from the common code. Here we
move the normal DMA-based SPI transfer execution functionality into a
dedicated method. It will be utilized if either the DMA engine supports
an unlimited number of SG entries or a Tx-only SPI transfer is requested. But
currently just use it for any SPI transfer.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index 366978086781..32ef7913a73d 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -434,7 +434,8 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
return 0;
 }
 
-static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_transfer_all(struct dw_spi *dws,
+  struct spi_transfer *xfer)
 {
int ret;
 
@@ -455,7 +456,14 @@ static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
 
dma_async_issue_pending(dws->txchan);
 
-   ret = dw_spi_dma_wait(dws, xfer);
+   return dw_spi_dma_wait(dws, xfer);
+}
+
+static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   int ret;
+
+   ret = dw_spi_dma_transfer_all(dws, xfer);
if (ret)
return ret;
 
-- 
2.27.0



[PATCH 1/8] spi: dw-dma: Set DMA Level registers on init

2020-07-31 Thread Serge Semin
Indeed the registers content doesn't get cleared when the SPI controller
is disabled and enabled. Max burst lengths aren't changed since the Rx and
Tx DMA channels are requested on init stage and are kept acquired until
the device is removed. Obviously SPI controller FIFO depth can't be
changed. Due to all of that we can safely move the DMA Transmit and
Receive data level registers initialization to the SPI controller DMA init
stage (when the SPI controller is being probed) instead of doing it for
each SPI transfer when dma_setup is called. This shall speed the DMA-based
SPI transfer initialization up a bit, particularly if the APB bus is
relatively slow.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index bb390ff67d1d..440679fa0764 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -49,6 +49,7 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
max_burst = RX_BURST_LEVEL;
 
dws->rxburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
 
	ret = dma_get_slave_caps(dws->txchan, &caps);
if (!ret && caps.max_burst)
@@ -56,7 +57,20 @@ static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
else
max_burst = TX_BURST_LEVEL;
 
+   /*
+* Having a Rx DMA channel serviced with higher priority than a Tx DMA
+* channel might not be enough to provide a well balanced DMA-based
+* SPI transfer interface. There might still be moments when the Tx DMA
+* channel is occasionally handled faster than the Rx DMA channel.
+* That in its turn will eventually cause the SPI Rx FIFO overflow if
+* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
+* cleared by the Rx DMA channel. In order to fix the problem the Tx
+* DMA activity is intentionally slowed down by limiting the SPI Tx
+* FIFO depth with a value twice bigger than the Tx burst length
+* calculated earlier by the dw_spi_dma_maxburst_init() method.
+*/
dws->txburst = min(max_burst, def_burst);
+   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
 }
 
 static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
@@ -372,21 +386,6 @@ static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr = 0, dma_ctrl = 0;
 
-   /*
-* Having a Rx DMA channel serviced with higher priority than a Tx DMA
-* channel might not be enough to provide a well balanced DMA-based
-* SPI transfer interface. There might still be moments when the Tx DMA
-* channel is occasionally handled faster than the Rx DMA channel.
-* That in its turn will eventually cause the SPI Rx FIFO overflow if
-* SPI bus speed is high enough to fill the SPI Rx FIFO in before it's
-* cleared by the Rx DMA channel. In order to fix the problem the Tx
-* DMA activity is intentionally slowed down by limiting the SPI Tx
-* FIFO depth with a value twice bigger than the Tx burst length
-* calculated earlier by the dw_spi_dma_maxburst_init() method.
-*/
-   dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
-   dw_writel(dws, DW_SPI_DMATDLR, dws->txburst);
-
if (xfer->tx_buf)
dma_ctrl |= SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



[PATCH 3/8] spi: dw-dma: Configure the DMA channels in dma_setup

2020-07-31 Thread Serge Semin
Mainly this is a preparation patch before adding one-by-one DMA SG entries
transmission. But logically the Tx and Rx DMA channels setup should be
performed in the dma_setup() callback anyway. So let's move the DMA slave
channels src/dst burst lengths, address and address width configuration to
the DMA setup stage. While at it make sure the return value of the
dmaengine_slave_config() method is checked. It has been unnecessary in
case the DW DMAC is utilized as a DMA engine, since its device_config()
callback always returns zero (though it might change in future). But since
DW APB SSI driver now supports any DMA back-end we must make sure the
DMA device configuration has been successful before proceeding with
further setups.

Signed-off-by: Serge Semin 
---
 drivers/spi/spi-dw-dma.c | 42 +---
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/drivers/spi/spi-dw-dma.c b/drivers/spi/spi-dw-dma.c
index ec721af61663..56496b659d62 100644
--- a/drivers/spi/spi-dw-dma.c
+++ b/drivers/spi/spi-dw-dma.c
@@ -257,11 +257,9 @@ static void dw_spi_dma_tx_done(void *arg)
	complete(&dws->dma_completion);
 }
 
-static struct dma_async_tx_descriptor *
-dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+static int dw_spi_dma_config_tx(struct dw_spi *dws)
 {
struct dma_slave_config txconf;
-   struct dma_async_tx_descriptor *txdesc;
 
	memset(&txconf, 0, sizeof(txconf));
txconf.direction = DMA_MEM_TO_DEV;
@@ -271,7 +269,13 @@ dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
txconf.dst_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
txconf.device_fc = false;
 
-   dmaengine_slave_config(dws->txchan, &txconf);
+   return dmaengine_slave_config(dws->txchan, &txconf);
+}
+
+static struct dma_async_tx_descriptor *
+dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *txdesc;
 
txdesc = dmaengine_prep_slave_sg(dws->txchan,
xfer->tx_sg.sgl,
@@ -346,14 +350,9 @@ static void dw_spi_dma_rx_done(void *arg)
	complete(&dws->dma_completion);
 }
 
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
-   struct spi_transfer *xfer)
+static int dw_spi_dma_config_rx(struct dw_spi *dws)
 {
struct dma_slave_config rxconf;
-   struct dma_async_tx_descriptor *rxdesc;
-
-   if (!xfer->rx_buf)
-   return NULL;
 
	memset(&rxconf, 0, sizeof(rxconf));
rxconf.direction = DMA_DEV_TO_MEM;
@@ -363,7 +362,16 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
rxconf.src_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
rxconf.device_fc = false;
 
-   dmaengine_slave_config(dws->rxchan, &rxconf);
+   return dmaengine_slave_config(dws->rxchan, &rxconf);
+}
+
+static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
+   struct spi_transfer *xfer)
+{
+   struct dma_async_tx_descriptor *rxdesc;
+
+   if (!xfer->rx_buf)
+   return NULL;
 
rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
xfer->rx_sg.sgl,
@@ -382,10 +390,22 @@ static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
 static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
 {
u16 imr, dma_ctrl;
+   int ret;
 
if (!xfer->tx_buf)
return -EINVAL;
 
+   /* Setup DMA channels */
+   ret = dw_spi_dma_config_tx(dws);
+   if (ret)
+   return ret;
+
+   if (xfer->rx_buf) {
+   ret = dw_spi_dma_config_rx(dws);
+   if (ret)
+   return ret;
+   }
+
/* Set the DMA handshaking interface */
dma_ctrl = SPI_DMA_TDMAE;
if (xfer->rx_buf)
-- 
2.27.0



[PATCH 0/8] spi: dw-dma: Add max SG entries burst capability support

2020-07-31 Thread Serge Semin
 SPI
transfer SG-lists in a single DMA transaction without software
intervention, and for the full-duplex SPI-transfers.

Note since the DMA-engine subsystem in kernel 5.8-rcX doesn't have the
max_sg_burst capability supported, this series is intended to be applied
only after the "next"-branch of the DMA-engine repository is merged into
the mainline repo. Alternatively the series could be merged in through the
DMA-engine repo.

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Georgy Vlasov 
Cc: Ramil Zaripov 
Cc: Pavel Parkhomenko 
Cc: Peter Ujfalusi 
Cc: Andy Shevchenko 
Cc: Andy Shevchenko 
Cc: Feng Tang 
Cc: Vinod Koul 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (8):
  spi: dw-dma: Set DMA Level registers on init
  spi: dw-dma: Fail DMA-based transfer if no Tx-buffer specified
  spi: dw-dma: Configure the DMA channels in dma_setup
  spi: dw-dma: Move DMA transfers submission to the channels prep
methods
  spi: dw-dma: Detach DMA transfer into a dedicated method
  spi: dw-dma: Move DMAC register cleanup to DMA transfer method
  spi: dw-dma: Pass exact data to the DMA submit and wait methods
  spi: dw-dma: Add one-by-one SG list entries transfer

 drivers/spi/spi-dw-dma.c | 309 ++-
 drivers/spi/spi-dw.h |   1 +
 2 files changed, 238 insertions(+), 72 deletions(-)

-- 
2.27.0



Re: clocksource: dw_apb_timer: commit 6d2e16a3181b broke Arria10 platform

2020-07-31 Thread Serge Semin
Hello Dinh,
There must be something wrong with your timer2 and timer3 declared in the Arria10
dts, because the patch didn't change anything for the first two timers (timer0 and
timer1). It just permits registering all DW APB Timers found in the dts.

If those timers are broken, then you should have disabled them in the dts in the
first place. If they are normal, then you need to investigate further why they
cause the kernel panic.

-Sergey

On Thu, Jul 30, 2020 at 09:30:55PM +, Nguyen, Dinh wrote:
> Hi Sergey,
> 
> Commit "6d2e16a3181b clocksource: dw_apb_timer_of: Fix missing clockevent 
> timers" broke the Arria10 platform. See the bootlog here:
> 
> [0.00] Booting Linux on physical CPU 0x0
> [0.00] Linux version 5.8.0-rc7-next-20200730 (dinguyen@linux-builds1) (arm-linux-gnueabihf-gcc (Linaro GCC 7.2-2017.11) 7.2.1 20171011, GNU ld (Linaro_Binutils-2017.11) 2.28.2.20170706) #18 SMP Thu Jul 30 16:25:46 CDT 2020
> [0.00] CPU: ARMv7 Processor [414fc091] revision 1 (ARMv7), cr=10c5387d
> [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> [0.00] OF: fdt: Machine model: Altera SOCFPGA Arria 10
> [0.00] earlycon: uart0 at MMIO32 0xffc02100 (options '115200n8')
> [0.00] printk: bootconsole [uart0] enabled
> [0.00] Memory policy: Data cache writealloc
> [0.00] Zone ranges:
> [0.00]   Normal   [mem 0x-0x2fff]
> [0.00]   HighMem  [mem 0x3000-0x3fff]
> [0.00] Movable zone start for each node
> [0.00] Early memory node ranges
> [0.00]   node   0: [mem 0x-0x3fff]
> [0.00] Initmem setup node 0 [mem 0x-0x3fff]
> [0.00] percpu: Embedded 19 pages/cpu s45132 r8192 d24500 u77824
> [0.00] Built 1 zonelists, mobility grouping on.  Total pages: 260608
> [0.00] Kernel command line: earlycon console=ttyS0,115200 root=/dev/nfs rw nfsroot=10.122.105.146:/home/dnguyen/rootfs_yocto,tcp ip=dhcp
> [0.00] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
> [0.00] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
> [0.00] mem auto-init: stack:off, heap alloc:off, heap free:off
> [0.00] Memory: 1027232K/1048576K available (8192K kernel code, 690K rwdata, 1792K rodata, 1024K init, 159K bss, 21344K reserved, 0K cma-reserved, 262144K highmem)
> [0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
> [0.00] ftrace: allocating 28185 entries in 56 pages
> [0.00] ftrace: allocated 56 pages with 3 groups
> [0.00] rcu: Hierarchical RCU implementation.
> [0.00] rcu: RCU event tracing is enabled.
> [0.00]  Rude variant of Tasks RCU enabled.
> [0.00] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
> [0.00] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> [0.00] L2C-310 erratum 769419 enabled
> [0.00] L2C-310 enabling early BRESP for Cortex-A9
> [0.00] L2C-310: enabling full line of zeros but not enabled in Cortex-A9
> [0.00] L2C-310 ID prefetch enabled, offset 1 lines
> [0.00] L2C-310 dynamic clock gating enabled, standby mode enabled
> [0.00] L2C-310 cache controller enabled, 8 ways, 512 kB
> [0.00] L2C-310: CACHE_ID 0x410030c9, AUX_CTRL 0x76560001
> [0.00] random: get_random_bytes called from start_kernel+0x388/0x528 with crng_init=0
> [0.00] clocksource: timer1: mask: 0x max_cycles: 0x, max_idle_ns: 38225208935 ns
> [0.04] sched_clock: 32 bits at 50MHz, resolution 20ns, wraps every 42949672950ns
> [0.007796] Switching to timer-based delay loop, resolution 20ns
> [0.013861] 8<--- cut here ---
> [0.016901] Unhandled fault: imprecise external abort (0x406) at 0x4b96623e
> [0.023828] pgd = (ptrval)
> [0.026520] [4b96623e] *pgd=
> [0.030084] Internal error: : 406 [#1] SMP ARM
> [0.034504] Modules linked in:
> [0.037548] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0-rc7-next-20200730 #18
> [0.044992] Hardware name: Altera SOCFPGA Arria10
> [0.049677] PC is at apbt_enable_int+0x24/0x48
> [0.054098] LR is at dw_apb_clockevent_register+0x3c/0x40
> [0.059468] pc : []lr : []psr: 60d3
> [0.065702] sp : c0c01ef8  ip : c0c01f10  fp : c0c01f0c
> [0.070908] r10: c0b44a3c  r9 :   r8 : c0a5ace0
> [0.076106] r7 : 0013  r6 : ef7f7640  r5 : ef06c0c0  r4 : a5f2106f
> [0.082608] r3 : f081c000  r2 : 012c  r1 : 20d3  r0 : ef06c0c0
> [0.089112] Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment none
> [0.096384] Control: 10c5387d  Table: 404a  DAC: 0051
> [0.102110] Process swapper/0 (pid: 0, stack 

Re: [PATCH] MIPS: Provide Kconfig option for default IEEE754 conformance mode

2020-07-31 Thread Serge Semin
On Fri, Jul 31, 2020 at 12:10:16PM +0800, Jiaxun Yang wrote:
> Requested by downstream distros, a Kconfig option for default
> IEEE754 conformance mode allows them to set their mode to
> relaxed by default.

That's what should have been here in the first place. Thanks!
Reviewed-by: Serge Semin 

> 
> Signed-off-by: Jiaxun Yang 
> ---
>  arch/mips/Kconfig| 21 +
>  arch/mips/kernel/cpu-probe.c | 12 +++-
>  2 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
> index f6bb446d30f0..ef5b2a177b1b 100644
> --- a/arch/mips/Kconfig
> +++ b/arch/mips/Kconfig
> @@ -2877,6 +2877,27 @@ config MIPS_NR_CPU_NR_MAP
>   default 1024 if MIPS_NR_CPU_NR_MAP_1024
>   default NR_CPUS if !MIPS_NR_CPU_NR_MAP_1024
>  
> +choice
> + prompt "Default IEEE Std 754 conformance mode"
> + default IEEE754_DEFAULT_STRICT
> + help
> +   Default IEEE Std 754 conformance mode, see ieee754= kernel parameter
> +   for detail.
> +
> + config IEEE754_DEFAULT_STRICT
> + bool "Strict"
> +
> + config IEEE754_DEFAULT_LEGACY
> + bool "Legacy"
> +
> + config IEEE754_DEFAULT_STD2008
> + bool "2008"
> +
> + config IEEE754_DEFAULT_RELAXED
> + bool "Relaxed"
> +
> +endchoice
> +
>  #
>  # Timer Interrupt Frequency Configuration
>  #
> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> index d9e8d39a7289..03adeed58efb 100644
> --- a/arch/mips/kernel/cpu-probe.c
> +++ b/arch/mips/kernel/cpu-probe.c
> @@ -157,7 +157,17 @@ static void cpu_set_fpu_2008(struct cpuinfo_mips *c)
>   * IEEE 754 conformance mode to use.  Affects the NaN encoding and the
>   * ABS.fmt/NEG.fmt execution mode.
>   */
> -static enum { STRICT, LEGACY, STD2008, RELAXED } ieee754 = STRICT;
> +enum ieee754_mode { STRICT, LEGACY, STD2008, RELAXED };
> +
> +#if defined(CONFIG_IEEE754_DEFAULT_STRICT)
> +static enum ieee754_mode ieee754 = STRICT;
> +#elif defined(CONFIG_IEEE754_DEFAULT_LEGACY)
> +static enum ieee754_mode ieee754 = LEGACY;
> +#elif defined(CONFIG_IEEE754_DEFAULT_STD2008)
> +static enum ieee754_mode ieee754 = STD2008;
> +#elif defined(CONFIG_IEEE754_DEFAULT_RELAXED)
> +static enum ieee754_mode ieee754 = RELAXED;
> +#endif
>  
>  /*
>   * Set the IEEE 754 NaN encodings and the ABS.fmt/NEG.fmt execution modes
> -- 
> 2.28.0
> 


Re: [PATCH 2/5] dmaengine: dw: Activate FIFO-mode for memory peripherals only

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 07:47:03PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 07:31:54PM +0300, Serge Semin wrote:
> > On Thu, Jul 30, 2020 at 07:24:28PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 30, 2020 at 06:45:42PM +0300, Serge Semin wrote:
> 
> ...
> 
> > > > Thanks to the commit  ("dmaengine: dw: Initialize channel
> 
> ...
> 
> > > > Note the DMA-engine repository 
> > > > git.infradead.org/users/vkoul/slave-dma.git
> > > > isn't accessible. So I couldn't find out the Andy' commit hash to use 
> > > > it in
> > > > the log.
> > 
> > > It's dmaengine.git on git.kernel.org.
> > 
> > Ah, thanks! I've just found out that the repo address has been changed. But 
> > I've
> > also scanned the "next" branch of the repo:
> > https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git
> > 
> > Your commit isn't there. Am I missing something?
> 

> It's a fix. It went to upstream branch (don't remember its name by heart in
> Vinod's repo).

Right. Found it. Thanks.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH 5/5] dmaengine: dw: Add DMA-channels mask cell support

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 07:41:46PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 06:45:45PM +0300, Serge Semin wrote:
> > DW DMA IP-core provides a way to synthesize the DMA controller with
> > channels having different parameters like maximum burst-length,
> > multi-block support, maximum data width, etc. Those parameters both
> > explicitly and implicitly affect the channels performance. Since DMA slave
> > devices might be very demanding of the DMA performance, let's provide a
> > way for the slaves to be assigned the DW DMA channels whose performance,
> > according to the platform engineer, fulfills their requirements.
> > After this patch is applied it can be done by passing the mask of suitable
> > DMA-channels either directly in the dw_dma_slave structure instance or as
> > a fifth cell of the DMA DT-property. If mask is zero or not provided, then
> > there is no limitation on the channels allocation.
> > 
> > For instance Baikal-T1 SoC is equipped with a DW DMAC engine, which first
> > two channels are synthesized with max burst length of 16, while the rest
> > of the channels have been created with max-burst-len=4. It would seem that
> > the first two channels must be faster than the others and should be more
> > preferable for the time-critical DMA slave devices. In practice it turned
> > out that the situation is quite the opposite. The channels with
> > max-burst-len=4 demonstrated a better performance than the channels with
> > max-burst-len=16 even when they both had been initialized with the same
> > settings. The performance drop of the first two DMA-channels made them
> > unsuitable for the DW APB SSI slave device. No matter what settings they
> > are configured with, full-duplex SPI transfers occasionally experience the
> > Rx FIFO overflow. It means that the DMA-engine doesn't keep up with
> > incoming data pace even though the SPI-bus is enabled with speed of 25MHz
> > while the DW DMA controller is clocked with a 50MHz signal. No such
> > problem has been noticed for the channels synthesized with
> > max-burst-len=4.
> 
> ...
> 

> > +   if (dws->channels && !(dws->channels & dwc->mask))
> 
> You can drop the first check if...

See below.

> 
> > +   return false;
> 
> ...
> 
> > +   if (dma_spec->args_count >= 4)
> > +   slave.channels = dma_spec->args[3];
> 
> ...you apply sane default here or somewhere else.

Alas, I can't, because the dw_dma_slave structure is used all over the kernel:
drivers/spi/spi-dw-dma.c
drivers/spi/spi-pxa2xx-pci.c
drivers/tty/serial/8250/8250_lpss.c

These devices aren't always placed on OF-based platforms. In that case the
corresponding DMA-channels won't be requested by means of the dw_dma_of_xlate()
method. So we have to preserve the default behavior if dws->channels is zero.

> 
> ...
> 
> > +   fls(slave.channels) > dw->pdata->nr_channels))
> 

> Does it really make sense?

It does, to prevent the clients from specifying an invalid channels mask, which
can't have bits set higher than the number of channels the engine supports.

> 
> I think it can also be simplified to faster op, i.e.
>   BIT(nr_channels) < slave.channels
> (but check for off-by-one errors)

Makes sense. Thanks. I'll replace it with the following statement:
slave.channels >= BIT(dw->pdata->nr_channels)
(e.g. with nr_channels == 8, BIT(8) == 0x100, so any mask up to 0xff is accepted).

> 
> ...
> 

> > + * @channels:  mask of the channels permitted for allocation (zero
> > + * value means any)
> 
> Perhaps on one line?

I don't really care. If you insist on that, I'll make it a single line, but it
will be over 80 columns. 85 to be exact.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH 4/5] dmaengine: dw: Ignore burst setting for memory peripherals

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 07:31:22PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 06:45:44PM +0300, Serge Semin wrote:
> > According to the DW DMA controller Databook (page 40 "3.5 Memory
> 

> Which version of it?

2.18b

> 
> > Peripherals") memory peripherals don't have handshaking interface
> > connected to the controller, therefore they can never be a flow
> > controller. Since the CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are
> > properties valid only for peripherals with a handshaking
> > interface, we can freely zero these fields out if the memory peripheral
> > is selected to be the source or the destination of the DMA transfers.
> > 
> > Note according to the databook, length of burst transfers to memory is
> > always equal to the number of data items available in a channel FIFO or
> > data items required to complete the block transfer, whichever is smaller;
> > length of burst transfers from memory is always equal to the space
> > available in a channel FIFO or number of data items required to complete
> > the block transfer, whichever is smaller.
> 

> But does it really matter if you program there something or not?

For memory peripherals it doesn't. But it's better to remove the redundant
settings for consistency with the doc and to get rid of the unneeded
stack-variable declaration.

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH 2/5] dmaengine: dw: Activate FIFO-mode for memory peripherals only

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 07:24:28PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 06:45:42PM +0300, Serge Semin wrote:
> > CFGx.FIFO_MODE field controls a DMA-controller "FIFO readiness" criterion.
> > In other words it determines when to start pushing data out of a DW
> > DMAC channel FIFO to a destination peripheral or from a source
> > peripheral to the DW DMAC channel FIFO. Currently FIFO-mode is set to one
> > for all DW DMAC channels. It means they are tuned to flush data out of
> > FIFO (to a memory peripheral or by accepting the burst transaction
> > requests) when FIFO is at least half-full (except at the end of the block
> > transfer, when FIFO-flush mode is activated) and are configured to get
> > data to the FIFO when it's at least half-empty.
> > 
> > Such configuration is a good choice when there is no slave device involved
> > in the DMA transfers. In that case the number of bursts per block is less
> > than when CFGx.FIFO_MODE = 0 and, hence, the bus utilization will improve.
> > But the latency of DMA transfers may increase when CFGx.FIFO_MODE = 1,
> > since DW DMAC will wait for the channel FIFO contents to be either
> > half-full or half-empty depending on having the destination or the source
> > transfers. Such latencies might be dangerous in case the DMA transfers
> > are expected to be performed from/to a slave device. Since normally
> > peripheral devices keep data in internal FIFOs, any latency at some
> > critical moment may cause one to overflow and consequently lose
> > data. This especially concerns a case when either a peripheral device is
> > relatively fast or the DW DMAC engine is relatively slow with respect to
> > the incoming data pace.
> > 
> > In order to solve the problems which might be caused by the latencies
> > described above, let's enable the FIFO half-full/half-empty "FIFO
> > readiness" criterion only for DMA transfers with no slave device involved.
> 
> > Thanks to the commit  ("dmaengine: dw: Initialize channel
> 
> See below.
> 
> > before each transfer") we can freely do that in the generic
> > dw_dma_initialize_chan() method.
> 
> Reviewed-by: Andy Shevchenko 
> Thanks!
> 
> > Signed-off-by: Serge Semin 
> > 
> > ---
> > 
> > Note the DMA-engine repository git.infradead.org/users/vkoul/slave-dma.git
> > isn't accessible. So I couldn't find out Andy's commit hash to use in
> > the log.
> 

> It's dmaengine.git on git.kernel.org.

Ah, thanks! I've just found out that the repo address has been changed. But I've
also scanned the "next" branch of the repo:
https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git

Your commit isn't there. Am I missing something?

-Sergey

> 
> > ---
> >  drivers/dma/dw/dw.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
> > index 7a085b3c1854..d9810980920a 100644
> > --- a/drivers/dma/dw/dw.c
> > +++ b/drivers/dma/dw/dw.c
> > @@ -14,7 +14,7 @@
> >  static void dw_dma_initialize_chan(struct dw_dma_chan *dwc)
> >  {
> > struct dw_dma *dw = to_dw_dma(dwc->chan.device);
> > -   u32 cfghi = DWC_CFGH_FIFO_MODE;
> > +   u32 cfghi = is_slave_direction(dwc->direction) ? 0 : DWC_CFGH_FIFO_MODE;
> > u32 cfglo = DWC_CFGL_CH_PRIOR(dwc->priority);
> > bool hs_polarity = dwc->dws.hs_polarity;
> >  
> > -- 
> > 2.27.0
> > 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


[PATCH 3/5] dmaengine: dw: Discard dlen from the dev-to-mem xfer width calculation

2020-07-30 Thread Serge Semin
Indeed in case of the DMA_DEV_TO_MEM DMA transfers it's enough to take the
destination memory address and the destination master data width into
account to calculate the CTLx.DST_TR_WIDTH setting of the memory
peripheral. According to the DW DMAC IP-core Databook (page 66, Example 5)
at the end of a DMA transfer, when the DMA-channel internal FIFO is left
with less data than needed for a single destination burst transaction, the
destination peripheral will enter the Single Transaction Region where the
DW DMA controller can complete a block transfer to the destination using
single transactions (non-burst transactions of CTLx.DST_TR_WIDTH bytes). If
there is not enough data in the DMA-channel internal FIFO for even a single
non-burst transaction of CTLx.DST_TR_WIDTH bytes, then the channel enters
"FIFO flush mode". That mode is activated to empty the FIFO and flush the
leftovers out to the memory peripheral. The flushing procedure is simple:
the data is sent to the memory by means of a set of single transactions of
CTLx.SRC_TR_WIDTH bytes. To sum up, it's redundant to use the LLP length
to find out the CTLx.DST_TR_WIDTH parameter value, since each DMA transfer
will be completed with CTLx.SRC_TR_WIDTH-byte transactions whenever
required.
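
To illustrate the calculation with concrete numbers (a sketch only, the
values are made up; __ffs() returns the index of the least significant set
bit):

#include <linux/bitops.h>

static void dst_tr_width_example(void)
{
	u32 data_width = 4;	/* destination master data width: 4 bytes */
	u32 mem = 0x38001002;	/* memory buffer only 2-byte aligned */
	u32 dlen = 7;		/* odd LLP entry length */

	u32 old_width = __ffs(data_width | mem | dlen);	/* 0 -> 8-bit accesses */
	u32 new_width = __ffs(data_width | mem);	/* 1 -> 16-bit accesses */

	/* the 1-byte tail is completed by the single/FIFO-flush transactions */
	(void)old_width;
	(void)new_width;
}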

In this commit we remove the LLP entry length from the statement which
calculates the memory peripheral DMA transaction width since it's
redundant due to the feature described above. By doing so we'll improve
the memory bus utilization and speed up the DMA-channel performance for
DMA_DEV_TO_MEM DMA-transfers.

Signed-off-by: Serge Semin 
---
 drivers/dma/dw/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 4700f2e87a62..3da0aea9fe25 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -723,7 +723,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl,
		lli_write(desc, sar, reg);
		lli_write(desc, dar, mem);
		lli_write(desc, ctlhi, ctlhi);
-		mem_width = __ffs(data_width | mem | dlen);
+		mem_width = __ffs(data_width | mem);
		lli_write(desc, ctllo, ctllo | DWC_CTLL_DST_WIDTH(mem_width));
desc->len = dlen;
 
-- 
2.27.0



[PATCH 2/5] dmaengine: dw: Activate FIFO-mode for memory peripherals only

2020-07-30 Thread Serge Semin
CFGx.FIFO_MODE field controls a DMA-controller "FIFO readiness" criterion.
In other words it determines when to start pushing data out of a DW
DMAC channel FIFO to a destination peripheral or from a source
peripheral to the DW DMAC channel FIFO. Currently FIFO-mode is set to one
for all DW DMAC channels. It means they are tuned to flush data out of
FIFO (to a memory peripheral or by accepting the burst transaction
requests) when FIFO is at least half-full (except at the end of the block
transfer, when FIFO-flush mode is activated) and are configured to get
data to the FIFO when it's at least half-empty.

Such configuration is a good choice when there is no slave device involved
in the DMA transfers. In that case the number of bursts per block is less
than when CFGx.FIFO_MODE = 0 and, hence, the bus utilization will improve.
But the latency of DMA transfers may increase when CFGx.FIFO_MODE = 1,
since DW DMAC will wait for the channel FIFO contents to be either
half-full or half-empty depending on having the destination or the source
transfers. Such latencies might be dangerous in case the DMA transfers
are expected to be performed from/to a slave device. Since normally
peripheral devices keep data in internal FIFOs, any latency at some
critical moment may cause one to overflow and consequently lose
data. This especially concerns a case when either a peripheral device is
relatively fast or the DW DMAC engine is relatively slow with respect to
the incoming data pace.
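
The criteria above can be restated as pseudo-logic (an illustration only:
the hardware evaluates this internally, the driver merely selects the
mode, and the exact half-FIFO rounding is an assumption here):

static bool dst_fifo_ready(u32 level, u32 depth, bool fifo_mode)
{
	/* enough data collected to be pushed out to the destination? */
	return fifo_mode ? level >= depth / 2 : level > 0;
}

static bool src_fifo_ready(u32 level, u32 depth, bool fifo_mode)
{
	/* enough room left to accept data from the source? */
	return fifo_mode ? depth - level >= depth / 2 : level < depth;
}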

In order to solve the problems which might be caused by the latencies
described above, let's enable the FIFO half-full/half-empty "FIFO
readiness" criterion only for DMA transfers with no slave device involved.
Thanks to the commit  ("dmaengine: dw: Initialize channel
before each transfer") we can freely do that in the generic
dw_dma_initialize_chan() method.

Signed-off-by: Serge Semin 

---

Note the DMA-engine repository git.infradead.org/users/vkoul/slave-dma.git
isn't accessible. So I couldn't find out Andy's commit hash to use in
the log.
---
 drivers/dma/dw/dw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index 7a085b3c1854..d9810980920a 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -14,7 +14,7 @@
 static void dw_dma_initialize_chan(struct dw_dma_chan *dwc)
 {
struct dw_dma *dw = to_dw_dma(dwc->chan.device);
-   u32 cfghi = DWC_CFGH_FIFO_MODE;
+   u32 cfghi = is_slave_direction(dwc->direction) ? 0 : DWC_CFGH_FIFO_MODE;
u32 cfglo = DWC_CFGL_CH_PRIOR(dwc->priority);
bool hs_polarity = dwc->dws.hs_polarity;
 
-- 
2.27.0



[PATCH 4/5] dmaengine: dw: Ignore burst setting for memory peripherals

2020-07-30 Thread Serge Semin
According to the DW DMA controller Databook (page 40 "3.5 Memory
Peripherals") memory peripherals don't have handshaking interface
connected to the controller, therefore they can never be a flow
controller. Since the CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are
properties valid only for peripherals with a handshaking
interface, we can freely zero these fields out if the memory peripheral
is selected to be the source or the destination of the DMA transfers.

Note according to the databook, the length of burst transfers to memory is
always equal to the number of data items available in a channel FIFO or
data items required to complete the block transfer, whichever is smaller;
the length of burst transfers from memory is always equal to the space
available in a channel FIFO or number of data items required to complete
the block transfer, whichever is smaller.
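
In other words the controller itself picks the burst length for a memory
endpoint, which is why CTLx.SRC_MSIZE/CTLx.DEST_MSIZE may be left zeroed.
Restated as pseudo-code (illustrative only, all values in data items):

#include <linux/kernel.h>

static u32 burst_len_to_mem(u32 fifo_data_items, u32 block_items_left)
{
	return min(fifo_data_items, block_items_left);
}

static u32 burst_len_from_mem(u32 fifo_free_items, u32 block_items_left)
{
	return min(fifo_free_items, block_items_left);
}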

Signed-off-by: Serge Semin 
---
 drivers/dma/dw/dw.c | 5 ++---
 drivers/dma/dw/idma32.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/dw/dw.c b/drivers/dma/dw/dw.c
index d9810980920a..a4862263ff14 100644
--- a/drivers/dma/dw/dw.c
+++ b/drivers/dma/dw/dw.c
@@ -67,9 +67,8 @@ static size_t dw_dma_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
 static u32 dw_dma_prepare_ctllo(struct dw_dma_chan *dwc)
 {
	struct dma_slave_config *sconfig = &dwc->dma_sconfig;
-	bool is_slave = is_slave_direction(dwc->direction);
-	u8 smsize = is_slave ? sconfig->src_maxburst : DW_DMA_MSIZE_16;
-	u8 dmsize = is_slave ? sconfig->dst_maxburst : DW_DMA_MSIZE_16;
+	u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
+	u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
u8 p_master = dwc->dws.p_master;
u8 m_master = dwc->dws.m_master;
u8 dms = (dwc->direction == DMA_MEM_TO_DEV) ? p_master : m_master;
diff --git a/drivers/dma/dw/idma32.c b/drivers/dma/dw/idma32.c
index f00657308811..3ce44de25d33 100644
--- a/drivers/dma/dw/idma32.c
+++ b/drivers/dma/dw/idma32.c
@@ -73,9 +73,8 @@ static size_t idma32_block2bytes(struct dw_dma_chan *dwc, u32 block, u32 width)
 static u32 idma32_prepare_ctllo(struct dw_dma_chan *dwc)
 {
	struct dma_slave_config *sconfig = &dwc->dma_sconfig;
-	bool is_slave = is_slave_direction(dwc->direction);
-	u8 smsize = is_slave ? sconfig->src_maxburst : IDMA32_MSIZE_8;
-	u8 dmsize = is_slave ? sconfig->dst_maxburst : IDMA32_MSIZE_8;
+	u8 smsize = (dwc->direction == DMA_DEV_TO_MEM) ? sconfig->src_maxburst : 0;
+	u8 dmsize = (dwc->direction == DMA_MEM_TO_DEV) ? sconfig->dst_maxburst : 0;
 
return DWC_CTLL_LLP_D_EN | DWC_CTLL_LLP_S_EN |
   DWC_CTLL_DST_MSIZE(dmsize) | DWC_CTLL_SRC_MSIZE(smsize);
-- 
2.27.0



[PATCH 5/5] dmaengine: dw: Add DMA-channels mask cell support

2020-07-30 Thread Serge Semin
DW DMA IP-core provides a way to synthesize the DMA controller with
channels having different parameters like maximum burst-length,
multi-block support, maximum data width, etc. Those parameters both
explicitly and implicitly affect the channel performance. Since DMA slave
devices might be very demanding of DMA performance, let's provide a way
for the slaves to be assigned only the DW DMA channels whose performance,
according to the platform engineer, fulfills their requirements. After this
patch is applied it can be done by passing the mask of suitable DMA-channels
either directly in the dw_dma_slave structure instance or as a fifth cell of
the DMA DT-property. If the mask is zero or not provided, then there is no
limitation on the channel allocation.
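
For example a client driver relying on the platform-data path might
restrict the allocation as follows (a sketch only: the request line IDs,
the masters and the channels mask below are made up, and dw_dma_filter()
is assumed to be visible to the caller, as it is for the in-kernel users):

#include <linux/bits.h>
#include <linux/dmaengine.h>
#include <linux/platform_data/dma-dw.h>

static struct dma_chan *request_limited_chan(struct device *dmac_dev)
{
	static struct dw_dma_slave dws = {
		.src_id = 1,
		.dst_id = 1,
		.m_master = 0,
		.p_master = 1,
		.channels = BIT(2) | BIT(3),	/* only channels 2 and 3 */
	};
	dma_cap_mask_t mask;

	dws.dma_dev = dmac_dev;	/* must match the DW DMAC device */

	dma_cap_zero(mask);
	dma_cap_set(DMA_SLAVE, mask);

	/* dw_dma_filter() will skip the channels outside of the mask */
	return dma_request_channel(mask, dw_dma_filter, &dws);
}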

For instance the Baikal-T1 SoC is equipped with a DW DMAC engine whose first
two channels are synthesized with a max burst length of 16, while the rest
of the channels have been created with max-burst-len=4. It would seem that
the first two channels must be faster than the others and should be more
preferable for the time-critical DMA slave devices. In practice it turned
out that the situation is quite the opposite. The channels with
max-burst-len=4 demonstrated a better performance than the channels with
max-burst-len=16 even when they both had been initialized with the same
settings. The performance drop of the first two DMA-channels made them
unsuitable for the DW APB SSI slave device. No matter what settings they
are configured with, full-duplex SPI transfers occasionally experience an
Rx FIFO overflow. It means that the DMA-engine doesn't keep up with the
incoming data pace even though the SPI-bus is enabled with a speed of 25MHz
while the DW DMA controller is clocked with a 50MHz signal. No such
problem has been noticed for the channels synthesized with
max-burst-len=4.

Signed-off-by: Serge Semin 
---
 drivers/dma/dw/core.c| 4 
 drivers/dma/dw/of.c  | 7 +--
 include/linux/platform_data/dma-dw.h | 3 +++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 3da0aea9fe25..5f7b9badb965 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -772,6 +772,10 @@ bool dw_dma_filter(struct dma_chan *chan, void *param)
if (dws->dma_dev != chan->device->dev)
return false;
 
+   /* permit channels in accordance with the channels mask */
+   if (dws->channels && !(dws->channels & dwc->mask))
+   return false;
+
/* We have to copy data since dws can be temporary storage */
	memcpy(&dwc->dws, dws, sizeof(struct dw_dma_slave));
 
diff --git a/drivers/dma/dw/of.c b/drivers/dma/dw/of.c
index 1474b3817ef4..abdf22b269b5 100644
--- a/drivers/dma/dw/of.c
+++ b/drivers/dma/dw/of.c
@@ -22,18 +22,21 @@ static struct dma_chan *dw_dma_of_xlate(struct of_phandle_args *dma_spec,
};
dma_cap_mask_t cap;
 
-   if (dma_spec->args_count != 3)
+   if (dma_spec->args_count < 3 || dma_spec->args_count > 4)
return NULL;
 
slave.src_id = dma_spec->args[0];
slave.dst_id = dma_spec->args[0];
slave.m_master = dma_spec->args[1];
slave.p_master = dma_spec->args[2];
+   if (dma_spec->args_count >= 4)
+   slave.channels = dma_spec->args[3];
 
if (WARN_ON(slave.src_id >= DW_DMA_MAX_NR_REQUESTS ||
slave.dst_id >= DW_DMA_MAX_NR_REQUESTS ||
slave.m_master >= dw->pdata->nr_masters ||
-   slave.p_master >= dw->pdata->nr_masters))
+   slave.p_master >= dw->pdata->nr_masters ||
+   fls(slave.channels) > dw->pdata->nr_channels))
return NULL;
 
dma_cap_zero(cap);
diff --git a/include/linux/platform_data/dma-dw.h b/include/linux/platform_data/dma-dw.h
index 4f681df85c27..3bc48451a70c 100644
--- a/include/linux/platform_data/dma-dw.h
+++ b/include/linux/platform_data/dma-dw.h
@@ -23,6 +23,8 @@
  * @dst_id:	dst request line
  * @m_master:	memory master for transfers on allocated channel
  * @p_master:	peripheral master for transfers on allocated channel
 + * @channels:	mask of the channels permitted for allocation (zero
 + *		value means any)
  * @hs_polarity: set active low polarity of handshake interface
  */
 struct dw_dma_slave {
@@ -31,6 +33,7 @@ struct dw_dma_slave {
u8  dst_id;
u8  m_master;
u8  p_master;
+   u8  channels;
	bool		hs_polarity;
 };
 
-- 
2.27.0



[PATCH 0/5] dmaengine: dw: Introduce non-mem peripherals optimizations

2020-07-30 Thread Serge Semin
After a lot of tests and thorough DW DMAC databook studying we've
discovered that the driver can be optimized especially when it comes to
working with non-memory peripherals.

First of all we've found out that since each DW DMAC channel can
be synthesized with different parameters, even when two of them
are configured to perform the same DMA transactions they may execute them
with different performance. Since some DMA client devices might be
sensitive to such important parameter as performance, then it is a good
idea to let them request only suitable DMA channels. In this patchset we
introduce a functionality, which makes it possible by passing the DMA
channels mask either over the "dmas" DT property or in the dw_dma_slave
platform data descriptor.

Secondly the FIFO-mode "FIFO readiness" criterion is more suitable for
pure memory DMA transfers, since it minimizes the system bus
utilization, but it causes some performance drop. When it comes to working
with non-memory peripherals the DMA engine performance comes first.
Since normally DMA client devices keep data in internal FIFOs, any
latency at some critical moment may cause a FIFO to overflow and
consequently lose data. So in order to minimize the chance of the DW DMAC
internal FIFO being a bottleneck during the DMA transfers to and from
non-memory peripherals we propose not to use FIFO-mode for them.

Thirdly it has been discovered that using the DMA transaction length is
redundant when calculating the destination transfer width for the
dev-to-mem DMA communications. Dropping it shall increase the performance of
the DMA transfers with unaligned data length.

Finally there is a small optimization in the burst length setting. In
particular we've found out that according to the DW DMAC databook it's
pointless to set a burst length for the memory peripherals since they don't
have a handshaking interface connected to the DMA controller. So we suggest
just ignoring the burst length config when it comes to setting the memory
peripherals up.

Note this patchset is supposed to be applied on top of the series:
Link: https://lore.kernel.org/dmaengine/20200723005848.31907-1-sergey.se...@baikalelectronics.ru

Signed-off-by: Serge Semin 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Peter Ujfalusi 
Cc: Andy Shevchenko 
Cc: Rob Herring 
Cc: dmaeng...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (5):
  dt-bindings: dma: dw: Add optional DMA-channels mask cell support
  dmaengine: dw: Activate FIFO-mode for memory peripherals only
  dmaengine: dw: Discard dlen from the dev-to-mem xfer width calculation
  dmaengine: dw: Ignore burst setting for memory peripherals
  dmaengine: dw: Add DMA-channels mask cell support

 .../devicetree/bindings/dma/snps,dma-spear1340.yaml| 7 +--
 drivers/dma/dw/core.c  | 6 +-
 drivers/dma/dw/dw.c| 7 +++
 drivers/dma/dw/idma32.c| 5 ++---
 drivers/dma/dw/of.c| 7 +--
 include/linux/platform_data/dma-dw.h   | 3 +++
 6 files changed, 23 insertions(+), 12 deletions(-)

-- 
2.27.0



[PATCH 1/5] dt-bindings: dma: dw: Add optional DMA-channels mask cell support

2020-07-30 Thread Serge Semin
Each DW DMA controller channel can be synthesized with different
parameters like maximum burst-length, multi-block support, maximum data
width, etc. Each of these parameters determines the DW DMAC channel
performance in its own aspect. On the other hand these parameters can
be implicitly responsible for a channel performance degradation
(for instance multi-block support is a very useful feature, but having
it disabled during the DW DMAC synthesis will produce a more optimized
core). Since DMA slave devices may critically depend on the DMA
engine performance, let's provide a way for the slave devices to have
their DMA-channels allocated from a pool of channels which, according
to the system engineer, fulfill their performance requirements.

The pool is determined by a mask optionally specified in the fifth
DMA-cell of the DMA DT-property. If the fifth cell is omitted from the
phandle arguments or the mask is zero, then the allocation will be
performed from a set of all channels provided by the DMA controller.

Signed-off-by: Serge Semin 
---
 .../devicetree/bindings/dma/snps,dma-spear1340.yaml| 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
index 20870f5c14dd..ef1d6879c158 100644
--- a/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
+++ b/Documentation/devicetree/bindings/dma/snps,dma-spear1340.yaml
@@ -18,12 +18,15 @@ properties:
 const: snps,dma-spear1340
 
   "#dma-cells":
-const: 3
+minimum: 3
+maximum: 4
 description: |
   First cell is a phandle pointing to the DMA controller. Second one is
   the DMA request line number. Third cell is the memory master identifier
   for transfers on dynamically allocated channel. Fourth cell is the
-  peripheral master identifier for transfers on an allocated channel.
+  peripheral master identifier for transfers on an allocated channel. Fifth
+  cell is an optional mask of the DMA channels permitted to be allocated
+  for the corresponding client device.
 
   reg:
 maxItems: 1
-- 
2.27.0



[PATCH v3 00/10] gpio: dwapb: Refactor GPIO resources initialization

2020-07-30 Thread Serge Semin
This series cleans up the DW APB GPIO device initialization procedure.
First of all it has been discovered that having a
vendor-specific "snps,nr-gpios" property is not only redundant but also
potentially dangerous (see the commit log for details). Instead we suggest
using the generic "ngpios" property to define the number of GPIOs each DW APB
GPIO controller port supports. Secondly, seeing a tendency of the other
GPIO drivers getting converted to using the GPIO-lib-based IRQ-chip
interface this series provides a patch, which replaces the DW APB GPIO
driver Generic IRQ-chip implementation with the GPIO-lib IRQ-chip one.
Finally the DW APB GPIO device probe procedure is simplified by
converting the code to be using the device managed resources for the
reference clocks initialization, reset control assertion/de-assertion
and GPIO-chip registration.

Some additional cleanups like replacing the number-of-GPIOs literal with a
corresponding macro and grouping the IRQ handlers in a single place of
the driver are also introduced in this patchset.

Link: https://lore.kernel.org/linux-gpio/20200723013858.10766-1-sergey.se...@baikalelectronics.ru/
Changelog v2:
- Replace gc->to_irq() with irq_find_mapping() method.
- Refactor dwapb_irq_set_type() method to directly set IRQ flow handler
  instead of using a temporary variable.
- Initialize GPIOlib IRQ-chip settings before calling request_irq()
  method.
- Add a notice regarding regression of commit 6a2f4b7dadd5 ("gpio:
  dwapb: use a second irq chip").
- Move the acpi_gpiochip_{request,free}_interrupts() methods invocation
  removal to a dedicated patch.
- Move GPIO-chip to_irq callback removal to a dedicated patch.
- Add a patch which replaces a max number of GPIO literals with a macro.
- Introduce dwapb_convert_irqs() method to convert the sparse parental
  IRQs array into an array of linearly distributed IRQs correctly
  perceived by GPIO-lib.

Link: https://lore.kernel.org/linux-gpio/20200730135536.19747-1-sergey.se...@baikalelectronics.ru
Changelog v3:
- Fix patches SoB's.
- Add Andy' Suggested-by to the patch: "gpio: dwapb: Add max GPIOs macro"
- Add blank lines to the IRQ-chip conversion commit message to make it
  more readable.
- Dynamically allocate memory for the IRQ-chip-related data.

Signed-off-by: Serge Semin 
Cc: Andy Shevchenko 
Cc: Andy Shevchenko 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Rob Herring 
Cc: linux-g...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (10):
  dt-bindings: gpio: dwapb: Add ngpios property support
  gpio: dwapb: Add ngpios DT-property support
  gpio: dwapb: Move MFD-specific IRQ handler
  gpio: dwapb: Add max GPIOs macro
  gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip
  gpio: dwapb: Discard GPIO-to-IRQ mapping function
  gpio: dwapb: Discard ACPI GPIO-chip IRQs request
  gpio: dwapb: Get reset control by means of resource managed interface
  gpio: dwapb: Get clocks by means of resource managed interface
  gpio: dwapb: Use resource managed GPIO-chip add data method

 .../bindings/gpio/snps,dw-apb-gpio.yaml   |   6 +
 drivers/gpio/Kconfig  |   2 +-
 drivers/gpio/gpio-dwapb.c | 352 +-
 include/linux/platform_data/gpio-dwapb.h  |   4 +-
 4 files changed, 191 insertions(+), 173 deletions(-)

-- 
2.27.0



[PATCH v3 08/10] gpio: dwapb: Get reset control by means of resource managed interface

2020-07-30 Thread Serge Semin
The reset control interface provides the resource managed version of
the reset_control_get() method. The only thing which also needs to be
automated is the reset lane assertion on the device removal. It can be
implemented by means of the custom action definition. After that the reset
control will be purely managed by the device resources interface.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 707e9087ca59..faac594287b3 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -621,6 +621,32 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
return pdata;
 }
 
+static void dwapb_assert_reset(void *data)
+{
+   struct dwapb_gpio *gpio = data;
+
+   reset_control_assert(gpio->rst);
+}
+
+static int dwapb_get_reset(struct dwapb_gpio *gpio)
+{
+   int err;
+
+   gpio->rst = devm_reset_control_get_optional_shared(gpio->dev, NULL);
+   if (IS_ERR(gpio->rst)) {
+   dev_err(gpio->dev, "Cannot get reset descriptor\n");
+   return PTR_ERR(gpio->rst);
+   }
+
+   err = reset_control_deassert(gpio->rst);
+   if (err) {
+   dev_err(gpio->dev, "Cannot deassert reset lane\n");
+   return err;
+   }
+
+   return devm_add_action_or_reset(gpio->dev, dwapb_assert_reset, gpio);
+}
+
 static const struct of_device_id dwapb_of_match[] = {
{ .compatible = "snps,dw-apb-gpio", .data = (void *)0},
{ .compatible = "apm,xgene-gpio-v2", .data = (void 
*)GPIO_REG_OFFSET_V2},
@@ -660,11 +686,9 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
	gpio->dev = &pdev->dev;
gpio->nr_ports = pdata->nports;
 
-   gpio->rst = devm_reset_control_get_optional_shared(dev, NULL);
-   if (IS_ERR(gpio->rst))
-   return PTR_ERR(gpio->rst);
-
-   reset_control_deassert(gpio->rst);
+   err = dwapb_get_reset(gpio);
+   if (err)
+   return err;
 
	gpio->ports = devm_kcalloc(&pdev->dev, gpio->nr_ports,
   sizeof(*gpio->ports), GFP_KERNEL);
@@ -714,7 +738,6 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
 
dwapb_gpio_unregister(gpio);
-   reset_control_assert(gpio->rst);
clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return 0;
-- 
2.27.0



[PATCH v3 04/10] gpio: dwapb: Add max GPIOs macro

2020-07-30 Thread Serge Semin
Add a new macro DWAPB_MAX_GPIOS which defines the maximum possible number
of GPIO lines corresponding to the maximum DW APB GPIO controller port
width. Use the new macro instead of number literal 32 where it's
applicable.

Suggested-by: Andy Shevchenko 
Signed-off-by: Serge Semin 
---
 drivers/gpio/gpio-dwapb.c| 8 
 include/linux/platform_data/gpio-dwapb.h | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 3081213247d8..f34001152850 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -162,7 +162,7 @@ static struct dwapb_gpio_port *dwapb_offs_to_port(struct dwapb_gpio *gpio, unsig
 
for (i = 0; i < gpio->nr_ports; i++) {
		port = &gpio->ports[i];
-   if (port->idx == offs / 32)
+   if (port->idx == offs / DWAPB_MAX_GPIOS)
return port;
}
 
@@ -182,7 +182,7 @@ static void dwapb_toggle_trigger(struct dwapb_gpio *gpio, unsigned int offs)
 
pol = dwapb_read(gpio, GPIO_INT_POLARITY);
/* Just read the current value right out of the data register */
-   val = gc->get(gc, offs % 32);
+   val = gc->get(gc, offs % DWAPB_MAX_GPIOS);
if (val)
pol &= ~BIT(offs);
else
@@ -197,7 +197,7 @@ static u32 dwapb_do_irq(struct dwapb_gpio *gpio)
irq_hw_number_t hwirq;
 
irq_status = dwapb_read(gpio, GPIO_INTSTATUS);
-	for_each_set_bit(hwirq, &irq_status, 32) {
+	for_each_set_bit(hwirq, &irq_status, DWAPB_MAX_GPIOS) {
int gpio_irq = irq_find_mapping(gpio->domain, hwirq);
u32 irq_type = irq_get_trigger_type(gpio_irq);
 
@@ -599,7 +599,7 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
dev_info(dev,
 "failed to get number of gpios for port%d\n",
 i);
-   pp->ngpio = 32;
+   pp->ngpio = DWAPB_MAX_GPIOS;
}
 
pp->irq_shared  = false;
diff --git a/include/linux/platform_data/gpio-dwapb.h b/include/linux/platform_data/gpio-dwapb.h
index ff1be737bad6..0aa5c6720259 100644
--- a/include/linux/platform_data/gpio-dwapb.h
+++ b/include/linux/platform_data/gpio-dwapb.h
@@ -6,12 +6,14 @@
 #ifndef GPIO_DW_APB_H
 #define GPIO_DW_APB_H
 
+#define DWAPB_MAX_GPIOS		32
+
 struct dwapb_port_property {
struct fwnode_handle *fwnode;
	unsigned int	idx;
	unsigned int	ngpio;
	unsigned int	gpio_base;
-	int		irq[32];
+	int		irq[DWAPB_MAX_GPIOS];
	bool		irq_shared;
 };
 
-- 
2.27.0



[PATCH v3 06/10] gpio: dwapb: Discard GPIO-to-IRQ mapping function

2020-07-30 Thread Serge Semin
Since the GPIOlib-based IRQ-chip interface is now utilized there is no need
to set up a custom GPIO-to-IRQ mapping method. GPIO-lib defines the
standard mapping method - gpiochip_to_irq() - which will be used anyway no
matter whether the custom to_irq callback is specified or not.
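
From a GPIO consumer point of view nothing changes; e.g. (illustrative
only, the connection ID is made up):

#include <linux/gpio/consumer.h>

static int example_gpio_to_irq(struct device *dev)
{
	struct gpio_desc *desc = gpiod_get(dev, "irq", GPIOD_IN);

	if (IS_ERR(desc))
		return PTR_ERR(desc);

	/* now resolved through the GPIO-lib default gpiochip_to_irq() */
	return gpiod_to_irq(desc);
}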

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 

---

Changelog v2:
- This is a new patch detached from commit
  "gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip".
---
 drivers/gpio/gpio-dwapb.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index eb3a6da453ff..b6affe7161b0 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -154,14 +154,6 @@ static inline void dwapb_write(struct dwapb_gpio *gpio, unsigned int offset,
gc->write_reg(reg_base + gpio_reg_convert(gpio, offset), val);
 }
 
-static int dwapb_gpio_to_irq(struct gpio_chip *gc, unsigned offset)
-{
-   struct dwapb_gpio_port *port = gpiochip_get_data(gc);
-   struct dwapb_gpio *gpio = port->gpio;
-
-   return irq_find_mapping(gpio->domain, offset);
-}
-
 static struct dwapb_gpio_port *dwapb_offs_to_port(struct dwapb_gpio *gpio, unsigned int offs)
 {
struct dwapb_gpio_port *port;
@@ -476,8 +468,6 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
 
	girq->chip = &pirq->irqchip;
 
-   port->gc.to_irq = dwapb_gpio_to_irq;
-
return;
 
 err_kfree_pirq:
-- 
2.27.0



[PATCH v3 01/10] dt-bindings: gpio: dwapb: Add ngpios property support

2020-07-30 Thread Serge Semin
It's redundant to have a vendor-specific property describing the number of
GPIOs while there is a generic one. Let's mark the former as
deprecated and define the "ngpios" property as supported with the constraint
of being within the [1; 32] range.

Signed-off-by: Serge Semin 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/gpio/snps,dw-apb-gpio.yaml  | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml b/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
index 1240f6289249..b391cc1b4590 100644
--- a/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
+++ b/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
@@ -61,8 +61,14 @@ patternProperties:
   '#gpio-cells':
 const: 2
 
+  ngpios:
+default: 32
+minimum: 1
+maximum: 32
+
   snps,nr-gpios:
 description: The number of GPIO pins exported by the port.
+deprecated: true
 $ref: /schemas/types.yaml#/definitions/uint32
 default: 32
 minimum: 1
-- 
2.27.0



[PATCH v3 09/10] gpio: dwapb: Get clocks by means of resource managed interface

2020-07-30 Thread Serge Semin
The kernel clock framework provides the resource managed version of
the clk_bulk_get() method. The only thing which also needs to be automated
is the clocks disable/unprepare procedure executed on the device removal.
It can be implemented by means of the custom action definition. After that
the clocks acquisition and release will be purely managed by the device
resources interface.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 48 ++-
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index faac594287b3..803e1d1f4b5a 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -647,6 +647,36 @@ static int dwapb_get_reset(struct dwapb_gpio *gpio)
return devm_add_action_or_reset(gpio->dev, dwapb_assert_reset, gpio);
 }
 
+static void dwapb_disable_clks(void *data)
+{
+   struct dwapb_gpio *gpio = data;
+
+   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
+}
+
+static int dwapb_get_clks(struct dwapb_gpio *gpio)
+{
+   int err;
+
+   /* Optional bus and debounce clocks */
+   gpio->clks[0].id = "bus";
+   gpio->clks[1].id = "db";
+	err = devm_clk_bulk_get_optional(gpio->dev, DWAPB_NR_CLOCKS,
+					 gpio->clks);
+   if (err) {
+   dev_err(gpio->dev, "Cannot get APB/Debounce clocks\n");
+   return err;
+   }
+
+   err = clk_bulk_prepare_enable(DWAPB_NR_CLOCKS, gpio->clks);
+   if (err) {
+   dev_err(gpio->dev, "Cannot enable APB/Debounce clocks\n");
+   return err;
+   }
+
+   return devm_add_action_or_reset(gpio->dev, dwapb_disable_clks, gpio);
+}
+
 static const struct of_device_id dwapb_of_match[] = {
{ .compatible = "snps,dw-apb-gpio", .data = (void *)0},
{ .compatible = "apm,xgene-gpio-v2", .data = (void 
*)GPIO_REG_OFFSET_V2},
@@ -699,21 +729,9 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
if (IS_ERR(gpio->regs))
return PTR_ERR(gpio->regs);
 
-   /* Optional bus and debounce clocks */
-   gpio->clks[0].id = "bus";
-   gpio->clks[1].id = "db";
-	err = devm_clk_bulk_get_optional(&pdev->dev, DWAPB_NR_CLOCKS,
-					 gpio->clks);
-   if (err) {
-   dev_err(>dev, "Cannot get APB/Debounce clocks\n");
-   return err;
-   }
-
-   err = clk_bulk_prepare_enable(DWAPB_NR_CLOCKS, gpio->clks);
-   if (err) {
-   dev_err(>dev, "Cannot enable APB/Debounce clocks\n");
+   err = dwapb_get_clks(gpio);
+   if (err)
return err;
-   }
 
gpio->flags = (uintptr_t)device_get_match_data(dev);
 
@@ -728,7 +746,6 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
 
 out_unregister:
dwapb_gpio_unregister(gpio);
-   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return err;
 }
@@ -738,7 +755,6 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
 
dwapb_gpio_unregister(gpio);
-   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return 0;
 }
-- 
2.27.0



[PATCH v3 07/10] gpio: dwapb: Discard ACPI GPIO-chip IRQs request

2020-07-30 Thread Serge Semin
Since the GPIOlib-based IRQ-chip interface is now utilized there is no need
to call the acpi_gpiochip_{request,free}_interrupts() methods here.
They will be called from gpiochip_add_irqchip()/gpiochip_irqchip_remove()
anyway.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 

---

Changelog v2:
- This is a new patch detached from commit
  "gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip".
---
 drivers/gpio/gpio-dwapb.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index b6affe7161b0..707e9087ca59 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -526,9 +526,6 @@ static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
return err;
}
 
-   /* Add GPIO-signaled ACPI event support */
-	acpi_gpiochip_request_interrupts(&port->gc);
-
port->is_registered = true;
 
return 0;
@@ -544,7 +541,6 @@ static void dwapb_gpio_unregister(struct dwapb_gpio *gpio)
if (!port->is_registered)
continue;
 
-		acpi_gpiochip_free_interrupts(&port->gc);
		gpiochip_remove(&port->gc);
}
 }
-- 
2.27.0



[PATCH v3 02/10] gpio: dwapb: Add ngpios DT-property support

2020-07-30 Thread Serge Semin
Indeed the generic GPIO DT-schema implies that the number of GPIOs should be
described by the "ngpios" property located under a GPIO-provider DT node.
In that case it's redundant to have a vendor-specific "snps,nr-gpios"
property describing the same setting. Moreover it might be error prone.

Since commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when adding
all top level devices") the fwnode parsing is resumed after the vast
majority of the platform devices are added. Implicitly that commit
activates re-parsing of all the device tree GPIO-phandle properties
detected as having "-gpio"/"-gpios" suffixes. Since the vendor-specific number
of GPIOs property is defined with the "-gpios" suffix, of_link_property()
will consider it as a suffix-property with "#gpio-cells" structure, which
obviously it doesn't have. As a result for two DW APB GPIO controllers
we'll have the next errors printed.

OF: /bus@1f059000/gpio@1f044000/gpio-port@0: could not find phandle
OF: /bus@1f059000/gpio@1f045000/gpio-port@0: could not get #gpio-cells for 
/opp-table
OF: /bus@1f059000/gpio@1f044000/gpio-port@0: could not find phandle
OF: /bus@1f059000/gpio@1f045000/gpio-port@0: could not get #gpio-cells for 
/opp-table

See, the kernel fwnode parsing procedure even tried to resolve the phandle
ID, which it thought was the opp-table DT-node, while in fact it was just
a number "32". What would happen if that magic number actually referred to
a GPIO DT-node with "#gpio-cells" property defined?..

In order to fix the problem let's mark the "snps,nr-gpios" property as
deprecated and add the generic "ngpios" property support with the same
purpose as the deprecated one. That and the errors log above shall
motivate the platform developer to convert the DW APB GPIO DT-nodes to
using the standard number of GPIOs property.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 1d8d55bd63aa..ccd80df16062 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -594,7 +594,8 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
return ERR_PTR(-EINVAL);
}
 
-   if (fwnode_property_read_u32(fwnode, "snps,nr-gpios", 
>ngpio)) {
+   if (fwnode_property_read_u32(fwnode, "ngpios", >ngpio) &&
+   fwnode_property_read_u32(fwnode, "snps,nr-gpios", 
>ngpio)) {
dev_info(dev,
 "failed to get number of gpios for port%d\n",
 i);
-- 
2.27.0



[PATCH v3 10/10] gpio: dwapb: Use resource managed GPIO-chip add data method

2020-07-30 Thread Serge Semin
Since the resource managed version of gpiochip_add_data() will handle the
GPIO-chip data automated cleanup we can freely remove the DW APB GPIO
driver code responsible for that. After doing so the DW APB GPIO driver
removal callback can be also fully discarded since there is nothing left
to be done for it. All the cleanups are now performed by means of the
device managed framework.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 37 ++---
 1 file changed, 2 insertions(+), 35 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 803e1d1f4b5a..a5b326754124 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -91,7 +91,6 @@ struct dwapb_gpio_port_irqchip {
 struct dwapb_gpio_port {
	struct gpio_chip	gc;
	struct dwapb_gpio_port_irqchip *pirq;
-	bool			is_registered;
	struct dwapb_gpio	*gpio;
 #ifdef CONFIG_PM_SLEEP
	struct dwapb_context	*ctx;
@@ -519,32 +518,16 @@ static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
if (pp->idx == 0)
dwapb_configure_irqs(gpio, port, pp);
 
-	err = gpiochip_add_data(&port->gc, port);
+	err = devm_gpiochip_add_data(gpio->dev, &port->gc, port);
if (err) {
dev_err(gpio->dev, "failed to register gpiochip for port%d\n",
port->idx);
return err;
}
 
-   port->is_registered = true;
-
return 0;
 }
 
-static void dwapb_gpio_unregister(struct dwapb_gpio *gpio)
-{
-   unsigned int m;
-
-   for (m = 0; m < gpio->nr_ports; ++m) {
-		struct dwapb_gpio_port *port = &gpio->ports[m];
-
-   if (!port->is_registered)
-   continue;
-
-		gpiochip_remove(&port->gc);
-   }
-}
-
 static void dwapb_get_irq(struct device *dev, struct fwnode_handle *fwnode,
  struct dwapb_port_property *pp)
 {
@@ -738,23 +721,8 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
for (i = 0; i < gpio->nr_ports; i++) {
		err = dwapb_gpio_add_port(gpio, &pdata->properties[i], i);
if (err)
-   goto out_unregister;
+   return err;
}
-   platform_set_drvdata(pdev, gpio);
-
-   return 0;
-
-out_unregister:
-   dwapb_gpio_unregister(gpio);
-
-   return err;
-}
-
-static int dwapb_gpio_remove(struct platform_device *pdev)
-{
-   struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
-
-   dwapb_gpio_unregister(gpio);
 
return 0;
 }
@@ -858,7 +826,6 @@ static struct platform_driver dwapb_gpio_driver = {
.acpi_match_table = dwapb_acpi_match,
},
.probe  = dwapb_gpio_probe,
-   .remove = dwapb_gpio_remove,
 };
 
 module_platform_driver(dwapb_gpio_driver);
-- 
2.27.0



[PATCH v3 03/10] gpio: dwapb: Move MFD-specific IRQ handler

2020-07-30 Thread Serge Semin
For better readability let's group all the IRQ handlers in a single place
of the driver instead of having them scattered all over the file.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index ccd80df16062..3081213247d8 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -220,6 +220,11 @@ static void dwapb_irq_handler(struct irq_desc *desc)
chained_irq_exit(chip, desc);
 }
 
+static irqreturn_t dwapb_irq_handler_mfd(int irq, void *dev_id)
+{
+   return IRQ_RETVAL(dwapb_do_irq(dev_id));
+}
+
 static void dwapb_irq_enable(struct irq_data *d)
 {
struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d);
@@ -349,11 +354,6 @@ static int dwapb_gpio_set_config(struct gpio_chip *gc, unsigned offset,
return dwapb_gpio_set_debounce(gc, offset, debounce);
 }
 
-static irqreturn_t dwapb_irq_handler_mfd(int irq, void *dev_id)
-{
-   return IRQ_RETVAL(dwapb_do_irq(dev_id));
-}
-
 static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
 struct dwapb_gpio_port *port,
 struct dwapb_port_property *pp)
-- 
2.27.0



[PATCH v3 05/10] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-30 Thread Serge Semin
GPIO-lib provides a ready-to-use interface to initialize an IRQ-chip on
top of a GPIO chip. It's better from a maintainability and readability
point of view to use it instead of supporting a hand-written Generic
IRQ-chip-based implementation. Moreover the new implementation won't
cause much functional overhead but will provide cleaner driver code.
All of that makes the DW APB GPIO driver conversion pretty much justified
especially seeing a tendency of the other GPIO drivers getting converted
too.

Here is what we do in the framework of this commit to convert the driver
to using the GPIO-lib-based IRQ-chip interface:

1) IRQ ack, mask and unmask callbacks are locally defined instead of
using the Generic IRQ-chip ones.

2) An irq_chip structure instance is embedded into the dwapb_gpio
private data. Note we can't have a static instance of that structure since
GPIO-lib will add some hooks into it by calling gpiochip_set_irq_hooks().
A warning about that would have been printed by the GPIO-lib code if we
used a single irq_chip structure instance for multiple DW APB GPIO
controllers.

3) Initialize the gpio_irq_chip structure embedded into the gpio_chip
descriptor. By default there is no IRQ enabled so any event raised will be
handled by the handle_bad_irq() IRQ flow handler. If DW APB GPIO IP-core
is synthesized to have non-shared reference IRQ-lines, then as before the
hierarchical and cascaded cases are distinguished by checking how many
parental IRQs are defined. (Note irq_set_chained_handler_and_data() won't
initialize IRQs whose descriptors couldn't be found.) If DW APB GPIO IP
is used on a platform with shared IRQ line, then we simply won't let the
GPIO-lib to initialize the parental IRQs, but will handle them locally in
the driver.

4) Discard linear IRQ-domain and Generic IRQ-chip initialization, since
GPIO-lib IRQ-chip interface will create a new domain and accept a standard
IRQ-chip structure pointer based on the setting we provided in the
gpio_irq_chip structure.

5) Manually select a proper IRQ flow handler directly in the
irq_set_type() callback by calling irq_set_handler_locked() method, since
an ordinary (not Generic) irq_chip descriptor is now utilized. Note this
shouldn't cause any regression.

6) Alter CONFIG_GPIO_DWAPB kernel config to select
CONFIG_GPIOLIB_IRQCHIP instead of CONFIG_GENERIC_IRQ_CHIP.

Note neither 4) nor 5) shall cause a regression of commit 6a2f4b7dadd5
("gpio: dwapb: use a second irq chip"), since the later isn't properly
used here anyway.
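
A condensed sketch of steps 2) and 3) (the field values follow the
description above; the helper itself is illustrative and relies on the
driver's local types, see the diff below for the real code):

static void dwapb_setup_girq_sketch(struct dwapb_gpio *gpio,
				    struct dwapb_gpio_port *port,
				    struct dwapb_gpio_port_irqchip *pirq,
				    struct dwapb_port_property *pp)
{
	struct gpio_irq_chip *girq = &port->gc.irq;

	girq->default_type = IRQ_TYPE_NONE;
	girq->handler = handle_bad_irq;		/* no IRQ enabled by default */
	girq->chip = &pirq->irqchip;		/* per-port dynamic instance */

	if (!pp->irq_shared) {
		girq->num_parents = pirq->nr_irqs;
		girq->parents = pirq->irq;
		girq->parent_handler = dwapb_irq_handler;
		girq->parent_handler_data = gpio;
	} else {
		/* shared IRQ line: the parent is requested locally instead */
		girq->num_parents = 0;
	}
}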

Signed-off-by: Serge Semin 

---

Note in this patch we omit the next GPIO-lib IRQ-chip settings
initialization:
gc->irq.map
gc->irq.init_valid_mask

That's intentionally done in order to keep the driver functionality at the
same level as before the conversion: each GPIO line will be registered as
the source of IRQs, no matter whether DW APB GPIO IP is configured to
generate combined or multiple IRQ signals.

If more thorough mapping is needed, the driver should be altered to detect
the IRQ signaling mode (for instance by checking a dedicated DT-property).
Then based on that it will be able to create a proper hardware GPIO-IRQ
to/from parental IRQs mapping and valid hardware GPIO-IRQs mask.

Changelog v2:
- Replace gc->to_irq() with irq_find_mapping() method.
- Refactor dwapb_irq_set_type() method to directly set IRQ flow handler
  instead of using a temporary variable.
- Initialize GPIOlib IRQ-chip settings before calling request_irq()
  method.
- Add a notice regarding regression of commit 6a2f4b7dadd5 ("gpio:
dwapb: use a second irq chip").
- Move the acpi_gpiochip_{request,free}_interrupts() methods invocation
  removal to a dedicated commit.
- Move GPIO-chip to_irq callback removal to a dedicated commit.
- Introduce dwapb_convert_irqs() method to convert the sparse parental
  IRQs array into an array of linearly distributed IRQs correctly
  perceived by GPIO-lib.

Changelog v3:
- Add blank lines to the commit message to make it more readable.
- Dynamically allocate memory for the IRQ-chip-related data.
---
 drivers/gpio/Kconfig  |   2 +-
 drivers/gpio/gpio-dwapb.c | 203 +-
 2 files changed, 111 insertions(+), 94 deletions(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index c6b5c65c8405..bcd3e541a52b 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -202,7 +202,7 @@ config GPIO_DAVINCI
 config GPIO_DWAPB
tristate "Synopsys DesignWare APB GPIO driver"
select GPIO_GENERIC
-   select GENERIC_IRQ_CHIP
+   select GPIOLIB_IRQCHIP
help
  Say Y or M here to build support for the Synopsys DesignWare APB
  GPIO block.
diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index f34001152850..eb3a6da453ff 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 

Re: [PATCH v2 05/10] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 05:26:18PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 04:55:31PM +0300, Serge Semin wrote:
> > GPIO-lib provides a ready-to-use interface to initialize an IRQ-chip on
> > top of a GPIO chip. It's better from maintainability and readability
> > point of view to use one instead of supporting a hand-written Generic
> > IRQ-chip-based implementation. Moreover the new implementation won't
> > cause much functional overhead but will provide a cleaner driver code.
> > All of that makes the DW APB GPIO driver conversion pretty much justified
> > especially seeing a tendency of the other GPIO drivers getting converted
> > too.
> > 
> > Here is what we do in the framework of this commit to convert the driver
> > to using the GPIO-lib-based IRQ-chip interface:
> > 1) IRQ ack, mask and unmask callbacks are locally defined instead of
> > using the Generic IRQ-chip ones.
> 

> Easy to read if you put blank lines in between of items.

Ok.

> 
> > 2) An irq_chip structure instance is embedded into the dwapb_gpio
> > private data. Note we can't have a static instance of that structure since
> > GPIO-lib will add some hooks into it by calling gpiochip_set_irq_hooks().
> > A warning about that would have been printed by the GPIO-lib code if we
> > used a single irq_chip structure instance for multiple DW APB GPIO
> > controllers.
> > 3) Initialize the gpio_irq_chip structure embedded into the gpio_chip
> > descriptor. By default there is no IRQ enabled so any event raised will be
> > handled by the handle_bad_irq() IRQ flow handler. If DW APB GPIO IP-core
> > is synthesized to have non-shared reference IRQ-lines, then as before the
> > hierarchical and cascaded cases are distinguished by checking how many
> > parental IRQs are defined. (Note irq_set_chained_handler_and_data() won't
> > initialize IRQs, which descriptors couldn't be found.) If DW APB GPIO IP
> > is used on a platform with shared IRQ line, then we simply won't let the
> > GPIO-lib to initialize the parental IRQs, but will handle them locally in
> > the driver.
> > 4) Discard linear IRQ-domain and Generic IRQ-chip initialization, since
> > GPIO-lib IRQ-chip interface will create a new domain and accept a standard
> > IRQ-chip structure pointer based on the setting we provided in the
> > gpio_irq_chip structure.
> > 5) Manually select a proper IRQ flow handler directly in the
> > irq_set_type() callback by calling irq_set_handler_locked() method, since
> > an ordinary (not Generic) irq_chip descriptor is now utilized. Note this
> > shouldn't cause any regression.
> > 6) Alter CONFIG_GPIO_DWAPB kernel config to select
> > CONFIG_GPIOLIB_IRQCHIP instead of CONFIG_GENERIC_IRQ_CHIP.
> > 
> > Note neither 4) nor 5) shall cause a regression of commit 6a2f4b7dadd5
> > ("gpio: dwapb: use a second irq chip"), since the later isn't properly
> > used here anyway.
> 
> ...
> 
> >  struct dwapb_gpio_port {
> > 	struct gpio_chip	gc;
> > +	unsigned int		nr_irqs;
> > +	unsigned int		irq[DWAPB_MAX_GPIOS];
> > +	struct irq_chip		irqchip;
> > 	bool			is_registered;
> > struct dwapb_gpio   *gpio;
> 

> Isn't it too much wasted memory (imagine 4 port controller)?
> 
> What if we have it in another structure and allocate dynamically?
> 
> struct dwapb_gpio_port_irqchip {
>   struct irq_chip irqchip;
>   unsigned int	nr_irqs;
>   unsigned int	irq[DWAPB_MAX_GPIOS];
> };

Agree. I have to send a new revision of the series anyway. I'll do that shortly.

-Sergey

> 
>   ...
>   struct dwapb_gpio_port_irqchip *pirq;
>   ...
> 
> (I agree that IRQ chip is rather property of a port than controller)
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH v2 04/10] gpio: dwapb: Add max GPIOs macro

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 05:05:26PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 04:55:30PM +0300, Serge Semin wrote:
> > Add a new macro DWAPB_MAX_GPIOS which defines the maximum possible number
> > of GPIO lines corresponding to the maximum DW APB GPIO controller port
> > width. Use the new macro instead of number literal 32 where it's
> > applicable.
> 

> Since it's a modified version of what I sent earlier, perhaps Suggested-by?

Oh, found it
[PATCH v1 6/6] gpio: dwapb: Define magic number for IRQ and GPIO lines

Sorry for missing your Suggested-by tag. I'll add it if a new revision is
required. Otherwise, Linus, could you add it when you apply the series?

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


Re: [PATCH v2 00/10] gpio: dwapb: Refactor GPIO resources initialization

2020-07-30 Thread Serge Semin
Wow, I've confused my SOB tag here.

Linus, if no additional patchset revision is required, could you please
replace it with:
Signed-off-by: Serge Semin 
?

Alternatively I could resend the series with correct version of the tag.

-Sergey

On Thu, Jul 30, 2020 at 04:55:26PM +0300, Serge Semin wrote:
> This series is about the DW APB GPIO device initialization procedure
> cleaning up. First of all it has been discovered that having a
> vendor-specific "snps,nr-gpios" property isn't only redundant but also
> might be dangerous (see the commit log for details). Instead we suggest to
> use the generic "ngpios" property to define a number of GPIOs each DW APB
> GPIO controller port supports. Secondly seeing a tendency of the other
> GPIO drivers getting converted to using the GPIO-lib-based IRQ-chip
> interface this series provides a patch, which replaces the DW APB GPIO
> driver Generic IRQ-chip implementation with the GPIO-lib IRQ-chip one.
> Finally the DW APB GPIO device probe procedure is simplified by
> converting the code to be using the device managed resources for the
> reference clocks initialization, reset control assertion/de-assertion
> and GPIO-chip registration.
> 
> Some additional cleanups like replacing the number-of-GPIOs literal with a
> corresponding macro and grouping the IRQ handlers in a single place of
> the driver are also introduced in this patchset.
> 
> Link: 
> https://lore.kernel.org/linux-gpio/20200723013858.10766-1-sergey.se...@baikalelectronics.ru/
> Changelog v2:
> - Replace gc->to_irq() with irq_find_mapping() method.
> - Refactor dwapb_irq_set_type() method to directly set IRQ flow handler
>   instead of using a temporary variable.
> - Initialize GPIOlib IRQ-chip settings before calling request_irq()
>   method.
> - Add a notice regarding regression of commit 6a2f4b7dadd5 ("gpio:
>   dwapb: use a second irq chip").
> - Move the acpi_gpiochip_{request,free}_interrupts() methods invocation
>   removal to a dedicated patch.
> - Move GPIO-chip to_irq callback removal to a dedicated patch.
> - Add a patch which replaces a max number of GPIO literals with a macro.
> - Introduce dwapb_convert_irqs() method to convert the sparse parental
>   IRQs array into an array of linearly distributed IRQs correctly
>   perceived by GPIO-lib.
> 
> Signed-off-by: Serge Semin 
> Cc: Andy Shevchenko 
> Cc: Andy Shevchenko 
> Cc: Alexey Malahov 
> Cc: Pavel Parkhomenko 
> Cc: Rob Herring 
> Cc: linux-g...@vger.kernel.org
> Cc: devicet...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Serge Semin (10):
>   dt-bindings: gpio: dwapb: Add ngpios property support
>   gpio: dwapb: Add ngpios DT-property support
>   gpio: dwapb: Move MFD-specific IRQ handler
>   gpio: dwapb: Add max GPIOs macro
>   gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip
>   gpio: dwapb: Discard GPIO-to-IRQ mapping function
>   gpio: dwapb: Discard ACPI GPIO-chip IRQs request
>   gpio: dwapb: Get reset control by means of resource managed interface
>   gpio: dwapb: Get clocks by means of resource managed interface
>   gpio: dwapb: Use resource managed GPIO-chip add data method
> 
>  .../bindings/gpio/snps,dw-apb-gpio.yaml   |   6 +
>  drivers/gpio/Kconfig  |   2 +-
>  drivers/gpio/gpio-dwapb.c | 340 +-
>  include/linux/platform_data/gpio-dwapb.h  |   4 +-
>  4 files changed, 178 insertions(+), 174 deletions(-)
> 
> -- 
> 2.27.0
> 


Re: [PATCH v2 04/10] gpio: dwapb: Add max GPIOs macro

2020-07-30 Thread Serge Semin
On Thu, Jul 30, 2020 at 05:05:26PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 30, 2020 at 04:55:30PM +0300, Serge Semin wrote:
> > Add a new macro DWAPB_MAX_GPIOS which defines the maximum possible number
> > of GPIO lines corresponding to the maximum DW APB GPIO controller port
> > width. Use the new macro instead of number literal 32 where it's
> > applicable.
> 

> Since it's a modified version of what I sent earlier, perhaps Suggested-by?

Could you point me to the message with that change? I must have missed that...(

-Sergey

> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 


[PATCH v2 03/10] gpio: dwapb: Move MFD-specific IRQ handler

2020-07-30 Thread Serge Semin
For better readability let's group all the IRQ handlers in a single place
of the driver instead of having them scattered all over the file.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index ccd80df16062..3081213247d8 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -220,6 +220,11 @@ static void dwapb_irq_handler(struct irq_desc *desc)
chained_irq_exit(chip, desc);
 }
 
+static irqreturn_t dwapb_irq_handler_mfd(int irq, void *dev_id)
+{
+   return IRQ_RETVAL(dwapb_do_irq(dev_id));
+}
+
 static void dwapb_irq_enable(struct irq_data *d)
 {
struct irq_chip_generic *igc = irq_data_get_irq_chip_data(d);
@@ -349,11 +354,6 @@ static int dwapb_gpio_set_config(struct gpio_chip *gc, unsigned offset,
return dwapb_gpio_set_debounce(gc, offset, debounce);
 }
 
-static irqreturn_t dwapb_irq_handler_mfd(int irq, void *dev_id)
-{
-   return IRQ_RETVAL(dwapb_do_irq(dev_id));
-}
-
 static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
 struct dwapb_gpio_port *port,
 struct dwapb_port_property *pp)
-- 
2.27.0



[PATCH v2 07/10] gpio: dwapb: Discard ACPI GPIO-chip IRQs request

2020-07-30 Thread Serge Semin
Since the GPIO-lib-based IRQ-chip interface is now utilized, there is no need
to call the acpi_gpiochip_{request,free}_interrupts() methods here. They
will be called from gpiochip_add_irqchip()/gpiochip_irqchip_remove()
anyway.

Signed-off-by: Serge Semin 

---

Changelog v2:
- This is a new patch detached from commit
  "gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip".
---
 drivers/gpio/gpio-dwapb.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index f7acc5abbf5c..226d9c2d9493 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -512,9 +512,6 @@ static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
return err;
}
 
-   /* Add GPIO-signaled ACPI event support */
-   acpi_gpiochip_request_interrupts(&port->gc);
-
port->is_registered = true;
 
return 0;
@@ -530,7 +527,6 @@ static void dwapb_gpio_unregister(struct dwapb_gpio *gpio)
if (!port->is_registered)
continue;
 
-   acpi_gpiochip_free_interrupts(&port->gc);
gpiochip_remove(&port->gc);
}
 }
-- 
2.27.0



[PATCH v2 06/10] gpio: dwapb: Discard GPIO-to-IRQ mapping function

2020-07-30 Thread Serge Semin
Since the GPIO-lib-based IRQ-chip interface is now utilized, there is no need
to set up a custom GPIO-to-IRQ mapping method. GPIO-lib defines the
standard mapping method, gpiochip_to_irq(), which will be used anyway no
matter whether the custom to_irq callback is specified or not.
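
Roughly speaking, the standard method just resolves an IRQ from the
GPIO-lib-managed domain, something like the following sketch (not a
verbatim copy of the GPIO-lib code):

static int gpiochip_to_irq(struct gpio_chip *gc, unsigned int offset)
{
	/* The mapping is created on demand in the GPIO-lib IRQ domain */
	return irq_create_mapping(gc->irq.domain, offset);
}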

Signed-off-by: Serge Semin 

---

Changelog v2:
- This is a new patch detached from commit
  "gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip".
---
 drivers/gpio/gpio-dwapb.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 327333fbc750..f7acc5abbf5c 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -150,14 +150,6 @@ static inline void dwapb_write(struct dwapb_gpio *gpio, 
unsigned int offset,
gc->write_reg(reg_base + gpio_reg_convert(gpio, offset), val);
 }
 
-static int dwapb_gpio_to_irq(struct gpio_chip *gc, unsigned offset)
-{
-   struct dwapb_gpio_port *port = gpiochip_get_data(gc);
-   struct dwapb_gpio *gpio = port->gpio;
-
-   return irq_find_mapping(gpio->domain, offset);
-}
-
 static struct dwapb_gpio_port *dwapb_offs_to_port(struct dwapb_gpio *gpio, unsigned int offs)
 {
struct dwapb_gpio_port *port;
@@ -466,8 +458,6 @@ static void dwapb_configure_irqs(struct dwapb_gpio *gpio,
}
 
girq->chip = &port->irqchip;
-
-   port->gc.to_irq = dwapb_gpio_to_irq;
 }
 
 static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
-- 
2.27.0



[PATCH v2 04/10] gpio: dwapb: Add max GPIOs macro

2020-07-30 Thread Serge Semin
Add a new macro DWAPB_MAX_GPIOS which defines the maximum possible number
of GPIO lines corresponding to the maximum DW APB GPIO controller port
width. Use the new macro instead of the numeric literal 32 where
applicable.

Signed-off-by: Serge Semin 
---
 drivers/gpio/gpio-dwapb.c| 8 
 include/linux/platform_data/gpio-dwapb.h | 4 +++-
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 3081213247d8..f34001152850 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -162,7 +162,7 @@ static struct dwapb_gpio_port *dwapb_offs_to_port(struct dwapb_gpio *gpio, unsigned int offs)
 
for (i = 0; i < gpio->nr_ports; i++) {
port = &gpio->ports[i];
-   if (port->idx == offs / 32)
+   if (port->idx == offs / DWAPB_MAX_GPIOS)
return port;
}
 
@@ -182,7 +182,7 @@ static void dwapb_toggle_trigger(struct dwapb_gpio *gpio, unsigned int offs)
 
pol = dwapb_read(gpio, GPIO_INT_POLARITY);
/* Just read the current value right out of the data register */
-   val = gc->get(gc, offs % 32);
+   val = gc->get(gc, offs % DWAPB_MAX_GPIOS);
if (val)
pol &= ~BIT(offs);
else
@@ -197,7 +197,7 @@ static u32 dwapb_do_irq(struct dwapb_gpio *gpio)
irq_hw_number_t hwirq;
 
irq_status = dwapb_read(gpio, GPIO_INTSTATUS);
-   for_each_set_bit(hwirq, &irq_status, 32) {
+   for_each_set_bit(hwirq, &irq_status, DWAPB_MAX_GPIOS) {
int gpio_irq = irq_find_mapping(gpio->domain, hwirq);
u32 irq_type = irq_get_trigger_type(gpio_irq);
 
@@ -599,7 +599,7 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
dev_info(dev,
 "failed to get number of gpios for port%d\n",
 i);
-   pp->ngpio = 32;
+   pp->ngpio = DWAPB_MAX_GPIOS;
}
 
pp->irq_shared  = false;
diff --git a/include/linux/platform_data/gpio-dwapb.h b/include/linux/platform_data/gpio-dwapb.h
index ff1be737bad6..0aa5c6720259 100644
--- a/include/linux/platform_data/gpio-dwapb.h
+++ b/include/linux/platform_data/gpio-dwapb.h
@@ -6,12 +6,14 @@
 #ifndef GPIO_DW_APB_H
 #define GPIO_DW_APB_H
 
+#define DWAPB_MAX_GPIOS32
+
 struct dwapb_port_property {
struct fwnode_handle *fwnode;
unsigned intidx;
unsigned intngpio;
unsigned intgpio_base;
-   int irq[32];
+   int irq[DWAPB_MAX_GPIOS];
boolirq_shared;
 };
 
-- 
2.27.0



[PATCH v2 01/10] dt-bindings: gpio: dwapb: Add ngpios property support

2020-07-30 Thread Serge Semin
It's redundant to have a vendor-specific property describing the number of
GPIOs while there is a generic one. Let's mark the former as deprecated
and define the "ngpios" property as supported with the constraint of
being within the [1; 32] range.
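
For illustration, a hypothetical port node using the generic property
might look as follows (the node label and the "ngpios" value are made
up; only the property usage is the point here):

    porta: gpio-port@0 {
            compatible = "snps,dw-apb-gpio-port";
            reg = <0>;
            gpio-controller;
            #gpio-cells = <2>;
            ngpios = <8>;
    };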

Signed-off-by: Serge Semin 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/gpio/snps,dw-apb-gpio.yaml  | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml b/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
index 1240f6289249..b391cc1b4590 100644
--- a/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
+++ b/Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
@@ -61,8 +61,14 @@ patternProperties:
   '#gpio-cells':
 const: 2
 
+  ngpios:
+default: 32
+minimum: 1
+maximum: 32
+
   snps,nr-gpios:
 description: The number of GPIO pins exported by the port.
+deprecated: true
 $ref: /schemas/types.yaml#/definitions/uint32
 default: 32
 minimum: 1
-- 
2.27.0



[PATCH v2 10/10] gpio: dwapb: Use resource managed GPIO-chip add data method

2020-07-30 Thread Serge Semin
Since the resource managed version of gpiochip_add_data() will handle the
automated cleanup of the GPIO-chip data, we can freely remove the DW APB GPIO
driver code responsible for that. After doing so the DW APB GPIO driver
removal callback can also be fully discarded since there is nothing left
for it to do. All the cleanups are now performed by means of the
device managed framework.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 37 ++---
 1 file changed, 2 insertions(+), 35 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 9b3065812de2..d3d555b3d492 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -87,7 +87,6 @@ struct dwapb_gpio_port {
unsigned intnr_irqs;
unsigned intirq[DWAPB_MAX_GPIOS];
struct irq_chip irqchip;
-   boolis_registered;
struct dwapb_gpio   *gpio;
 #ifdef CONFIG_PM_SLEEP
struct dwapb_context*ctx;
@@ -505,32 +504,16 @@ static int dwapb_gpio_add_port(struct dwapb_gpio *gpio,
if (pp->idx == 0)
dwapb_configure_irqs(gpio, port, pp);
 
-   err = gpiochip_add_data(&port->gc, port);
+   err = devm_gpiochip_add_data(gpio->dev, &port->gc, port);
if (err) {
dev_err(gpio->dev, "failed to register gpiochip for port%d\n",
port->idx);
return err;
}
 
-   port->is_registered = true;
-
return 0;
 }
 
-static void dwapb_gpio_unregister(struct dwapb_gpio *gpio)
-{
-   unsigned int m;
-
-   for (m = 0; m < gpio->nr_ports; ++m) {
-   struct dwapb_gpio_port *port = &gpio->ports[m];
-
-   if (!port->is_registered)
-   continue;
-
-   gpiochip_remove(>gc);
-   }
-}
-
 static void dwapb_get_irq(struct device *dev, struct fwnode_handle *fwnode,
  struct dwapb_port_property *pp)
 {
@@ -724,23 +707,8 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
for (i = 0; i < gpio->nr_ports; i++) {
err = dwapb_gpio_add_port(gpio, >properties[i], i);
if (err)
-   goto out_unregister;
+   return err;
}
-   platform_set_drvdata(pdev, gpio);
-
-   return 0;
-
-out_unregister:
-   dwapb_gpio_unregister(gpio);
-
-   return err;
-}
-
-static int dwapb_gpio_remove(struct platform_device *pdev)
-{
-   struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
-
-   dwapb_gpio_unregister(gpio);
 
return 0;
 }
@@ -844,7 +812,6 @@ static struct platform_driver dwapb_gpio_driver = {
.acpi_match_table = dwapb_acpi_match,
},
.probe  = dwapb_gpio_probe,
-   .remove = dwapb_gpio_remove,
 };
 
 module_platform_driver(dwapb_gpio_driver);
-- 
2.27.0



[PATCH v2 09/10] gpio: dwapb: Get clocks by means of resource managed interface

2020-07-30 Thread Serge Semin
The kernel clock framework provides the resource managed version of
the clk_bulk_get() method. The only thing which needs to be also automated
is the clocks disable/unprepare procedure executed on the device removal.
It can be implemented by means of the custom action definition. After that
the clocks acquisition and release will be purely managed by the device
resources interface.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 48 ++-
 1 file changed, 32 insertions(+), 16 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 9aa681c79c82..9b3065812de2 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -633,6 +633,36 @@ static int dwapb_get_reset(struct dwapb_gpio *gpio)
return devm_add_action_or_reset(gpio->dev, dwapb_assert_reset, gpio);
 }
 
+static void dwapb_disable_clks(void *data)
+{
+   struct dwapb_gpio *gpio = data;
+
+   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
+}
+
+static int dwapb_get_clks(struct dwapb_gpio *gpio)
+{
+   int err;
+
+   /* Optional bus and debounce clocks */
+   gpio->clks[0].id = "bus";
+   gpio->clks[1].id = "db";
+   err = devm_clk_bulk_get_optional(gpio->dev, DWAPB_NR_CLOCKS,
+gpio->clks);
+   if (err) {
+   dev_err(gpio->dev, "Cannot get APB/Debounce clocks\n");
+   return err;
+   }
+
+   err = clk_bulk_prepare_enable(DWAPB_NR_CLOCKS, gpio->clks);
+   if (err) {
+   dev_err(gpio->dev, "Cannot enable APB/Debounce clocks\n");
+   return err;
+   }
+
+   return devm_add_action_or_reset(gpio->dev, dwapb_disable_clks, gpio);
+}
+
 static const struct of_device_id dwapb_of_match[] = {
{ .compatible = "snps,dw-apb-gpio", .data = (void *)0},
{ .compatible = "apm,xgene-gpio-v2", .data = (void 
*)GPIO_REG_OFFSET_V2},
@@ -685,21 +715,9 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
if (IS_ERR(gpio->regs))
return PTR_ERR(gpio->regs);
 
-   /* Optional bus and debounce clocks */
-   gpio->clks[0].id = "bus";
-   gpio->clks[1].id = "db";
-   err = devm_clk_bulk_get_optional(&pdev->dev, DWAPB_NR_CLOCKS,
-gpio->clks);
-   if (err) {
-   dev_err(>dev, "Cannot get APB/Debounce clocks\n");
-   return err;
-   }
-
-   err = clk_bulk_prepare_enable(DWAPB_NR_CLOCKS, gpio->clks);
-   if (err) {
-   dev_err(>dev, "Cannot enable APB/Debounce clocks\n");
+   err = dwapb_get_clks(gpio);
+   if (err)
return err;
-   }
 
gpio->flags = (uintptr_t)device_get_match_data(dev);
 
@@ -714,7 +732,6 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
 
 out_unregister:
dwapb_gpio_unregister(gpio);
-   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return err;
 }
@@ -724,7 +741,6 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
 
dwapb_gpio_unregister(gpio);
-   clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return 0;
 }
-- 
2.27.0



[PATCH v2 05/10] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-30 Thread Serge Semin
GPIO-lib provides a ready-to-use interface to initialize an IRQ-chip on
top of a GPIO chip. It's better from a maintainability and readability
point of view to use it instead of supporting a hand-written Generic
IRQ-chip-based implementation. Moreover the new implementation won't
cause much functional overhead but will provide cleaner driver code.
All of that makes the DW APB GPIO driver conversion pretty much justified
especially seeing a tendency of the other GPIO drivers getting converted
too.

Here is what we do in the framework of this commit to convert the driver
to using the GPIO-lib-based IRQ-chip interface:
1) IRQ ack, mask and unmask callbacks are locally defined instead of
using the Generic IRQ-chip ones.
2) An irq_chip structure instance is embedded into the dwapb_gpio
private data. Note we can't have a static instance of that structure since
GPIO-lib will add some hooks into it by calling gpiochip_set_irq_hooks().
A warning about that would have been printed by the GPIO-lib code if we
used a single irq_chip structure instance for multiple DW APB GPIO
controllers.
3) Initialize the gpio_irq_chip structure embedded into the gpio_chip
descriptor. By default there is no IRQ enabled so any event raised will be
handled by the handle_bad_irq() IRQ flow handler. If DW APB GPIO IP-core
is synthesized to have non-shared reference IRQ-lines, then as before the
hierarchical and cascaded cases are distinguished by checking how many
parental IRQs are defined. (Note irq_set_chained_handler_and_data() won't
initialize IRQs whose descriptors can't be found.) If DW APB GPIO IP
is used on a platform with shared IRQ line, then we simply won't let the
GPIO-lib to initialize the parental IRQs, but will handle them locally in
the driver.
4) Discard linear IRQ-domain and Generic IRQ-chip initialization, since
GPIO-lib IRQ-chip interface will create a new domain and accept a standard
IRQ-chip structure pointer based on the setting we provided in the
gpio_irq_chip structure.
5) Manually select a proper IRQ flow handler directly in the
irq_set_type() callback by calling the irq_set_handler_locked() method,
since an ordinary (not Generic) irq_chip descriptor is now utilized (see
the sketch after the note below). Note this shouldn't cause any regression.
6) Alter CONFIG_GPIO_DWAPB kernel config to select
CONFIG_GPIOLIB_IRQCHIP instead of CONFIG_GENERIC_IRQ_CHIP.

Note neither 4) nor 5) shall cause a regression of commit 6a2f4b7dadd5
("gpio: dwapb: use a second irq chip"), since the latter isn't properly
used here anyway.
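
Just to illustrate point 5), here is a condensed sketch (not the literal
driver code) of selecting the flow handler right in the irq_set_type()
callback:

static int dwapb_irq_set_type(struct irq_data *d, u32 type)
{
	/* Programming of GPIO_INTTYPE_LEVEL/GPIO_INT_POLARITY is omitted */

	if (type & (IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW))
		irq_set_handler_locked(d, handle_level_irq);
	else if (type & IRQ_TYPE_EDGE_BOTH)
		irq_set_handler_locked(d, handle_edge_irq);

	return 0;
}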

Signed-off-by: Serge Semin 

---

Note in this patch we omit the initialization of the following GPIO-lib
IRQ-chip settings:
gc->irq.map
gc->irq.init_valid_mask

That's intentionally done in order to keep the driver functionality at the
same level as before the conversion: each GPIO line will be registered as
the source of IRQs, no matter whether DW APB GPIO IP is configured to
generate combined or multiple IRQ signals.

If more thorough mapping is needed, the driver should be altered to detect
the IRQ signaling mode (for instance by checking a dedicated DT-property).
Then, based on that, it will be able to create a proper mapping of hardware
GPIO IRQs to/from the parental IRQs and a valid hardware GPIO-IRQs mask.
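
For reference, a hypothetical init_valid_mask callback (a sketch only,
assuming the per-port irq[] array introduced by this series) could look
like this:

static void dwapb_irq_init_valid_mask(struct gpio_chip *gc,
				      unsigned long *valid_mask,
				      unsigned int ngpios)
{
	struct dwapb_gpio_port *port = gpiochip_get_data(gc);
	unsigned int i;

	/* Mark as IRQ-capable only the lines with a wired parental IRQ */
	bitmap_zero(valid_mask, ngpios);
	for (i = 0; i < ngpios; i++) {
		if (port->irq[i])
			set_bit(i, valid_mask);
	}
}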

Changelog v2:
- Replace gc->to_irq() with irq_find_mapping() method.
- Refactor dwapb_irq_set_type() method to directly set IRQ flow handler
  instead of using a temporary variable.
- Initialize GPIOlib IRQ-chip settings before calling request_irq()
  method.
- Add a notice regarding regression of commit 6a2f4b7dadd5 ("gpio:
dwapb: use a second irq chip").
- Move the acpi_gpiochip_{request,free}_interrupts() methods invocation
  removal to a dedicated commit.
- Move GPIO-chip to_irq callback removal to a dedicated commit.
- Introduce dwapb_convert_irqs() method to convert the sparse parental
  IRQs array into an array of linearly distributed IRQs correctly
  perceived by GPIO-lib.
---
 drivers/gpio/Kconfig  |   2 +-
 drivers/gpio/gpio-dwapb.c | 191 +++---
 2 files changed, 98 insertions(+), 95 deletions(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index c6b5c65c8405..bcd3e541a52b 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -202,7 +202,7 @@ config GPIO_DAVINCI
 config GPIO_DWAPB
tristate "Synopsys DesignWare APB GPIO driver"
select GPIO_GENERIC
-   select GENERIC_IRQ_CHIP
+   select GPIOLIB_IRQCHIP
help
  Say Y or M here to build support for the Synopsys DesignWare APB
  GPIO block.
diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index f34001152850..327333fbc750 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -13,7 +13,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -85,6 +84,9 @@ struct dwapb_context {
 
 struct dwapb_gpio_port {
struct gpio_chipgc;
+   unsigned intnr_irqs;
+   unsigned intirq[DWAPB_MAX_GPIOS];

[PATCH v2 08/10] gpio: dwapb: Get reset control by means of resource managed interface

2020-07-30 Thread Serge Semin
The reset control interface provides the resource managed version of
the reset_control_get() method. The only thing which needs to be also
automated is the reset lane assertion on the device removal. It can be
implemented by means of the custom action definition. After that the reset
control will be purely managed by the device resources interface.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 35 +--
 1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 226d9c2d9493..9aa681c79c82 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -607,6 +607,32 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
return pdata;
 }
 
+static void dwapb_assert_reset(void *data)
+{
+   struct dwapb_gpio *gpio = data;
+
+   reset_control_assert(gpio->rst);
+}
+
+static int dwapb_get_reset(struct dwapb_gpio *gpio)
+{
+   int err;
+
+   gpio->rst = devm_reset_control_get_optional_shared(gpio->dev, NULL);
+   if (IS_ERR(gpio->rst)) {
+   dev_err(gpio->dev, "Cannot get reset descriptor\n");
+   return PTR_ERR(gpio->rst);
+   }
+
+   err = reset_control_deassert(gpio->rst);
+   if (err) {
+   dev_err(gpio->dev, "Cannot deassert reset lane\n");
+   return err;
+   }
+
+   return devm_add_action_or_reset(gpio->dev, dwapb_assert_reset, gpio);
+}
+
 static const struct of_device_id dwapb_of_match[] = {
{ .compatible = "snps,dw-apb-gpio", .data = (void *)0},
{ .compatible = "apm,xgene-gpio-v2", .data = (void 
*)GPIO_REG_OFFSET_V2},
@@ -646,11 +672,9 @@ static int dwapb_gpio_probe(struct platform_device *pdev)
gpio->dev = &pdev->dev;
gpio->nr_ports = pdata->nports;
 
-   gpio->rst = devm_reset_control_get_optional_shared(dev, NULL);
-   if (IS_ERR(gpio->rst))
-   return PTR_ERR(gpio->rst);
-
-   reset_control_deassert(gpio->rst);
+   err = dwapb_get_reset(gpio);
+   if (err)
+   return err;
 
gpio->ports = devm_kcalloc(&pdev->dev, gpio->nr_ports,
   sizeof(*gpio->ports), GFP_KERNEL);
@@ -700,7 +724,6 @@ static int dwapb_gpio_remove(struct platform_device *pdev)
struct dwapb_gpio *gpio = platform_get_drvdata(pdev);
 
dwapb_gpio_unregister(gpio);
-   reset_control_assert(gpio->rst);
clk_bulk_disable_unprepare(DWAPB_NR_CLOCKS, gpio->clks);
 
return 0;
-- 
2.27.0



[PATCH v2 00/10] gpio: dwapb: Refactor GPIO resources initialization

2020-07-30 Thread Serge Semin
This series is about cleaning up the DW APB GPIO device initialization
procedure. First of all it has been discovered that having a
vendor-specific "snps,nr-gpios" property isn't only redundant but also
might be dangerous (see the commit log for details). Instead we suggest to
use the generic "ngpios" property to define a number of GPIOs each DW APB
GPIO controller port supports. Secondly seeing a tendency of the other
GPIO drivers getting converted to using the GPIO-lib-based IRQ-chip
interface this series provides a patch, which replaces the DW APB GPIO
driver Generic IRQ-chip implementation with the GPIO-lib IRQ-chip one.
Finally the DW APB GPIO device probe procedure is simplified by
converting the code to use the device managed resources for the
reference clocks initialization, reset control assertion/de-assertion
and GPIO-chip registration.

Some additional cleanups like replacing a number of GPIOs literal with a
corresponding macro and grouping the IRQ handlers up in a single place of
the driver are also introduced in this patchset.

Link: 
https://lore.kernel.org/linux-gpio/20200723013858.10766-1-sergey.se...@baikalelectronics.ru/
Changelog v2:
- Replace gc->to_irq() with irq_find_mapping() method.
- Refactor dwapb_irq_set_type() method to directly set IRQ flow handler
  instead of using a temporary variable.
- Initialize GPIOlib IRQ-chip settings before calling request_irq()
  method.
- Add a notice regarding regression of commit 6a2f4b7dadd5 ("gpio:
  dwapb: use a second irq chip").
- Move the acpi_gpiochip_{request,free}_interrupts() methods invocation
  removal to a dedicated patch.
- Move GPIO-chip to_irq callback removal to a dedicated patch.
- Add a patch which replaces a max number of GPIO literals with a macro.
- Introduce dwapb_convert_irqs() method to convert the sparse parental
  IRQs array into an array of linearly distributed IRQs correctly
  perceived by GPIO-lib.

Signed-off-by: Serge Semin 
Cc: Andy Shevchenko 
Cc: Andy Shevchenko 
Cc: Alexey Malahov 
Cc: Pavel Parkhomenko 
Cc: Rob Herring 
Cc: linux-g...@vger.kernel.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Serge Semin (10):
  dt-bindings: gpio: dwapb: Add ngpios property support
  gpio: dwapb: Add ngpios DT-property support
  gpio: dwapb: Move MFD-specific IRQ handler
  gpio: dwapb: Add max GPIOs macro
  gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip
  gpio: dwapb: Discard GPIO-to-IRQ mapping function
  gpio: dwapb: Discard ACPI GPIO-chip IRQs request
  gpio: dwapb: Get reset control by means of resource managed interface
  gpio: dwapb: Get clocks by means of resource managed interface
  gpio: dwapb: Use resource managed GPIO-chip add data method

 .../bindings/gpio/snps,dw-apb-gpio.yaml   |   6 +
 drivers/gpio/Kconfig  |   2 +-
 drivers/gpio/gpio-dwapb.c | 340 +-
 include/linux/platform_data/gpio-dwapb.h  |   4 +-
 4 files changed, 178 insertions(+), 174 deletions(-)

-- 
2.27.0



[PATCH v2 02/10] gpio: dwapb: Add ngpios DT-property support

2020-07-30 Thread Serge Semin
Indeed the generic GPIO DT-schema implies that the number of GPIOs should be
described by the "ngpios" property located under a GPIO-provider DT node.
In that case it's redundant to have a vendor-specific "snps,nr-gpios"
property describing the same setting. Moreover it might be error prone.

Since commit 93d2e4322aa7 ("of: platform: Batch fwnode parsing when adding
all top level devices") the fwnode parsing is resumed after the vast
majority of the platform devices are added. Implicitly that commit
activates re-parsing of the whole device tree GPIOs-phandle properties
detected having "-gpio/-gpios" suffixes. Since the vendor-specific number
of GPIOs property is defined with "-gpios" suffix, then of_link_property()
will consider it as a suffix-property with "#gpio-cells" structure, which
obviously it doesn't have. As a result for two DW APB GPIO controllers
we'll have the next errors printed.

OF: /bus@1f059000/gpio@1f044000/gpio-port@0: could not find phandle
OF: /bus@1f059000/gpio@1f045000/gpio-port@0: could not get #gpio-cells for /opp-table
OF: /bus@1f059000/gpio@1f044000/gpio-port@0: could not find phandle
OF: /bus@1f059000/gpio@1f045000/gpio-port@0: could not get #gpio-cells for /opp-table

See, the kernel fwnode parsing procedure even tried to resolve the phandle
ID, which it thought was the opp-table DT-node, while in fact it was just
a number "32". What would happen if that magic number actually referred to
a GPIO DT-node with "#gpio-cells" property defined?..
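
For example, a port node carrying the vendor-specific property (the node
label here is made up):

    port_a: gpio-port@0 {
            /* The "-gpios" suffix makes the fwnode parsing code treat
             * this as a GPIO phandle property, so the plain number 32
             * gets resolved as a phandle ID.
             */
            snps,nr-gpios = <32>;
    };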

In order to fix the problem let's mark the "snps,nr-gpios" property as
deprecated and add the generic "ngpios" property support with the same
purpose as the deprecated one. That and the error log above shall
motivate platform developers to convert the DW APB GPIO DT-nodes to
using the standard number-of-GPIOs property.

Signed-off-by: Serge Semin 
Reviewed-by: Andy Shevchenko 
---
 drivers/gpio/gpio-dwapb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-dwapb.c b/drivers/gpio/gpio-dwapb.c
index 1d8d55bd63aa..ccd80df16062 100644
--- a/drivers/gpio/gpio-dwapb.c
+++ b/drivers/gpio/gpio-dwapb.c
@@ -594,7 +594,8 @@ static struct dwapb_platform_data *dwapb_gpio_get_pdata(struct device *dev)
return ERR_PTR(-EINVAL);
}
 
-   if (fwnode_property_read_u32(fwnode, "snps,nr-gpios", 
>ngpio)) {
+   if (fwnode_property_read_u32(fwnode, "ngpios", >ngpio) &&
+   fwnode_property_read_u32(fwnode, "snps,nr-gpios", 
>ngpio)) {
dev_info(dev,
 "failed to get number of gpios for port%d\n",
 i);
-- 
2.27.0



Re: [PATCH 4/7] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-29 Thread Serge Semin
On Wed, Jul 29, 2020 at 06:10:24PM +0300, Andy Shevchenko wrote:
> On Wed, Jul 29, 2020 at 3:58 PM Serge Semin
>  wrote:
> > On Mon, Jul 27, 2020 at 12:22:28AM +0200, Linus Walleij wrote:
> 
> ...
> 
> > Sorry for a delay with a response to this issue. I had to give it a more 
> > thorough
> > thought since the problem is a bit more complex than it seemed originally. 
> > As I
> > see it now It might be wrong to implement the cases 2) and 3), but 1) is 
> > more
> > appropriate.
> >
> > First of all we need to note that GPIOlib framework provides the next 
> > parameters
> > to describe the IRQ-chip:
> > gc->irq.num_parents - number of parental IRQ numbers.
> > gc->irq.parents[] - array of parental IRQ numbers.
> > *gc->irq.valid_mask - a mask of IRQ/GPIO lines describing a valid IRQ.
> > *gc->irq.map - mapping of hw IRQ/GPIO ID -> parental IRQ numbers.
> >
> > Using that set we can handle any case of linear and sparse parental IRQs. 
> > Here
> > is how it can be implemented in the framework of DW APB GPIO controller.
> >
> > DW APB GPIO can be synthesized with two configs:
> > 1) Combined IRQ line (GPIO_INTR_IO == True),
> > 2) Multiple interrupt signals for each GPIO (GPIO_INTR_IO == False).
> >
> > Obviously the former case is trivial:
> >
> >  IRQ_combined
> > __^
> >/_ _ _ _ _ ___ _\
> >|_|_|_|_|_|...|_| - GPIOs
> >
> > In that case
> > gc->irq.num_parents = 1;
> > gc->irq.parents[0] = IRQ_combined;
> > *gc->irq.valid_mask = GENMASK(ngpio - 1, 0); // This is done by the GPIOlib 
> > core itself.
> >
> > The later one (when multiple interrupt signals are involved) can be a bit 
> > more
> > complicated. It can be also split up into two cases:
> > 2a) One-on-one GPIO-IRQ mapping.
> > 2b) Sparse GPIO-IRQ mapping.
> >
> > It's straightforward to implement 2a):
> >
> >i1i2i3i4i5 ... iN
> > _ _ _ _ _ ___ _
> >|_|_|_|_|_|...|_| - GPIOs
> >

> > In that case
> > gc->irq.num_parents = ngpio;
> > gc->irq.parents[] = {i1, i2, i3, i4, i5, ... iN};
> > gc->irq.map = {i1, i2, i3, i4, i5, ... iN};
> > *gc->irq.valid_mask = GENMASK(ngpio - 1, 0);
> >
> 
> This case puzzles me. Why is it not NULL and 0 and actually you handle
> everything as a nested case?

The code provided above is a sketch. Don't take it literally. In reality
of course valid_mask will not be NULL; the memory for it will be allocated
by the GPIOlib itself. The mask will be initialized by means of the
gc->irq.init_valid_mask() callback.

> 
> > The complication starts when we get to implementing 2b):
> >
> >i1 xi3i4 x ... iN
> > _ _ _ _ _ ___ _
> >|_|_|_|_|_|...|_| - GPIOs
> 
> So does this.
> 
> Valid mask will define exactly GPIOs that are IRQs. So, we will handle
> only nested IRQs which are valid.

Right.

> 
> > In order to cover this case we need to answer on two question.
> > Firstly how to get such platform config? I am not sure about ACPI, but
> > aside from straightforward platform_data-based setup such configuration
> > can be reached by setting up the "interrupts-extended" DT-property with
> > zeroed phandle.
> >
> > Ok, since it's possible to meet such platform config, we need to think
> > how to handle it and here is the second question. How to describe such
> > case in the framework of GPIOlib-IRQchip?
> >
> > So from my side it was wrong to set the sparse IRQs array to
> > gc->irq.parents. Instead I should have scanned the sparse IRQs array,
> > calculated the number of non-empty parental IRQs, created an array of linear
> > (non-sparse) IRQs, initialized *gc->irq.valid_mask in accordance with the
> > sparse parental IRQs array. In other words it was wrong to assume, that
> > each gc->irq.parents entry corresponds to the IRQ/GPIO line. The 
> > gc->irq.parents
> > array just describes the parental IRQs and nothing else.
> >
> > Shortly speaking here is how the GPIOlib IRQchip parameters should be
> > initialized in this case:
> > gc->irq.num_parents - number of valid parental IRQs.
> > gc->irq.parents - non-sparse, linear array of valid IRQs.
> > *gc->irq.valid_mask - mask initialized by means of the 
> > gc->irq.init_valid_mask()
> > callback, which indicates valid IRQ/GPIO IDs.
> > *gc->irq.map - sparse array of parental IRQ numbers (which I mistakenly 
> > tried to
> > pass through the gc->irq.parents pointer).
> >
>

Re: [PATCH 4/7] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-29 Thread Serge Semin
On Mon, Jul 27, 2020 at 12:22:28AM +0200, Linus Walleij wrote:
> On Sat, Jul 25, 2020 at 1:03 AM Serge Semin
>  wrote:
> 
> > According to the DW APB GPIO databook it can be configured to provide 
> > either a
> > combined IRQ line or multiple interrupt signals for each GPIO. It's up to
> > the platform which of those signals are connected to an embedded IRQ
> > controller. So I guess theoretically the array values can be sparse.
> >
> > Anyway now I see it's rather problematic. I didn't forget about the sparse 
> > IRQs
> > array case. I just thought it would work out-of-box. Before getting your 
> > comment
> > and digging deeper into the IRQ subsystem I had thought that it wasn't a 
> > problem
> > passing invalid IRQ numbers to the irq_set_chained_handler_and_data() 
> > especially
> > seeing zero IRQ number was supposed to be considered as invalid. That 
> > method shall
> > just ignore the invalid IRQs since the method irq_to_desc() calling 
> > radix_tree_lookup()
> > will fail to find a descriptor with invalid IRQ value and return NULL. So 
> > after
> > getting a NULL irq_desc the method irq_set_chained_handler_and_data() would
> > have stopped setting the handler. But turns out it may work only for
> > CONFIG_SPARSE_IRQ. If that config isn't enabled, then a very first IRQ
> > descriptor will be returned for zero IRQ number. That descriptor will be
> > initialized with the passed parent_handler callback, which isn't what we 
> > want.
> 
> Ouch but different beahviour on the outside of the irqchip API depending
> on whether IRQs are sparse or not on some particular system seems to
> be a problem with irqchip reallty, if we wanna get to the bottom of things.
> (paging Marc)
> 
> > So in order to fix the problem we could follow either of the next paths:
> > 1) Just make sure the passed IRQs array is not sparse for instance by 
> > remapping
> >it to be linear.
> > 2) Move "if (gc->irq.parents[i]) irq_set_chained_handler_and_data()" 
> > statement to the
> >gpiochip_add_irqchip() method.
> >
> > What to you think? Linus?
> 

> What about (3) fixing irqchip?
> 
> Else (2), making the code inside gpiolib be careful and skip over
> invalid IRQs.

Sorry for the delayed response to this issue. I had to give it a more
thorough thought since the problem is a bit more complex than it seemed
originally. As I see it now, it might be wrong to implement cases 2) and 3),
but 1) is more appropriate.

First of all we need to note that the GPIOlib framework provides the
following parameters to describe the IRQ-chip:
gc->irq.num_parents - number of parental IRQ numbers.
gc->irq.parents[] - array of parental IRQ numbers.
*gc->irq.valid_mask - a mask of IRQ/GPIO lines describing a valid IRQ.
*gc->irq.map - mapping of hw IRQ/GPIO ID -> parental IRQ numbers.

Using that set we can handle any case of linear and sparse parental IRQs. Here
is how it can be implemented in the framework of the DW APB GPIO controller.

DW APB GPIO can be synthesized with two configs:
1) Combined IRQ line (GPIO_INTR_IO == True),
2) Multiple interrupt signals for each GPIO (GPIO_INTR_IO == False).

Obviously the former case is trivial:

 IRQ_combined
__^
   /_ _ _ _ _ ___ _\
   |_|_|_|_|_|...|_| - GPIOs

In that case
gc->irq.num_parents = 1;
gc->irq.parents[0] = IRQ_combined;
*gc->irq.valid_mask = GENMASK(ngpio - 1, 0); // This is done by the GPIOlib 
core itself.

The latter one (when multiple interrupt signals are involved) can be a bit more
complicated. It can also be split up into two cases:
2a) One-on-one GPIO-IRQ mapping.
2b) Sparse GPIO-IRQ mapping.

It's straightforward to implement 2a):

   i1i2i3i4i5 ... iN
_ _ _ _ _ ___ _
   |_|_|_|_|_|...|_| - GPIOs

In that case
gc->irq.num_parents = ngpio;
gc->irq.parents[] = {i1, i2, i3, i4, i5, ... iN};
gc->irq.map = {i1, i2, i3, i4, i5, ... iN};
*gc->irq.valid_mask = GENMASK(ngpio - 1, 0);

The complication starts when we get to implementing 2b):

   i1 xi3i4 x ... iN
_ _ _ _ _ ___ _
   |_|_|_|_|_|...|_| - GPIOs

In order to cover this case we need to answer two questions.
Firstly, how can such a platform config appear? I am not sure about ACPI, but
aside from a straightforward platform_data-based setup such a configuration
can be reached by setting up the "interrupts-extended" DT-property with
a zeroed phandle.

Ok, since it's possible to meet such a platform config, we need to think
how to handle it, and here is the second question: how to describe such a
case in the framework of the GPIOlib IRQ-chip?

So from my side it was wrong to set the sparse IRQs array to
gc->irq.parents. Instead I should have scanned the sparse IRQs array,
calculated the number of non-empty parental IRQs, created an array of linear
(non-sparse) IRQs, and initialized *gc->irq.valid_mask in accordance with the
sparse parental IRQs array.
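
Just to visualize the idea, here is a rough sketch of such a conversion
(not the literal dwapb_convert_irqs() implementation; girq->parents is
assumed to be pre-allocated):

static void dwapb_convert_irqs(struct gpio_irq_chip *girq,
			       struct dwapb_port_property *pp)
{
	unsigned int i, j;

	/* Pack the sparse pp->irq[] array into a linear parents array */
	for (i = 0, j = 0; i < pp->ngpio; i++) {
		if (pp->irq[i])
			girq->parents[j++] = pp->irq[i];
	}

	girq->num_parents = j;
}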

Re: [PATCH 4/7] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-27 Thread Serge Semin
On Sat, Jul 25, 2020 at 03:12:49PM +0300, Andy Shevchenko wrote:
> On Sat, Jul 25, 2020 at 2:03 AM Serge Semin
>  wrote:
> > On Thu, Jul 23, 2020 at 01:03:17PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 23, 2020 at 04:38:55AM +0300, Serge Semin wrote:

...
 
> > > > +   /* This will let us handle the parent IRQ in the driver */
> > > > +   girq->parents = NULL;
> > > > +   girq->num_parents = 0;
> > > > +   girq->parent_handler = NULL;
> 
> > > Shan't we do this before request_irq() call (at least for consistency 
> > > with the
> > > rest of the drivers)?
> >
> > Technically we shan't. Please elaborate which drivers you are referring to?
> 
> All of them? Recent patches for IRQ chip template do something like
> 
> girq = &...;
> girq->foo = bar;
> ...
> ret = request_irq(...);
> 
> ...and here no more girq->baz = gaz; lines.
> 
> > Even the recent Linus' series "Use irqchip template" mostly does it in the
> > same order.
> 
> Funny, that's what I'm referring to.

It turns out my "mostly" was wrong in this matter. It's 4 out of 17 patches,
which make the initialization in the same order as mine:
drivers/gpio/gpio-max732x.c
drivers/gpio/gpio-pca953x.c
drivers/gpio/gpio-pcf857x.c
drivers/gpio/gpio-adp5588.c

while the rest of them do it in the order suggested by you:
drivers/gpio/gpio-pci-idio-16.c
drivers/gpio/gpio-pcie-idio-24.c
drivers/gpio/gpio-104-idio-16.c
drivers/gpio/gpio-104-dio-48e.c
drivers/gpio/gpio-ws16c48.c
drivers/gpio/gpio-rcar.c
drivers/gpio/gpio-wcove.c
drivers/pinctrl/pinctrl-amd.c
drivers/gpio/gpio-crystalcove.c
drivers/pinctrl/pinctrl-mcp23s08.c
drivers/pinctrl/pinctrl-sx150x.c
drivers/pinctrl/pinctrl-stmfx.c
drivers/gpio/gpio-tc3589x.c

Then let's use the same order here as most of the drivers do, just for
consistency.

-Sergey 

> 
> -- 
> With Best Regards,
> Andy Shevchenko


Re: [PATCH v8 00/10] dmaengine: dw: Take Baikal-T1 SoC DW DMAC peculiarities into account

2020-07-27 Thread Serge Semin
On Mon, Jul 27, 2020 at 02:31:14PM +0530, Vinod Koul wrote:
> On 23-07-20, 03:58, Serge Semin wrote:
> > In the previous patchset I've written the next message:
> > 
> > > Folks, note I've removed the next patches from the series:
> > > [PATCH v7 04/11] dmaengine: Introduce max SG list entries capability
> > > [PATCH v7 11/11] dmaengine: dw: Initialize max_sg_nents capability
> > > It turns out the problem with the asynchronous handling of Tx- and Rx-
> > > SPI transfers interrupts is more complex than I expected. So in order to
> > > solve the problem it isn't enough to split the SG list entries submission
> > > up based on the max_sg_nents capability setting (though the synchronous
> > > one-by-one SG list entries handling does fix a part of the problem). So
> > > if and when I get to find a comprehensive solution for it I'll submit a
> > > new series with fixups. Until then please consider to merge the patchset
> > > in without those patches.
> > 
> > Those patches are returned back to the series. I've found a solution, which
> > fixes the problem for our hardware. A new patchset with several fixes for 
> > the
> > DW DMAC driver will be sent shortly after this one is merged in. Note the 
> > same
> > concerns the DW APB SPI driver. So please review and merge in as soon as
> > possible.
> > 
> > Regarding the patchset. Baikal-T1 SoC has an DW DMAC on-board to provide a
> > Mem-to-Mem, low-speed peripherals Dev-to-Mem and Mem-to-Dev functionality.
> > Mostly it's compatible with currently implemented in the kernel DW DMAC
> > driver, but there are some peculiarities which must be taken into account
> > in order to have the device fully supported.
> > 
> > First of all traditionally we replaced the legacy plain text-based 
> > dt-binding
> > file with yaml-based one. Secondly Baikal-T1 DW DMA Controller provides 
> > eight
> > channels, which alas have different max burst length configuration.
> > In particular first two channels may burst up to 128 bits (16 bytes) at a 
> > time
> > while the rest of them just up to 32 bits. We must make sure that the DMA
> > subsystem doesn't set values exceeding these limitations otherwise the
> > controller will hang up. In third currently we discovered the problem in 
> > using
> > the DW APB SPI driver together with DW DMAC. The problem happens if there 
> > is no
> > natively implemented multi-block LLP transfers support and the SPI-transfer
> > length exceeds the max lock size. In this case due to asynchronous handling 
> > of
> > Tx- and Rx- SPI transfers interrupt we might end up with DW APB SSI Rx FIFO
> > overflow. So if DW APB SSI (or any other DMAC service consumer) intends to 
> > use
> > the DMAC to asynchronously execute the transfers we'd have to at least warn
> > the user of the possible errors. Fourthly, it's worth setting the DMA
> > device max segment size to the max block size config specific to the DW
> > DMA controller. It
> > shall help the DMA clients to create size-optimized SG-list items for the
> > controller. This in turn will cause less dw_desc allocations, less LLP
> > reinitializations, better DMA device performance.
> > 
> > Finally there is a bug in the algorithm of the nollp flag detection.
> > In particular even if DW DMAC parameters state the multi-block transfers
> > support there is still HC_LLP (hardcode LLP) flag, which if set makes 
> > expected
> > by the driver true multi-block LLP functionality unusable. This happens 
> > cause'
> > if HC_LLP flag is set the LLP registers will be hardcoded to zero so the
> > contiguous multi-block transfers will be only supported. We must take the
> > flag into account when detecting the LLP support otherwise the driver just
> > won't work correctly.
> 
> Applied all, thanks

Great! Thank you very much. Now I can send out another set of patches for
review.

-Sergey

> -- 
> ~Vinod


Re: [PATCH 4/7] gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip

2020-07-24 Thread Serge Semin
On Thu, Jul 23, 2020 at 05:08:15PM +0300, Andy Shevchenko wrote:
> On Thu, Jul 23, 2020 at 04:38:55AM +0300, Serge Semin wrote:
> > GPIO-lib provides a ready-to-use interface to initialize an IRQ-chip on
> > top of a GPIO chip. It's better from maintainability and readability
> > point of view to use one instead of supporting a hand-written Generic
> > IRQ-chip-based implementation. Moreover the new implementation won't
> > cause much functional overhead but will provide a cleaner driver code.
> > All of that makes the DW APB GPIO driver conversion pretty much justified
> > especially seeing a tendency of the other GPIO drivers getting converted
> > too.
> > 
> > Here is what we do in the framework of this commit to convert the driver
> > to using the GPIO-lib-based IRQ-chip interface:
> > 1) IRQ ack, mask and unmask callbacks are locally defined instead of
> > using the Generic IRQ-chip ones.
> > 2) An irq_chip structure instance is embedded into the dwapb_gpio
> > private data. Note we can't have a static instance of that structure since
> > GPIO-lib will add some hooks into it by calling gpiochip_set_irq_hooks().
> > A warning about that would have been printed by the GPIO-lib code if we
> > used a single irq_chip structure instance for multiple DW APB GPIO
> > controllers.
> > 3) Initialize the gpio_irq_chip structure embedded into the gpio_chip
> > descriptor. By default there is no IRQ enabled so any event raised will be
> > handled by the handle_bad_irq() IRQ flow handler. If DW APB GPIO IP-core
> > is synthesized to have non-shared reference IRQ-lines, then as before the
> > hierarchical and cascaded cases are distinguished by checking how many
> > parental IRQs are defined. (Note irq_set_chained_handler_and_data() won't
> > initialize IRQs, which descriptors couldn't be found.) If DW APB GPIO IP
> > is used on a platform with shared IRQ line, then we simply won't let the
> > GPIO-lib to initialize the parental IRQs, but will handle them locally in
> > the driver.
> > 4) Discard linear IRQ-domain and Generic IRQ-chip initialization, since
> > GPIO-lib IRQ-chip interface will create a new domain and accept a standard
> > IRQ-chip structure pointer based on the setting we provided in the
> > gpio_irq_chip structure.
> > 5) Manually select a proper IRQ flow handler directly in the
> > irq_set_type() callback by calling irq_set_handler_locked() method, since
> > an ordinary (not Generic) irq_chip descriptor is now utilized.
> > 6) Discard the custom GPIO-to-IRQ mapping function since GPIO-lib defines
> > the standard method gpiochip_to_irq(), which will be used anyway no matter
> > whether the custom to_irq callback is specified or not.
> > 7) Discard the acpi_gpiochip_{request,free}_interrupts()
> > invocations, since they will be called from
> > gpiochip_add_irqchip()/gpiochip_irqchip_remove() anyway.
> > 8) Alter CONFIG_GPIO_DWAPB kernel config to select
> > CONFIG_GPIOLIB_IRQCHIP instead of CONFIG_GENERIC_IRQ_CHIP.
> 
> 
> ...
> 
> One more thing...
> 
> >  static u32 dwapb_do_irq(struct dwapb_gpio *gpio)
> >  {
> > +   struct gpio_chip *gc = &gpio->ports[0].gc;
> > unsigned long irq_status;
> > irq_hw_number_t hwirq;
> >  
> > irq_status = dwapb_read(gpio, GPIO_INTSTATUS);
> > for_each_set_bit(hwirq, &irq_status, 32) {
> > -   int gpio_irq = irq_find_mapping(gpio->domain, hwirq);
> > +   int gpio_irq = gc->to_irq(gc, hwirq);
> 

> Very, very few do this.
> Can we stick with the original one?
> (See plenty of other examples in the GPIO / pin control subsystems.

You are right. After studying the IRQ-domain code more thoroughly I've found
out that irq_domain_add_simple() provides on-the-fly mapping creation. We
don't have to call irq_create_mapping() first before using irq_find_mapping().
So the irq_find_mapping(gc->irq.domain, hwirq) method can be freely called here
the same way as most of the GPIO drivers do. Thanks for noticing this.

-Sergey

> 
> > u32 irq_type = irq_get_trigger_type(gpio_irq);
> >  
> > generic_handle_irq(gpio_irq);
> 
> 
> >  }
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 

