Re: [PATCH] i2c: mpc: Removal of of_node_put with __free for auto cleanup

2024-04-15 Thread Chris Packham

On 16/04/24 08:54, Andi Shyti wrote:
> Hi Abhinav,
>
>>  /* Enable I2C interrupts for mpc5121 */
>> -node_ctrl = of_find_compatible_node(NULL, NULL,
>> -"fsl,mpc5121-i2c-ctrl");
>> +struct device_node *node_ctrl __free(device_node) =
> How have you tested this?

I'm not sure I know anyone that still has a mpc5121. Maybe someone on 
linuxppc-dev?

I did try to take the patch for a spin on my T2081RDB but I'm having 
some userland issues on it for some reason (unrelated to this change). 
The kernel boot does discover a few peripherals hanging of the I2C 
interface but I'm not in a position to offer up a Tested-by and I've run 
out of time to debug why my board is unhappy.



Re: [PATCH v1 2/4] powerpc/mpc5xxx: Switch mpc5xxx_get_bus_frequency() to use fwnode

2022-05-04 Thread Chris Packham

On 5/05/22 01:44, Andy Shevchenko wrote:
> Switch mpc5xxx_get_bus_frequency() to use fwnode in order to help
> cleaning up other parts of the kernel from OF specific code.
>
> No functional change intended.
>
> Signed-off-by: Andy Shevchenko 
> ---
>   arch/powerpc/include/asm/mpc5xxx.h|  9 +++-
>   arch/powerpc/platforms/52xx/mpc52xx_gpt.c |  2 +-
>   arch/powerpc/sysdev/mpc5xxx_clocks.c  | 41 ++-
>   drivers/ata/pata_mpc52xx.c|  2 +-
>   drivers/i2c/busses/i2c-mpc.c  |  7 ++--
>   drivers/net/can/mscan/mpc5xxx_can.c   |  2 +-
>   drivers/net/ethernet/freescale/fec_mpc52xx.c  |  2 +-
>   .../net/ethernet/freescale/fec_mpc52xx_phy.c  |  3 +-
>   .../net/ethernet/freescale/fs_enet/mii-fec.c  |  4 +-
>   drivers/spi/spi-mpc52xx.c |  2 +-
>   drivers/tty/serial/mpc52xx_uart.c |  4 +-
>   11 files changed, 44 insertions(+), 34 deletions(-)

> diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
> index 6c698c10d3cd..2030668ecde5 100644
> --- a/drivers/i2c/busses/i2c-mpc.c
> +++ b/drivers/i2c/busses/i2c-mpc.c
> @@ -239,6 +239,7 @@ static const struct mpc_i2c_divider 
> mpc_i2c_dividers_52xx[] = {
>   static int mpc_i2c_get_fdr_52xx(struct device_node *node, u32 clock,
> u32 *real_clk)
>   {
> + struct fwnode_handle = of_fwnode_handle(node);

I think you mean

+   struct fwnode_handle *fwnode = of_fwnode_handle(node);

>   const struct mpc_i2c_divider *div = NULL;
>   unsigned int pvr = mfspr(SPRN_PVR);
>   u32 divider;
> @@ -246,12 +247,12 @@ static int mpc_i2c_get_fdr_52xx(struct device_node 
> *node, u32 clock,
>   
>   if (clock == MPC_I2C_CLOCK_LEGACY) {
>   /* see below - default fdr = 0x3f -> div = 2048 */
> - *real_clk = mpc5xxx_get_bus_frequency(node) / 2048;
> + *real_clk = mpc5xxx_fwnode_get_bus_frequency(fwnode) / 2048;
>   return -EINVAL;
>   }
>   
>   /* Determine divider value */
> - divider = mpc5xxx_get_bus_frequency(node) / clock;
> + divider = mpc5xxx_fwnode_get_bus_frequency(fwnode) / clock;
>   
>   /*
>* We want to choose an FDR/DFSR that generates an I2C bus speed that
> @@ -266,7 +267,7 @@ static int mpc_i2c_get_fdr_52xx(struct device_node *node, 
> u32 clock,
>   break;
>   }
>   
> - *real_clk = mpc5xxx_get_bus_frequency(node) / div->divider;
> + *real_clk = mpc5xxx_fwnode_get_bus_frequency(fwnode) / div->divider;
>   return (int)div->fdr;
>   }
>   

Re: [PATCH AUTOSEL 5.12 42/43] powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c controllers

2021-06-03 Thread Chris Packham

On 4/06/21 12:42 pm, Michael Ellerman wrote:
> Sasha Levin  writes:
>> From: Chris Packham 
>>
>> [ Upstream commit 7adc7b225cddcfd0f346d10144fd7a3d3d9f9ea7 ]
>>
>> The i2c controllers on the P2040/P2041 have an erratum where the
>> documented scheme for i2c bus recovery will not work (A-004447). A
>> different mechanism is needed which is documented in the P2040 Chip
>> Errata Rev Q (latest available at the time of writing).
>>
>> Signed-off-by: Chris Packham 
>> Acked-by: Michael Ellerman 
>> Signed-off-by: Wolfram Sang 
>> Signed-off-by: Sasha Levin 
>> ---
>>   arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 
>>   1 file changed, 16 insertions(+)
> This patch (and the subsequent one), just set a flag in the device tree.
>
> They have no effect unless you also backport the code change that looks
> for that flag, which was upstream commit:
>
>8f0cdec8b5fd ("i2c: mpc: implement erratum A-004447 workaround")

That change itself isn't cherry-pick able without

65171b2df15e ("i2c: mpc: Make use of i2c_recover_bus()")

and in between 65171b2df15e and 8f0cdec8b5fd are a bunch of cleanups and 
a fairly major rewrite which may also affect the cherry-pick ability.

> AFAICS you haven't picked that one up for any of the stable trees.
>
> I'll defer to Chris & Wolfram on whether it's a good idea to take the
> code change for stable.

We have been doing some extra QA on our end for the "i2c: mpc: Refactor 
to improve responsiveness" and "P2040/P2041 i2c recovery erratum" series 
which hasn't thrown up any issues. But it's still a lot of new code and 
at some point we're going to run into API changes.

Given the fact that it's starting to snowball one might err on the side 
of caution.

> I guess it's harmless to pick these two patches, but it's also
> pointless. So I think you either want to take all three, or drop these
> two.

At a minimum you need

65171b2df15e ("i2c: mpc: Make use of i2c_recover_bus()")
8f0cdec8b5fd ("i2c: mpc: implement erratum A-004447 workaround")
7adc7b225cdd ("powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 
i2c controllers")
19ae697a1e4e ("powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 
i2c controllers")

> cheers
>
>> diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi 
>> b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
>> index 872e4485dc3f..ddc018d42252 100644
>> --- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
>> +++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
>> @@ -371,7 +371,23 @@ sdhc@114000 {
>>  };
>>   
>>   /include/ "qoriq-i2c-0.dtsi"
>> +i2c@118000 {
>> +fsl,i2c-erratum-a004447;
>> +};
>> +
>> +i2c@118100 {
>> +fsl,i2c-erratum-a004447;
>> +};
>> +
>>   /include/ "qoriq-i2c-1.dtsi"
>> +i2c@119000 {
>> +fsl,i2c-erratum-a004447;
>> +};
>> +
>> +i2c@119100 {
>> +fsl,i2c-erratum-a004447;
>> +};
>> +
>>   /include/ "qoriq-duart-0.dtsi"
>>   /include/ "qoriq-duart-1.dtsi"
>>   /include/ "qoriq-gpio-0.dtsi"
>> -- 
>> 2.30.2

Re: [PATCH v3 0/4] P2040/P2041 i2c recovery erratum

2021-05-19 Thread Chris Packham


On 13/05/21 3:01 am, w...@kernel.org wrote:
>>> I've been doing my recent work with a P2040 and prior to that I did test
>>> out the recovery on a T2081 (which isn't documented to have this
>>> erratum) when I was re-working the driver. The "new" recovery actually
>>> seems better but I don't have a reliably faulty i2c device so that's
>>> only based on me writing some code to manually trigger the recovery
>>> (using the snippet below) and observing it with an oscilloscope.
>> You don't need a faulty device, just an aborted I2C read/write op.
> If you can wire GPIOs to the bus, you can use the I2C fault injector:
>
>   Documentation/i2c/gpio-fault-injection.rst
>
> There are already two "incomplete transfer" injectors.
>
Just giving this thread a poke. I have been looking at my options for 
triggering an i2c recovery but haven't really had time to do much. I 
think the best option given what I've got access to is a modified SFP 
that grounds the SDA line but I need to find a system where I can attach 
an oscilloscope (should be a few of these in the office when I can get 
on-site).

I can confirm that when manually triggered the existing recovery and the 
new erratum workaround produce what I'd expect to observe on an 
oscilloscope.

I haven't explored Joakim's alternative recovery but I don't think that 
should hold up these changes, any improvement to the existing recovery 
can be done later as a follow-up.


Re: [PATCH v3 0/4] P2040/P2041 i2c recovery erratum

2021-05-11 Thread Chris Packham

On 12/05/21 10:10 am, Joakim Tjernlund wrote:
> On Wed, 2021-05-12 at 09:20 +1200, Chris Packham wrote:
>> The P2040/P2041 has an erratum where the i2c recovery scheme
>> documented in the reference manual (and currently implemented
>> in the i2c-mpc.c driver) does not work. The errata document
>> provides an alternative that does work. This series implements
>> that alternative and uses a property in the devicetree to
>> decide when the alternative mechanism is needed.
>>
>> Chris Packham (4):
>>    dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
>>    powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
>>  controllers
>>    powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c
>>  controllers
>>    i2c: mpc: implement erratum A-004447 workaround
>>
>>   .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
>>   arch/powerpc/boot/dts/fsl/p1010si-post.dtsi   |  8 ++
>>   arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
>>   drivers/i2c/busses/i2c-mpc.c  | 81 ++-
>>   4 files changed, 110 insertions(+), 2 deletions(-)
>>
> This now reminds me about the current I2C reset procedure, it didn't work for 
> us and I came up with this one:
>https://www.spinics.net/lists/linux-i2c/msg29490.html
> it never got in but we are still using it.

For those reading along the v2 mentioned in that thread was posted as 
https://lore.kernel.org/linux-i2c/20170511122033.22471-1-joakim.tjernl...@infinera.com/
 
there was a bit of discussion but it seemed to die out without reaching 
a conclusion.

The i2c-mpc driver is now using the generic recovery mechanism so that 
addresses one bit of feedback from the original thread.

I do wonder if the reason the recovery wasn't working for your case was 
because of the erratum. Do you happen to remember which SoC your issue 
was on?

I've been doing my recent work with a P2040 and prior to that I did test 
out the recovery on a T2081 (which isn't documented to have this 
erratum) when I was re-working the driver. The "new" recovery actually 
seems better but I don't have a reliably faulty i2c device so that's 
only based on me writing some code to manually trigger the recovery 
(using the snippet below) and observing it with an oscilloscope.

---8<---
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index c85d6618723f..8fa4363414bc 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -1217,6 +1217,25 @@ new_device_store(struct device *dev, struct 
device_attribute *attr,
  }
  static DEVICE_ATTR_WO(new_device);

+static ssize_t recover_store(struct device *dev, struct 
device_attribute *attr,
+    const char *buf, size_t count)
+{
+   struct i2c_adapter *adap = to_i2c_adapter(dev);
+   int ret;
+
+   dev_info(dev, "Recovering bus\n");
+
+   ret = i2c_recover_bus(adap);
+   if (ret == -EOPNOTSUPP)
+   dev_info(dev, "recovery not supported\n");
+   else if (ret)
+   dev_err(dev, "recovery failed %d\n", ret);
+
+   return count;
+}
+
+static DEVICE_ATTR_WO(recover);
+
  /*
   * And of course let the users delete the devices they instantiated, if
   * they got it wrong. This interface can only be used to delete devices
@@ -1277,6 +1296,7 @@ static struct attribute *i2c_adapter_attrs[] = {
     _attr_name.attr,
     _attr_new_device.attr,
     _attr_delete_device.attr,
+   _attr_recover.attr,
     NULL
  };
  ATTRIBUTE_GROUPS(i2c_adapter);
---8<---



[PATCH v3 1/4] dt-bindings: i2c: mpc: Add fsl, i2c-erratum-a004447 flag

2021-05-11 Thread Chris Packham
Document the fsl,i2c-erratum-a004447 flag which indicates the presence
of an i2c erratum on some QorIQ SoCs.

Signed-off-by: Chris Packham 
Acked-by: Rob Herring 
---

Notes:
Changes in v3:
- Add Ack from Rob

 Documentation/devicetree/bindings/i2c/i2c-mpc.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml 
b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
index 7b553d559c83..98c6fcf7bf26 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
+++ b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
@@ -46,6 +46,13 @@ properties:
 description: |
   I2C bus timeout in microseconds
 
+  fsl,i2c-erratum-a004447:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicates the presence of QorIQ erratum A-004447, which
+  says that the standard i2c recovery scheme mechanism does
+  not work and an alternate implementation is needed.
+
 required:
   - compatible
   - reg
-- 
2.31.1



[PATCH v3 3/4] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P1010 i2c controllers

2021-05-11 Thread Chris Packham
The i2c controllers on the P1010 have an erratum where the documented
scheme for i2c bus recovery will not work (A-004447). A different
mechanism is needed which is documented in the P1010 Chip Errata Rev L.

Signed-off-by: Chris Packham 
---

Notes:
Changes in v3:
- New

 arch/powerpc/boot/dts/fsl/p1010si-post.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
index c2717f31925a..ccda0a91abf0 100644
--- a/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p1010si-post.dtsi
@@ -122,7 +122,15 @@ memory-controller@2000 {
};
 
 /include/ "pq3-i2c-0.dtsi"
+   i2c@3000 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "pq3-i2c-1.dtsi"
+   i2c@3100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "pq3-duart-0.dtsi"
 /include/ "pq3-espi-0.dtsi"
spi0: spi@7000 {
-- 
2.31.1



[PATCH v3 2/4] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-11 Thread Chris Packham
The i2c controllers on the P2040/P2041 have an erratum where the
documented scheme for i2c bus recovery will not work (A-004447). A
different mechanism is needed which is documented in the P2040 Chip
Errata Rev Q (latest available at the time of writing).

Signed-off-by: Chris Packham 
---
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
index 872e4485dc3f..ddc018d42252 100644
--- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
@@ -371,7 +371,23 @@ sdhc@114000 {
};
 
 /include/ "qoriq-i2c-0.dtsi"
+   i2c@118000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@118100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-i2c-1.dtsi"
+   i2c@119000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@119100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-duart-0.dtsi"
 /include/ "qoriq-duart-1.dtsi"
 /include/ "qoriq-gpio-0.dtsi"
-- 
2.31.1



[PATCH v3 0/4] P2040/P2041 i2c recovery erratum

2021-05-11 Thread Chris Packham
The P2040/P2041 has an erratum where the i2c recovery scheme
documented in the reference manual (and currently implemented
in the i2c-mpc.c driver) does not work. The errata document
provides an alternative that does work. This series implements
that alternative and uses a property in the devicetree to
decide when the alternative mechanism is needed.

Chris Packham (4):
  dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
  powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
controllers
  powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P1010 i2c
controllers
  i2c: mpc: implement erratum A-004447 workaround

 .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
 arch/powerpc/boot/dts/fsl/p1010si-post.dtsi   |  8 ++
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
 drivers/i2c/busses/i2c-mpc.c  | 81 ++-
 4 files changed, 110 insertions(+), 2 deletions(-)

-- 
2.31.1



[PATCH v3 4/4] i2c: mpc: implement erratum A-004447 workaround

2021-05-11 Thread Chris Packham
The P2040/P2041 has an erratum where the normal i2c recovery mechanism
does not work. Implement the alternative recovery mechanism documented
in the P2040 Chip Errata Rev Q.

Signed-off-by: Chris Packham 
---

Notes:
Changes in v3:
- Further code reduction in i2c_mpc_wait_sr()
Changes in v2:
- Use readb_poll_timeout instead of open-coded loop

 drivers/i2c/busses/i2c-mpc.c | 81 +++-
 1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 30d9e89a3db2..dcca9c2396db 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -45,6 +46,7 @@
 #define CCR_MTX  0x10
 #define CCR_TXAK 0x08
 #define CCR_RSTA 0x04
+#define CCR_RSVD 0x02
 
 #define CSR_MCF  0x80
 #define CSR_MAAS 0x40
@@ -97,7 +99,7 @@ struct mpc_i2c {
u32 block;
int rc;
int expect_rxack;
-
+   bool has_errata_A004447;
 };
 
 struct mpc_i2c_divider {
@@ -136,6 +138,75 @@ static void mpc_i2c_fixup(struct mpc_i2c *i2c)
}
 }
 
+static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
+{
+   void __iomem *addr = i2c->base + MPC_I2C_SR;
+   u8 val;
+
+   return readb_poll_timeout(addr, val, val & mask, 0, 100);
+}
+
+/*
+ * Workaround for Erratum A004447. From the P2040CE Rev Q
+ *
+ * 1.  Set up the frequency divider and sampling rate.
+ * 2.  I2CCR - a0h
+ * 3.  Poll for I2CSR[MBB] to get set.
+ * 4.  If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to
+ * step 5. If MAL is not set, then go to step 13.
+ * 5.  I2CCR - 00h
+ * 6.  I2CCR - 22h
+ * 7.  I2CCR - a2h
+ * 8.  Poll for I2CSR[MBB] to get set.
+ * 9.  Issue read to I2CDR.
+ * 10. Poll for I2CSR[MIF] to be set.
+ * 11. I2CCR - 82h
+ * 12. Workaround complete. Skip the next steps.
+ * 13. Issue read to I2CDR.
+ * 14. Poll for I2CSR[MIF] to be set.
+ * 15. I2CCR - 80h
+ */
+static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c)
+{
+   int ret;
+   u32 val;
+
+   writeccr(i2c, CCR_MEN | CCR_MSTA);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+
+   val = readb(i2c->base + MPC_I2C_SR);
+
+   if (val & CSR_MAL) {
+   writeccr(i2c, 0x00);
+   writeccr(i2c, CCR_MSTA | CCR_RSVD);
+   writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN | CCR_RSVD);
+   } else {
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN);
+   }
+}
+
 #if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x)
 static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = {
{20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23},
@@ -670,7 +741,10 @@ static int fsl_i2c_bus_recovery(struct i2c_adapter *adap)
 {
struct mpc_i2c *i2c = i2c_get_adapdata(adap);
 
-   mpc_i2c_fixup(i2c);
+   if (i2c->has_errata_A004447)
+   mpc_i2c_fixup_A004447(i2c);
+   else
+   mpc_i2c_fixup(i2c);
 
return 0;
 }
@@ -767,6 +841,9 @@ static int fsl_i2c_probe(struct platform_device *op)
}
dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 100 / HZ);
 
+   if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447"))
+   i2c->has_errata_A004447 = true;
+
i2c->adap = mpc_ops;
scnprintf(i2c->adap.name, sizeof(i2c->adap.name),
  "MPC adapter (%s)", of_node_full_name(op->dev.of_node));
-- 
2.31.1



Re: [PATCH v2 2/3] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-09 Thread Chris Packham

On 7/05/21 8:24 pm, Joakim Tjernlund wrote:
> On Fri, 2021-05-07 at 10:04 +0200, Joakim Tjernlund wrote:
>> On Fri, 2021-05-07 at 12:40 +1200, Chris Packham wrote:
>>> The i2c controllers on the P2040/P2041 have an erratum where the
>>> documented scheme for i2c bus recovery will not work (A-004447). A
>>> different mechanism is needed which is documented in the P2040 Chip
>>> Errata Rev Q (latest available at the time of writing).
>>  From what I can tell this Erratum also applies to P1010
Will add for v3.
>>   Jocke
> Reference: https://media.digikey.com/pdf/PCNs/Freescale/P1010CE_RevL.pdf
>
> Also, I think this series should go to stable.
This series builds on changes that have been merged for v5.13. I haven't 
checked if it applies to stable, I think at least commit 65171b2df15e 
("i2c: mpc: Make use of i2c_recover_bus()") would need to come along 
with it.

Re: [PATCH v2 1/3] dt-bindings: i2c: mpc: Add fsl, i2c-erratum-a004447 flag

2021-05-09 Thread Chris Packham

On 8/05/21 9:49 am, Rob Herring wrote:
> On Fri, May 07, 2021 at 12:40:45PM +1200, Chris Packham wrote:
>> Document the fsl,i2c-erratum-a004447 flag which indicates the presence
>> of an i2c erratum on some QorIQ SoCs.
>>
>> Signed-off-by: Chris Packham 
>> ---
>>   Documentation/devicetree/bindings/i2c/i2c-mpc.yaml | 7 +++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml 
>> b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
>> index 7b553d559c83..98c6fcf7bf26 100644
>> --- a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
>> +++ b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
>> @@ -46,6 +46,13 @@ properties:
>>   description: |
>> I2C bus timeout in microseconds
>>   
>> +  fsl,i2c-erratum-a004447:
>> +$ref: /schemas/types.yaml#/definitions/flag
>> +description: |
>> +  Indicates the presence of QorIQ erratum A-004447, which
>> +  says that the standard i2c recovery scheme mechanism does
>> +  not work and an alternate implementation is needed.
> The problem with adding a property for an errata is you have to update
> the dtb. If you use the compatible string, then only an OS update is
> needed. That assumes you have specific enough compatible strings.

I was following the style of the existing fsl,usb-erratum-a007792 or 
fsl,erratum-a008585 properties. But that's not really a compelling reason.

The existing compatible string is "fsl-i2c" and it's used by pretty much 
every powerpc QorIQ SoC. There are some specific compatible strings in 
the driver for some of the older mpc SoCs. A more specific compatible 
string will work although determining which ones are affected might be a 
bit troublesome. That we know of the P2041 and P1010 are affected but I 
suspect there may be more. One disadvantage of using the compatible 
string is that as affected SoCs are identified we'll have to update the 
driver to know that SoC is affected and update the dtb to use it. With 
the property we'd just have to update the dtb.

I'm not too fussed either way so if that's a hard NACK on the property I 
can send a version that uses compatible strings instead.


[PATCH v2 3/3] i2c: mpc: implement erratum A-004447 workaround

2021-05-06 Thread Chris Packham
The P2040/P2041 has an erratum where the normal i2c recovery mechanism
does not work. Implement the alternative recovery mechanism documented
in the P2040 Chip Errata Rev Q.

Signed-off-by: Chris Packham 
---

Notes:
Changes in v2:
- Use readb_poll_timeout instead of open-coded loop

 drivers/i2c/busses/i2c-mpc.c | 84 +++-
 1 file changed, 82 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 30d9e89a3db2..44e88a13a9e3 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -45,6 +46,7 @@
 #define CCR_MTX  0x10
 #define CCR_TXAK 0x08
 #define CCR_RSTA 0x04
+#define CCR_RSVD 0x02
 
 #define CSR_MCF  0x80
 #define CSR_MAAS 0x40
@@ -97,7 +99,7 @@ struct mpc_i2c {
u32 block;
int rc;
int expect_rxack;
-
+   bool has_errata_A004447;
 };
 
 struct mpc_i2c_divider {
@@ -136,6 +138,78 @@ static void mpc_i2c_fixup(struct mpc_i2c *i2c)
}
 }
 
+static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
+{
+   int ret;
+   u8 val;
+
+   ret = readb_poll_timeout(i2c->base + MPC_I2C_SR, val,
+val & mask, 0, 100);
+
+   return ret;
+}
+
+/*
+ * Workaround for Erratum A004447. From the P2040CE Rev Q
+ *
+ * 1.  Set up the frequency divider and sampling rate.
+ * 2.  I2CCR - a0h
+ * 3.  Poll for I2CSR[MBB] to get set.
+ * 4.  If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to
+ * step 5. If MAL is not set, then go to step 13.
+ * 5.  I2CCR - 00h
+ * 6.  I2CCR - 22h
+ * 7.  I2CCR - a2h
+ * 8.  Poll for I2CSR[MBB] to get set.
+ * 9.  Issue read to I2CDR.
+ * 10. Poll for I2CSR[MIF] to be set.
+ * 11. I2CCR - 82h
+ * 12. Workaround complete. Skip the next steps.
+ * 13. Issue read to I2CDR.
+ * 14. Poll for I2CSR[MIF] to be set.
+ * 15. I2CCR - 80h
+ */
+static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c)
+{
+   int ret;
+   u32 val;
+
+   writeccr(i2c, CCR_MEN | CCR_MSTA);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+
+   val = readb(i2c->base + MPC_I2C_SR);
+
+   if (val & CSR_MAL) {
+   writeccr(i2c, 0x00);
+   writeccr(i2c, CCR_MSTA | CCR_RSVD);
+   writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN | CCR_RSVD);
+   } else {
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN);
+   }
+}
+
 #if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x)
 static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = {
{20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23},
@@ -670,7 +744,10 @@ static int fsl_i2c_bus_recovery(struct i2c_adapter *adap)
 {
struct mpc_i2c *i2c = i2c_get_adapdata(adap);
 
-   mpc_i2c_fixup(i2c);
+   if (i2c->has_errata_A004447)
+   mpc_i2c_fixup_A004447(i2c);
+   else
+   mpc_i2c_fixup(i2c);
 
return 0;
 }
@@ -767,6 +844,9 @@ static int fsl_i2c_probe(struct platform_device *op)
}
dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 100 / HZ);
 
+   if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447"))
+   i2c->has_errata_A004447 = true;
+
i2c->adap = mpc_ops;
scnprintf(i2c->adap.name, sizeof(i2c->adap.name),
  "MPC adapter (%s)", of_node_full_name(op->dev.of_node));
-- 
2.31.1



[PATCH v2 1/3] dt-bindings: i2c: mpc: Add fsl, i2c-erratum-a004447 flag

2021-05-06 Thread Chris Packham
Document the fsl,i2c-erratum-a004447 flag which indicates the presence
of an i2c erratum on some QorIQ SoCs.

Signed-off-by: Chris Packham 
---
 Documentation/devicetree/bindings/i2c/i2c-mpc.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml 
b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
index 7b553d559c83..98c6fcf7bf26 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
+++ b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
@@ -46,6 +46,13 @@ properties:
 description: |
   I2C bus timeout in microseconds
 
+  fsl,i2c-erratum-a004447:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicates the presence of QorIQ erratum A-004447, which
+  says that the standard i2c recovery scheme mechanism does
+  not work and an alternate implementation is needed.
+
 required:
   - compatible
   - reg
-- 
2.31.1



[PATCH v2 0/3] P2040/P2041 i2c recovery erratum

2021-05-06 Thread Chris Packham
The P2040/P2041 has an erratum where the i2c recovery scheme
documented in the reference manual (and currently implemented
in the i2c-mpc.c driver) does not work. The errata document
provides an alternative that does work. This series implements
that alternative and uses a property in the devicetree to
decide when the alternative mechanism is needed.

Chris Packham (3):
  dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
  powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
controllers
  i2c: mpc: implement erratum A-004447 workaround

 .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
 drivers/i2c/busses/i2c-mpc.c  | 84 ++-
 3 files changed, 105 insertions(+), 2 deletions(-)

-- 
2.31.1



[PATCH v2 2/3] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-06 Thread Chris Packham
The i2c controllers on the P2040/P2041 have an erratum where the
documented scheme for i2c bus recovery will not work (A-004447). A
different mechanism is needed which is documented in the P2040 Chip
Errata Rev Q (latest available at the time of writing).

Signed-off-by: Chris Packham 
---
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
index 872e4485dc3f..ddc018d42252 100644
--- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
@@ -371,7 +371,23 @@ sdhc@114000 {
};
 
 /include/ "qoriq-i2c-0.dtsi"
+   i2c@118000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@118100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-i2c-1.dtsi"
+   i2c@119000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@119100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-duart-0.dtsi"
 /include/ "qoriq-duart-1.dtsi"
 /include/ "qoriq-gpio-0.dtsi"
-- 
2.31.1



Re: [PATCH 3/3] i2c: mpc: implement erratum A-004447 workaround

2021-05-06 Thread Chris Packham

On 6/05/21 8:03 pm, Andy Shevchenko wrote:


On Thursday, May 6, 2021, Chris Packham 
mailto:chris.pack...@alliedtelesis.co.nz>> 
wrote:
The P2040/P2041 has an erratum where the normal i2c recovery mechanism
does not work. Implement the alternative recovery mechanism documented
in the P2040 Chip Errata Rev Q.

Signed-off-by: Chris Packham 
mailto:chris.pack...@alliedtelesis.co.nz>>
---
 drivers/i2c/busses/i2c-mpc.c | 88 +++-
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 30d9e89a3db2..052e37718771 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -45,6 +45,7 @@
 #define CCR_MTX  0x10
 #define CCR_TXAK 0x08
 #define CCR_RSTA 0x04
+#define CCR_RSVD 0x02

 #define CSR_MCF  0x80
 #define CSR_MAAS 0x40
@@ -97,7 +98,7 @@ struct mpc_i2c {
u32 block;
int rc;
int expect_rxack;
-
+   bool has_errata_A004447;
 };

 struct mpc_i2c_divider {
@@ -136,6 +137,83 @@ static void mpc_i2c_fixup(struct mpc_i2c *i2c)
}
 }

+static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
+{
+   unsigned long timeout = jiffies + usecs_to_jiffies(100);
+   int ret = 0;
+
+   while ((readb(i2c->base + MPC_I2C_SR) & mask) == 0) {
+   if (time_after(jiffies, timeout)) {
+   ret = -ETIMEDOUT;
+   break;
+   }
+   cond_resched();
+   }
+
+   return ret;
+}

readb_poll_timeout()


Thanks. I figured this existed I was just grepping for wait_.* and didn't find 
it. I'll prepare a v2 and get it out by the end of my day.

+
+/*
+ * Workaround for Erratum A004447. From the P2040CE Rev Q
+ *
+ * 1.  Set up the frequency divider and sampling rate.
+ * 2.  I2CCR - a0h
+ * 3.  Poll for I2CSR[MBB] to get set.
+ * 4.  If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to
+ * step 5. If MAL is not set, then go to step 13.
+ * 5.  I2CCR - 00h
+ * 6.  I2CCR - 22h
+ * 7.  I2CCR - a2h
+ * 8.  Poll for I2CSR[MBB] to get set.
+ * 9.  Issue read to I2CDR.
+ * 10. Poll for I2CSR[MIF] to be set.
+ * 11. I2CCR - 82h
+ * 12. Workaround complete. Skip the next steps.
+ * 13. Issue read to I2CDR.
+ * 14. Poll for I2CSR[MIF] to be set.
+ * 15. I2CCR - 80h
+ */
+static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c)
+{
+   int ret;
+   u32 val;
+
+   writeccr(i2c, CCR_MEN | CCR_MSTA);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+
+   val = readb(i2c->base + MPC_I2C_SR);
+
+   if (val & CSR_MAL) {
+   writeccr(i2c, 0x00);
+   writeccr(i2c, CCR_MSTA | CCR_RSVD);
+   writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN | CCR_RSVD);
+   } else {
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN);
+   }
+}
+
 #if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x)
 static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = {
{20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23},
@@ -670,7 +748,10 @@ static int fsl_i2c_bus_recovery(struct i2c_adapter *adap)
 {
struct mpc_i2c *i2c = i2c_get_adapdata(adap);

-   mpc_i2c_fixup(i2c);
+   if (i2c->has_errata_A004447)
+   mpc_i2c_fixup_A004447(i2c);
+   else
+   mpc_i2c_fixup(i2c);

return 0;
 }
@@ -767,6 +848,9 @@ static int fsl_i2c_probe(struct platform_device *op)
}
dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 100 / HZ);

+   if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447"))
+   i2c->has_errata_A004447 = true;
+
i2c->adap = mpc_ops;
scnprintf(i2c->adap.name<http://adap.name>, 
sizeof(i2c->adap.name<http://adap.name>),
  "MPC adapter (%s)", of_node_full_name(op->dev.of_node));
--
2.31.1



--
With Best Regards,
Andy Shevchenko




[PATCH 2/3] powerpc/fsl: set fsl, i2c-erratum-a004447 flag for P2041 i2c controllers

2021-05-05 Thread Chris Packham
The i2c controllers on the P2040/P2041 have an erratum where the
documented scheme for i2c bus recovery will not work (A-004447). A
different mechanism is needed which is documented in the P2040 Chip
Errata Rev Q (latest available at the time of writing).

Signed-off-by: Chris Packham 
---
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi 
b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
index 872e4485dc3f..ddc018d42252 100644
--- a/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/p2041si-post.dtsi
@@ -371,7 +371,23 @@ sdhc@114000 {
};
 
 /include/ "qoriq-i2c-0.dtsi"
+   i2c@118000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@118100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-i2c-1.dtsi"
+   i2c@119000 {
+   fsl,i2c-erratum-a004447;
+   };
+
+   i2c@119100 {
+   fsl,i2c-erratum-a004447;
+   };
+
 /include/ "qoriq-duart-0.dtsi"
 /include/ "qoriq-duart-1.dtsi"
 /include/ "qoriq-gpio-0.dtsi"
-- 
2.31.1



[PATCH 3/3] i2c: mpc: implement erratum A-004447 workaround

2021-05-05 Thread Chris Packham
The P2040/P2041 has an erratum where the normal i2c recovery mechanism
does not work. Implement the alternative recovery mechanism documented
in the P2040 Chip Errata Rev Q.

Signed-off-by: Chris Packham 
---
 drivers/i2c/busses/i2c-mpc.c | 88 +++-
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 30d9e89a3db2..052e37718771 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -45,6 +45,7 @@
 #define CCR_MTX  0x10
 #define CCR_TXAK 0x08
 #define CCR_RSTA 0x04
+#define CCR_RSVD 0x02
 
 #define CSR_MCF  0x80
 #define CSR_MAAS 0x40
@@ -97,7 +98,7 @@ struct mpc_i2c {
u32 block;
int rc;
int expect_rxack;
-
+   bool has_errata_A004447;
 };
 
 struct mpc_i2c_divider {
@@ -136,6 +137,83 @@ static void mpc_i2c_fixup(struct mpc_i2c *i2c)
}
 }
 
+static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
+{
+   unsigned long timeout = jiffies + usecs_to_jiffies(100);
+   int ret = 0;
+
+   while ((readb(i2c->base + MPC_I2C_SR) & mask) == 0) {
+   if (time_after(jiffies, timeout)) {
+   ret = -ETIMEDOUT;
+   break;
+   }
+   cond_resched();
+   }
+
+   return ret;
+}
+
+/*
+ * Workaround for Erratum A004447. From the P2040CE Rev Q
+ *
+ * 1.  Set up the frequency divider and sampling rate.
+ * 2.  I2CCR - a0h
+ * 3.  Poll for I2CSR[MBB] to get set.
+ * 4.  If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to
+ * step 5. If MAL is not set, then go to step 13.
+ * 5.  I2CCR - 00h
+ * 6.  I2CCR - 22h
+ * 7.  I2CCR - a2h
+ * 8.  Poll for I2CSR[MBB] to get set.
+ * 9.  Issue read to I2CDR.
+ * 10. Poll for I2CSR[MIF] to be set.
+ * 11. I2CCR - 82h
+ * 12. Workaround complete. Skip the next steps.
+ * 13. Issue read to I2CDR.
+ * 14. Poll for I2CSR[MIF] to be set.
+ * 15. I2CCR - 80h
+ */
+static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c)
+{
+   int ret;
+   u32 val;
+
+   writeccr(i2c, CCR_MEN | CCR_MSTA);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+
+   val = readb(i2c->base + MPC_I2C_SR);
+
+   if (val & CSR_MAL) {
+   writeccr(i2c, 0x00);
+   writeccr(i2c, CCR_MSTA | CCR_RSVD);
+   writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+   return;
+   }
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN | CCR_RSVD);
+   } else {
+   val = readb(i2c->base + MPC_I2C_DR);
+   ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+   if (ret) {
+   dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+   return;
+   }
+   writeccr(i2c, CCR_MEN);
+   }
+}
+
 #if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x)
 static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = {
{20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23},
@@ -670,7 +748,10 @@ static int fsl_i2c_bus_recovery(struct i2c_adapter *adap)
 {
struct mpc_i2c *i2c = i2c_get_adapdata(adap);
 
-   mpc_i2c_fixup(i2c);
+   if (i2c->has_errata_A004447)
+   mpc_i2c_fixup_A004447(i2c);
+   else
+   mpc_i2c_fixup(i2c);
 
return 0;
 }
@@ -767,6 +848,9 @@ static int fsl_i2c_probe(struct platform_device *op)
}
dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 100 / HZ);
 
+   if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447"))
+   i2c->has_errata_A004447 = true;
+
i2c->adap = mpc_ops;
scnprintf(i2c->adap.name, sizeof(i2c->adap.name),
  "MPC adapter (%s)", of_node_full_name(op->dev.of_node));
-- 
2.31.1



[PATCH 0/3] P2040/P2041 i2c recovery erratum

2021-05-05 Thread Chris Packham
The P2040/P2041 has an erratum where the i2c recovery scheme
documented in the reference manual (and currently implemented
in the i2c-mpc.c driver) does not work. The errata document
provides an alternative that does work. This series implements
that alternative and uses a property in the devicetree to
decide when the alternative mechanism is needed.

Chris Packham (3):
  dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag
  powerpc/fsl: set fsl,i2c-erratum-a004447 flag for P2041 i2c
controllers
  i2c: mpc: implement erratum A-004447 workaround

 .../devicetree/bindings/i2c/i2c-mpc.yaml  |  7 ++
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi   | 16 
 drivers/i2c/busses/i2c-mpc.c  | 88 ++-
 3 files changed, 109 insertions(+), 2 deletions(-)

-- 
2.31.1



[PATCH 1/3] dt-bindings: i2c: mpc: Add fsl,i2c-erratum-a004447 flag

2021-05-05 Thread Chris Packham
Document the fsl,i2c-erratum-a004447 flag which indicates the presence
of an i2c erratum on some QorIQ SoCs.

Signed-off-by: Chris Packham 
---
 Documentation/devicetree/bindings/i2c/i2c-mpc.yaml | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml 
b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
index 7b553d559c83..98c6fcf7bf26 100644
--- a/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
+++ b/Documentation/devicetree/bindings/i2c/i2c-mpc.yaml
@@ -46,6 +46,13 @@ properties:
 description: |
   I2C bus timeout in microseconds
 
+  fsl,i2c-erratum-a004447:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Indicates the presence of QorIQ erratum A-004447, which
+  says that the standard i2c recovery scheme mechanism does
+  not work and an alternate implementation is needed.
+
 required:
   - compatible
   - reg
-- 
2.31.1



Re: [PATCH v2 2/2] powerpc/legacy_serial: Use early_ioremap()

2021-04-20 Thread Chris Packham
Hi Christophe,

On 21/04/21 1:32 am, Christophe Leroy wrote:
> From: Christophe Leroy 
>
> [0.00] ioremap() called early from 
> find_legacy_serial_ports+0x3cc/0x474. Use early_ioremap() instead
>
> find_legacy_serial_ports() is called early from setup_arch(), before
> paging_init(). vmalloc is not available yet, ioremap shouldn't be
> used that early.
>
> Use early_ioremap() and switch to a regular ioremap() later.
>
> Signed-off-by: Christophe Leroy 
> Signed-off-by: Christophe Leroy 
> ---

I gave it a spin on my T2080RDB. The error message is gone and I still 
get console output.

Tested-by: Chris Packham 

>   arch/powerpc/kernel/legacy_serial.c | 33 +
>   1 file changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/legacy_serial.c 
> b/arch/powerpc/kernel/legacy_serial.c
> index f061e06e9f51..8b2c1a8553a0 100644
> --- a/arch/powerpc/kernel/legacy_serial.c
> +++ b/arch/powerpc/kernel/legacy_serial.c
> @@ -15,6 +15,7 @@
>   #include 
>   #include 
>   #include 
> +#include 
>   
>   #undef DEBUG
>   
> @@ -34,6 +35,7 @@ static struct legacy_serial_info {
>   unsigned intclock;
>   int irq_check_parent;
>   phys_addr_t taddr;
> + void __iomem*early_addr;
>   } legacy_serial_infos[MAX_LEGACY_SERIAL_PORTS];
>   
>   static const struct of_device_id legacy_serial_parents[] __initconst = {
> @@ -325,17 +327,16 @@ static void __init setup_legacy_serial_console(int 
> console)
>   {
>   struct legacy_serial_info *info = _serial_infos[console];
>   struct plat_serial8250_port *port = _serial_ports[console];
> - void __iomem *addr;
>   unsigned int stride;
>   
>   stride = 1 << port->regshift;
>   
>   /* Check if a translated MMIO address has been found */
>   if (info->taddr) {
> - addr = ioremap(info->taddr, 0x1000);
> - if (addr == NULL)
> + info->early_addr = early_ioremap(info->taddr, 0x1000);
> + if (info->early_addr == NULL)
>   return;
> - udbg_uart_init_mmio(addr, stride);
> + udbg_uart_init_mmio(info->early_addr, stride);
>   } else {
>   /* Check if it's PIO and we support untranslated PIO */
>   if (port->iotype == UPIO_PORT && isa_io_special)
> @@ -353,6 +354,30 @@ static void __init setup_legacy_serial_console(int 
> console)
>   udbg_uart_setup(info->speed, info->clock);
>   }
>   
> +static int __init ioremap_legacy_serial_console(void)
> +{
> + struct legacy_serial_info *info = 
> _serial_infos[legacy_serial_console];
> + struct plat_serial8250_port *port = 
> _serial_ports[legacy_serial_console];
> + void __iomem *vaddr;
> +
> + if (legacy_serial_console < 0)
> + return 0;
> +
> + if (!info->early_addr)
> + return 0;
> +
> + vaddr = ioremap(info->taddr, 0x1000);
> + if (WARN_ON(!vaddr))
> + return -ENOMEM;
> +
> + udbg_uart_init_mmio(vaddr, 1 << port->regshift);
> + early_iounmap(info->early_addr, 0x1000);
> + info->early_addr = NULL;
> +
> + return 0;
> +}
> +early_initcall(ioremap_legacy_serial_console);
> +
>   /*
>* This is called very early, as part of setup_system() or eventually
>* setup_arch(), basically before anything else in this file. This function

Re: Errant readings on LM81 with T2080 SoC

2021-03-17 Thread Chris Packham


On 12/03/21 10:34 am, Guenter Roeck wrote:
> On 3/11/21 1:17 PM, Chris Packham wrote:
>> On 11/03/21 9:18 pm, Wolfram Sang wrote:
>>>> Bummer. What is really weird is that you see clock stretching under
>>>> CPU load. Normally clock stretching is triggered by the device, not
>>>> by the host.
>>> One example: Some hosts need an interrupt per byte to know if they
>>> should send ACK or NACK. If that interrupt is delayed, they stretch the
>>> clock.
>>>
>> It feels like something like that is happening. Looking at the T2080
>> Reference manual there is an interesting timing diagram (Figure 14-2 if
>> someone feels like looking it up). It shows SCL low between the ACK for
>> the address and the data byte. I think if we're delayed in sending the
>> next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.
>>
> I think that really leaves you only two options that I can see:
> Rework the driver to handle critical actions (such as setting TXAK,
> and everything else that might result in clock stretching) in the
> interrupt handler, or rework the driver to handle everything in
> a high priority kernel thread.
I've made some reasonable progress on making i2c-mpc more interrupt 
driven. Assuming it works out for my use-case is there an opinion on 
making interrupt support mandatory? Looking at all the in-tree dts files 
that use one of the compatible strings from i2c-mpc.c they all have 
interrupt properties so in theory nothing is using the polling mode. But 
there may be some out-of-tree boards or boards using an old dtb that 
would be affected?

Re: Errant readings on LM81 with T2080 SoC

2021-03-14 Thread Chris Packham
On 12/03/21 10:25 pm, David Laight wrote:
> From: Linuxppc-dev Guenter Roeck
>> Sent: 11 March 2021 21:35
>>
>> On 3/11/21 1:17 PM, Chris Packham wrote:
>>> On 11/03/21 9:18 pm, Wolfram Sang wrote:
>>>>> Bummer. What is really weird is that you see clock stretching under
>>>>> CPU load. Normally clock stretching is triggered by the device, not
>>>>> by the host.
>>>> One example: Some hosts need an interrupt per byte to know if they
>>>> should send ACK or NACK. If that interrupt is delayed, they stretch the
>>>> clock.
>>>>
>>> It feels like something like that is happening. Looking at the T2080
>>> Reference manual there is an interesting timing diagram (Figure 14-2 if
>>> someone feels like looking it up). It shows SCL low between the ACK for
>>> the address and the data byte. I think if we're delayed in sending the
>>> next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.
>>>
>> I think that really leaves you only two options that I can see:
>> Rework the driver to handle critical actions (such as setting TXAK,
>> and everything else that might result in clock stretching) in the
>> interrupt handler, or rework the driver to handle everything in
>> a high priority kernel thread.
> I'm not sure a high priority kernel thread will help.
> Without CONFIG_PREEMPT (which has its own set of nasties)
> a RT process won't be scheduled until the processor it last
> ran on does a reschedule.
> I don't think a kernel thread will be any different from a
> user process running under the RT scheduler.
>
> I'm trying to remember the smbus spec (without remembering the I2C one).
For those following along the spec is available here[0]. I know there's 
a 3.0 version[1] as well but the devices I'm dealing with are from a 2.0 
vintage.
> While basically a clock+data bit-bang the slave is allowed to drive
> the clock low to extend a cycle.
> It may be allowed to do this at any point?
 From what I can see it's actually the master extending the clock. Or 
more accurately holding it low between the address and data bytes (which 
from the T2080 reference manual looks expected). I think this may cause 
a strictly compliant SMBUS device to determine that Tlow:mext has been 
violated.
> The master can generate the data at almost any rate (below the maximum)
> but I don't think it can go down to zero.
> But I do remember one of the specs having a timeout.
>
> But I'd have thought the slave should answer the cycle correctly
> regardless of any 'random' delays the master adds in.
Probably depends on the device implementation. I've got multiple other 
I2C/SMBUS devices and the LM81 seems to be the one that objects.
> Unless you are getting away with de-asserting chipselect?
>
> The only implementation I've done is one an FPGA so doesn't have
> worry about interrupt latencies.
> It doesn't actually support clock stretching; it wasn't in the
> code I started from and none of the slaves we need to connect to
> ever does it.
>
>   David

[0] - http://www.smbus.org/specs/smbus20.pdf
[1] - https://pmbus.org/Assets/PDFS/Public/SMBus_3_0_20141220.pdf

>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
>

Re: Errant readings on LM81 with T2080 SoC

2021-03-11 Thread Chris Packham


On 12/03/21 1:07 pm, Guenter Roeck wrote:
> On 3/11/21 3:47 PM, Chris Packham wrote:
>> On 12/03/21 10:34 am, Guenter Roeck wrote:
>>> On 3/11/21 1:17 PM, Chris Packham wrote:
>>>> On 11/03/21 9:18 pm, Wolfram Sang wrote:
>>>>>> Bummer. What is really weird is that you see clock stretching under
>>>>>> CPU load. Normally clock stretching is triggered by the device, not
>>>>>> by the host.
>>>>> One example: Some hosts need an interrupt per byte to know if they
>>>>> should send ACK or NACK. If that interrupt is delayed, they stretch the
>>>>> clock.
>>>>>
>>>> It feels like something like that is happening. Looking at the T2080
>>>> Reference manual there is an interesting timing diagram (Figure 14-2 if
>>>> someone feels like looking it up). It shows SCL low between the ACK for
>>>> the address and the data byte. I think if we're delayed in sending the
>>>> next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.
>>>>
>>> I think that really leaves you only two options that I can see:
>>> Rework the driver to handle critical actions (such as setting TXAK,
>>> and everything else that might result in clock stretching) in the
>>> interrupt handler, or rework the driver to handle everything in
>>> a high priority kernel thread.
>> One thing I've found that does seem to avoid the problem is to disable
>> preemption, use polling and replace the schedule() in i2c_wait() with
>> udelay(50). That's kind of like the kernel thread option.
> It is kind of hackish, though, especially since it makes the "loaded system"
> situation even worse by adding even more active wait loops.
No -ish about it :). But it might put out one fire for me while I'm 
looking at doing some kind of interrupt driven state machine.

Re: Errant readings on LM81 with T2080 SoC

2021-03-11 Thread Chris Packham


On 12/03/21 10:34 am, Guenter Roeck wrote:
> On 3/11/21 1:17 PM, Chris Packham wrote:
>> On 11/03/21 9:18 pm, Wolfram Sang wrote:
>>>> Bummer. What is really weird is that you see clock stretching under
>>>> CPU load. Normally clock stretching is triggered by the device, not
>>>> by the host.
>>> One example: Some hosts need an interrupt per byte to know if they
>>> should send ACK or NACK. If that interrupt is delayed, they stretch the
>>> clock.
>>>
>> It feels like something like that is happening. Looking at the T2080
>> Reference manual there is an interesting timing diagram (Figure 14-2 if
>> someone feels like looking it up). It shows SCL low between the ACK for
>> the address and the data byte. I think if we're delayed in sending the
>> next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.
>>
> I think that really leaves you only two options that I can see:
> Rework the driver to handle critical actions (such as setting TXAK,
> and everything else that might result in clock stretching) in the
> interrupt handler, or rework the driver to handle everything in
> a high priority kernel thread.
One thing I've found that does seem to avoid the problem is to disable 
preemption, use polling and replace the schedule() in i2c_wait() with 
udelay(50). That's kind of like the kernel thread option.
> Guenter

Re: Errant readings on LM81 with T2080 SoC

2021-03-11 Thread Chris Packham


On 11/03/21 9:18 pm, Wolfram Sang wrote:
>> Bummer. What is really weird is that you see clock stretching under
>> CPU load. Normally clock stretching is triggered by the device, not
>> by the host.
> One example: Some hosts need an interrupt per byte to know if they
> should send ACK or NACK. If that interrupt is delayed, they stretch the
> clock.
>
It feels like something like that is happening. Looking at the T2080 
Reference manual there is an interesting timing diagram (Figure 14-2 if 
someone feels like looking it up). It shows SCL low between the ACK for 
the address and the data byte. I think if we're delayed in sending the 
next byte we could violate Ttimeout or Tlow:mext from the SMBUS spec.

Re: Errant readings on LM81 with T2080 SoC

2021-03-10 Thread Chris Packham

On 10/03/21 6:06 pm, Guenter Roeck wrote:
> On 3/9/21 6:19 PM, Chris Packham wrote:
>> On 9/03/21 9:27 am, Chris Packham wrote:
>>> On 8/03/21 5:59 pm, Guenter Roeck wrote:
>>>> Other than that, the only other real idea I have would be to monitor
>>>> the i2c bus.
>>> I am in the fortunate position of being able to go into the office and
>>> even happen to have the expensive scope at the moment. Now I just need
>>> to find a tame HW engineer so I don't burn myself trying to attach the
>>> probes.
>> One thing I see on the scope is that when there is a CPU load there
>> appears to be some clock stretching going on (SCL is held low some
>> times). I don't see it without the CPU load. It's hard to correlate a
>> clock stretching event with a bad read or error but it is one area where
>> the SMBUS spec has a maximum that might cause the device to give up waiting.
>>
> Do you have CONFIG_PREEMPT enabled in your kernel ? But even without
> that it is possible that the hot loops at the beginning and end of
> each operation mess up the driver and cause it to sleep longer
> than intended. Did you try usleep_range() ?

I've been running with and without CONFIG_PREEMPT. The failures happen 
with both.

I did try usleep_range() and still saw failures.

> On a side note, can you send me a register dump for the lm81 ?
> It would be useful for my module test code.

Here you go this is from a largely unconfigured LM81

  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
00: 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 47 
10: 47 81 24 03 94 00 00 00 00 ff ff ff ff ff ff ff G?$??...
20: bf cb c1 00 c0 47 ec 24 ff ff 65 ff 00 ff 00 ff ???.?G?$..e.
30: 00 ff 00 ff 00 ff 00 71 a9 7f 7f ff ff 58 01 04 ...q???..X??
40: 01 08 00 00 00 00 00 50 2f 80 80 01 44 00 00 00 ??.P/???D...
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
90: 00 81 24 03 94 00 00 00 00 ff ff ff ff ff ff ff .?$??...
a0: bf cb c1 00 c0 47 ec 24 ff ff 65 ff 00 ff 00 ff ???.?G?$..e.
b0: 00 ff 00 ff 00 ff 00 71 a9 7f 7f ff ff 58 01 04 ...q???..X??
c0: 01 00 00 00 00 00 00 50 2f 80 80 01 44 00 00 00 ?..P/???D...
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

This is from a LM81 that's been configured by our application SW with 
limits appropriate for the platform.

  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f 0123456789abcdef
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 
10: ff 81 24 03 94 00 00 00 00 ff ff ff ff ff ff ff ..$.
20: bf cc c1 00 c0 47 ec 1c ff ff 65 dc b4 ff c0 d3 .Ge.
30: ad ff 00 d3 ad 4e 40 71 a9 4b 46 ff ff 58 01 04 .n...@q.kf..X..
40: 01 08 00 00 00 00 00 f0 2f 80 80 81 44 80 80 80 /...D...
50: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
60: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
70: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
80: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
90: 80 81 24 03 94 00 00 00 00 ff ff ff ff ff ff ff ..$.
a0: bf cc c1 00 c0 47 ec 1c ff ff 65 dc b4 ff c0 d3 .Ge.
b0: ad ff 00 d3 ad 4e 40 71 a9 4b 46 ff ff 58 01 04 .n...@q.kf..X..
c0: 01 00 00 00 00 00 00 f0 2f 80 80 81 44 80 80 80 /...D...
d0: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
e0: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 
f0: 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 80 


Re: Errant readings on LM81 with T2080 SoC

2021-03-09 Thread Chris Packham
On 9/03/21 9:27 am, Chris Packham wrote:
> On 8/03/21 5:59 pm, Guenter Roeck wrote:
>> Other than that, the only other real idea I have would be to monitor
>> the i2c bus.
> I am in the fortunate position of being able to go into the office and 
> even happen to have the expensive scope at the moment. Now I just need 
> to find a tame HW engineer so I don't burn myself trying to attach the 
> probes.
One thing I see on the scope is that when there is a CPU load there 
appears to be some clock stretching going on (SCL is held low some 
times). I don't see it without the CPU load. It's hard to correlate a 
clock stretching event with a bad read or error but it is one area where 
the SMBUS spec has a maximum that might cause the device to give up waiting.


Re: Errant readings on LM81 with T2080 SoC

2021-03-09 Thread Chris Packham

On 8/03/21 1:31 pm, Guenter Roeck wrote:
> On 3/7/21 2:52 PM, Chris Packham wrote:
>> Fundamentally I think this is a problem with the fact that the LM81 is
>> an SMBus device but the T2080 (and other Freescale SoCs) uses i2c and we
>> emulate SMBus. I suspect the errant readings are when we don't get round
>> to completing the read within the timeout specified by the SMBus
>> specification. Depending on when that happens we either fail the
>> transfer or interpret the result as all-1s.
> That is quite unlikely. Many sensor chips are SMBus chips connected to
> i2c busses. It is much more likely that there is a bug in the T2080 i2c 
> driver,
> that the chip doesn't like the bulk read command issued through regmap, that
> the chip has problems with the i2c bus speed, or that the i2c bus is noisy.
I have noticed that with the switch to regmap we end up using plain i2c 
instead of SMBUS. There appears to be no way of saying use SMBUS 
semantics if the i2c adapter reports I2C_FUNC_I2C.

Re: Errant readings on LM81 with T2080 SoC

2021-03-08 Thread Chris Packham

On 9/03/21 11:10 am, Chris Packham wrote:
>
> On 8/03/21 5:59 pm, Guenter Roeck wrote:
>> On 3/7/21 8:37 PM, Chris Packham wrote:
>> [ ... ]
>>>> That's from -ENXIO which is used in only one place in i2c-mpc.c. I'll
>>>> enable some debug and see what we get.
>>> For the errant readings there was nothing abnormal reported by the 
>>> driver.
>>>
>>> For the "No such device or address" I saw "mpc-i2c ffe119000.i2c: No
>>> RXAK" which matches up with the -ENXIO return.
>>>
>> Id suggest to check the time until not busy and stop in mpc_xfer().
>> Those hot loops are unusual, and may well mess up the code especially
>> if preempt is enabled.
> Reworking those loops seems to have had a positive result. I'll do a 
> bit more testing and hopefully get a patch out later today.
D'oh my "fix" was to replace the cond_reshed() with msleep(10) which did 
"fix" the problem but made every i2c read slow. I didn't notice when 
testing just the lm81 but as soon as I booted the system with more i2c 
devices I saw stupidly slow boot times.
>>   Also, are you using interrupts or polling in
>> your system ? The interrupt handler looks a bit odd, with "Read again
>> to allow register to stabilise".
>>
>> Do you have fsl,timeout set in the devicetree properties and, if so,
>> have you played with it ?
>>
>> Other than that, the only other real idea I have would be to monitor
>> the i2c bus.
>>
>> Guenter

Re: Errant readings on LM81 with T2080 SoC

2021-03-08 Thread Chris Packham

On 8/03/21 5:59 pm, Guenter Roeck wrote:
> On 3/7/21 8:37 PM, Chris Packham wrote:
> [ ... ]
>>> That's from -ENXIO which is used in only one place in i2c-mpc.c. I'll
>>> enable some debug and see what we get.
>> For the errant readings there was nothing abnormal reported by the driver.
>>
>> For the "No such device or address" I saw "mpc-i2c ffe119000.i2c: No
>> RXAK" which matches up with the -ENXIO return.
>>
> Id suggest to check the time until not busy and stop in mpc_xfer().
> Those hot loops are unusual, and may well mess up the code especially
> if preempt is enabled.
Reworking those loops seems to have had a positive result. I'll do a bit 
more testing and hopefully get a patch out later today.
>   Also, are you using interrupts or polling in
> your system ? The interrupt handler looks a bit odd, with "Read again
> to allow register to stabilise".
>
> Do you have fsl,timeout set in the devicetree properties and, if so,
> have you played with it ?
>
> Other than that, the only other real idea I have would be to monitor
> the i2c bus.
>
> Guenter

Re: Errant readings on LM81 with T2080 SoC

2021-03-08 Thread Chris Packham

On 8/03/21 5:59 pm, Guenter Roeck wrote:
> On 3/7/21 8:37 PM, Chris Packham wrote:
> [ ... ]
>>> That's from -ENXIO which is used in only one place in i2c-mpc.c. I'll
>>> enable some debug and see what we get.
>> For the errant readings there was nothing abnormal reported by the driver.
>>
>> For the "No such device or address" I saw "mpc-i2c ffe119000.i2c: No
>> RXAK" which matches up with the -ENXIO return.
>>
> Id suggest to check the time until not busy and stop in mpc_xfer().
> Those hot loops are unusual, and may well mess up the code especially
> if preempt is enabled. Also, are you using interrupts or polling in
> your system ?
I'm using interrupts but I see the same issue if I comment out the 
interrupts in the dtsi file (i.e. force it to use polling).
> The interrupt handler looks a bit odd, with "Read again
> to allow register to stabilise".

Yeah that stuck out to me too. The code in question predates git, I went 
spelunking in history.git and the "Read again" seems to be in the 
initial version[0]. I did try to alter the interrupt handler so that it 
only does one read but that didn't seem to change anything.

> Do you have fsl,timeout set in the devicetree properties and, if so,
> have you played with it ?
Haven't got it set but I'll have a go at tweaking it.
> Other than that, the only other real idea I have would be to monitor
> the i2c bus.
I am in the fortunate position of being able to go into the office and 
even happen to have the expensive scope at the moment. Now I just need 
to find a tame HW engineer so I don't burn myself trying to attach the 
probes.

-- 

[0] - 
https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=11b3235dc04a306f6a9ba14c1ab621b2d54f2c56



Re: Errant readings on LM81 with T2080 SoC

2021-03-07 Thread Chris Packham

On 8/03/21 3:27 pm, Chris Packham wrote:
>
> On 8/03/21 1:31 pm, Guenter Roeck wrote:
>> On 3/7/21 2:52 PM, Chris Packham wrote:
>>> Hi,
>>>
>>> I've got a system using a PowerPC T2080 SoC and among other things has
>>> an LM81 hwmon chip.
>>>
>>> Under a high CPU load we see errant readings from the LM81 as well as
>>> actual failures. It's the errant readings that cause the most concern
>>> since we can easily ignore the read errors in our monitoring 
>>> application
>>> (although it would be better if they weren't there at all).
>>>
>>> I'm able to reproduce this with a test application[0] that artificially
>>> creates a high CPU load then by repeatedly checking for the all-1s
>>> values from the LM81 datasheet[1](page 17). The all-1s readings stick
>>> out as they are obviously higher than the voltage rails that are
>>> connected and disagree with measurements taken with a multimeter.
>>>
>>> Here's the output from my device
>>>
>>> [root@linuxbox ~]# cpuload 90&
>>> [root@linuxbox ~]# (while true; do cat 
>>> /sys/class/hwmon/hwmon0/in*_input
>>> | grep '3320\|4383\|6641\|15930\|3586'; sleep 1; done)&
>>> 3586
>>> 3586
>>> cat: read error: No such device or address
>>> cat: read error: No such device or address
>>> 3320
>>> 3320
>>> 3586
>>> 3586
>>> 6641
>>> 6641
>>> 4383
>>> 4383
>>>
>>> Fundamentally I think this is a problem with the fact that the LM81 is
>>> an SMBus device but the T2080 (and other Freescale SoCs) uses i2c 
>>> and we
>>> emulate SMBus. I suspect the errant readings are when we don't get 
>>> round
>>> to completing the read within the timeout specified by the SMBus
>>> specification. Depending on when that happens we either fail the
>>> transfer or interpret the result as all-1s.
>>>
>> That is quite unlikely. Many sensor chips are SMBus chips connected to
>> i2c busses. It is much more likely that there is a bug in the T2080 
>> i2c driver,
>> that the chip doesn't like the bulk read command issued through 
>> regmap, that
>> the chip has problems with the i2c bus speed, or that the i2c bus is 
>> noisy.
> Perhaps something gets upset when interrupt processing is delayed 
> because of CPU load. I don't see the problem when there isn't a CPU 
> load so I think that eliminates board issues.
>> In this context, the "No such device or address" responses are very 
>> suspicious.
>> Those are reported by the i2c driver, not by the hwmon driver, and 
>> suggest
>> that the chip did not respond to a read request. Maybe it helps to 
>> enable
>> debugging to the i2c driver to see if it reports anything useful. Even
>> better might be to connect an i2c bus analyzer to the i2c bus and check
>> what is going on.
> That's from -ENXIO which is used in only one place in i2c-mpc.c. I'll 
> enable some debug and see what we get.

For the errant readings there was nothing abnormal reported by the driver.

For the "No such device or address" I saw "mpc-i2c ffe119000.i2c: No 
RXAK" which matches up with the -ENXIO return.


Re: Errant readings on LM81 with T2080 SoC

2021-03-07 Thread Chris Packham

On 8/03/21 1:31 pm, Guenter Roeck wrote:
> On 3/7/21 2:52 PM, Chris Packham wrote:
>> Hi,
>>
>> I've got a system using a PowerPC T2080 SoC and among other things has
>> an LM81 hwmon chip.
>>
>> Under a high CPU load we see errant readings from the LM81 as well as
>> actual failures. It's the errant readings that cause the most concern
>> since we can easily ignore the read errors in our monitoring application
>> (although it would be better if they weren't there at all).
>>
>> I'm able to reproduce this with a test application[0] that artificially
>> creates a high CPU load then by repeatedly checking for the all-1s
>> values from the LM81 datasheet[1](page 17). The all-1s readings stick
>> out as they are obviously higher than the voltage rails that are
>> connected and disagree with measurements taken with a multimeter.
>>
>> Here's the output from my device
>>
>> [root@linuxbox ~]# cpuload 90&
>> [root@linuxbox ~]# (while true; do cat /sys/class/hwmon/hwmon0/in*_input
>> | grep '3320\|4383\|6641\|15930\|3586'; sleep 1; done)&
>> 3586
>> 3586
>> cat: read error: No such device or address
>> cat: read error: No such device or address
>> 3320
>> 3320
>> 3586
>> 3586
>> 6641
>> 6641
>> 4383
>> 4383
>>
>> Fundamentally I think this is a problem with the fact that the LM81 is
>> an SMBus device but the T2080 (and other Freescale SoCs) uses i2c and we
>> emulate SMBus. I suspect the errant readings are when we don't get round
>> to completing the read within the timeout specified by the SMBus
>> specification. Depending on when that happens we either fail the
>> transfer or interpret the result as all-1s.
>>
> That is quite unlikely. Many sensor chips are SMBus chips connected to
> i2c busses. It is much more likely that there is a bug in the T2080 i2c 
> driver,
> that the chip doesn't like the bulk read command issued through regmap, that
> the chip has problems with the i2c bus speed, or that the i2c bus is noisy.
Perhaps something gets upset when interrupt processing is delayed 
because of CPU load. I don't see the problem when there isn't a CPU load 
so I think that eliminates board issues.
> In this context, the "No such device or address" responses are very 
> suspicious.
> Those are reported by the i2c driver, not by the hwmon driver, and suggest
> that the chip did not respond to a read request. Maybe it helps to enable
> debugging to the i2c driver to see if it reports anything useful. Even
> better might be to connect an i2c bus analyzer to the i2c bus and check
> what is going on.
That's from -ENXIO which is used in only one place in i2c-mpc.c. I'll 
enable some debug and see what we get.
>
> Guenter

Errant readings on LM81 with T2080 SoC

2021-03-07 Thread Chris Packham
Hi,

I've got a system using a PowerPC T2080 SoC and among other things has 
an LM81 hwmon chip.

Under a high CPU load we see errant readings from the LM81 as well as 
actual failures. It's the errant readings that cause the most concern 
since we can easily ignore the read errors in our monitoring application 
(although it would be better if they weren't there at all).

I'm able to reproduce this with a test application[0] that artificially 
creates a high CPU load then by repeatedly checking for the all-1s 
values from the LM81 datasheet[1](page 17). The all-1s readings stick 
out as they are obviously higher than the voltage rails that are 
connected and disagree with measurements taken with a multimeter.

Here's the output from my device

[root@linuxbox ~]# cpuload 90&
[root@linuxbox ~]# (while true; do cat /sys/class/hwmon/hwmon0/in*_input 
| grep '3320\|4383\|6641\|15930\|3586'; sleep 1; done)&
3586
3586
cat: read error: No such device or address
cat: read error: No such device or address
3320
3320
3586
3586
6641
6641
4383
4383

Fundamentally I think this is a problem with the fact that the LM81 is 
an SMBus device but the T2080 (and other Freescale SoCs) uses i2c and we 
emulate SMBus. I suspect the errant readings are when we don't get round 
to completing the read within the timeout specified by the SMBus 
specification. Depending on when that happens we either fail the 
transfer or interpret the result as all-1s.

[0] - https://gist.github.com/cpackham/6356a3a943accebb228135dc10daf721
[1] - https://www.ti.com/lit/ds/symlink/lm81.pdf


[net-next PATCH] net: freescale: ucc_geth: remove unused SKB_ALLOC_TIMEOUT

2020-11-29 Thread Chris Packham
This was added in commit ce973b141dfa ("[PATCH] Freescale QE UCC gigabit
ethernet driver") but doesn't appear to have been used. Remove it now.

Signed-off-by: Chris Packham 
---
 drivers/net/ethernet/freescale/ucc_geth.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth.h 
b/drivers/net/ethernet/freescale/ucc_geth.h
index 3fe903972195..1a9bdf66a7d8 100644
--- a/drivers/net/ethernet/freescale/ucc_geth.h
+++ b/drivers/net/ethernet/freescale/ucc_geth.h
@@ -882,7 +882,6 @@ struct ucc_geth_hardware_statistics {
   addresses */
 
 #define TX_TIMEOUT  (1*HZ)
-#define SKB_ALLOC_TIMEOUT   10
 #define PHY_INIT_TIMEOUT10
 #define PHY_CHANGE_TIME 2
 
-- 
2.29.2



Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-23 Thread Chris Packham

On 24/09/20 8:27 am, Heiner Kallweit wrote:
> On 04.09.2020 02:28, Chris Packham wrote:
>> The SPIE register contains counts for the TX FIFO so any time the irq
>> handler was invoked we would attempt to process the RX/TX fifos. Use the
>> SPIM value to mask the events so that we only process interrupts that
>> were expected.
>>
>> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
>> Implement soft interrupt replay in C").
>>
>> Signed-off-by: Chris Packham 
>> Cc: sta...@vger.kernel.org
>> ---
>>
>> Notes:
>>  I've tested this on a T2080RDB and a custom board using the T2081 SoC. 
>> With
>>  this change I don't see any spurious instances of the "Transfer done but
>>  SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" 
>> messages
>>  and the updates to spi flash are successful.
>>  
>>  I think this should go into the stable trees that contain 3282a3da25bd 
>> but I
>>  haven't added a Fixes: tag because I think 3282a3da25bd exposed the 
>> issue as
>>  opposed to causing it.
>>
>>   drivers/spi/spi-fsl-espi.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
>> index 7e7c92cafdbb..cb120b68c0e2 100644
>> --- a/drivers/spi/spi-fsl-espi.c
>> +++ b/drivers/spi/spi-fsl-espi.c
>> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, 
>> u32 events)
>>   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>>   {
>>  struct fsl_espi *espi = context_data;
>> -u32 events;
>> +u32 events, mask;
>>   
>>  spin_lock(>lock);
>>   
>>  /* Get interrupt events(tx/rx) */
>>  events = fsl_espi_read_reg(espi, ESPI_SPIE);
>> -if (!events) {
>> +mask = fsl_espi_read_reg(espi, ESPI_SPIM);
>> +if (!(events & mask)) {
>>  spin_unlock(>lock);
>>  return IRQ_NONE;
> Sorry, I was on vacation and therefore couldn't comment earlier.
> I'm fine with the change, just one thing could be improved IMO.
> If we skip an unneeded interrupt now, then returning IRQ_NONE
> causes reporting this interrupt as spurious. This isn't too nice
> as spurious interrupts typically are seen as a problem indicator.
> Therefore returning IRQ_HANDLED should be more appropriate.
> This would just require a comment in the code explaining why we
> do this, and why it can happen that we receive interrupts
> we're not interested in.
I'd be happy to send a follow-up to change IRQ_NONE to IRQ_HANDLED. I 
don't think the old code could have ever hit the IRQ_NONE (because event 
will always be non-zero) so it won't really be a change in behaviour. 
With the patch (that is now in spi/for-next) so far I do see a low 
number of spurious interrupts on the test setup where previously I would 
have seen failure to talk to the spi-flash.


Re: [PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-13 Thread Chris Packham
Hi All,

On 4/09/20 12:28 pm, Chris Packham wrote:
> The SPIE register contains counts for the TX FIFO so any time the irq
> handler was invoked we would attempt to process the RX/TX fifos. Use the
> SPIM value to mask the events so that we only process interrupts that
> were expected.
>
> This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
> Implement soft interrupt replay in C").
>
> Signed-off-by: Chris Packham 
> Cc: sta...@vger.kernel.org
> ---
ping?
>
> Notes:
>  I've tested this on a T2080RDB and a custom board using the T2081 SoC. 
> With
>  this change I don't see any spurious instances of the "Transfer done but
>  SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" 
> messages
>  and the updates to spi flash are successful.
>  
>  I think this should go into the stable trees that contain 3282a3da25bd 
> but I
>  haven't added a Fixes: tag because I think 3282a3da25bd exposed the 
> issue as
>  opposed to causing it.
>
>   drivers/spi/spi-fsl-espi.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
> index 7e7c92cafdbb..cb120b68c0e2 100644
> --- a/drivers/spi/spi-fsl-espi.c
> +++ b/drivers/spi/spi-fsl-espi.c
> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 
> events)
>   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>   {
>   struct fsl_espi *espi = context_data;
> - u32 events;
> + u32 events, mask;
>   
>   spin_lock(>lock);
>   
>   /* Get interrupt events(tx/rx) */
>   events = fsl_espi_read_reg(espi, ESPI_SPIE);
> - if (!events) {
> + mask = fsl_espi_read_reg(espi, ESPI_SPIM);
> + if (!(events & mask)) {
>   spin_unlock(>lock);
>   return IRQ_NONE;
>   }

Re: fsl_espi errors on v5.7.15

2020-09-06 Thread Chris Packham

On 5/09/20 5:23 am, Heiner Kallweit wrote:


On Fri 4. Sep 2020 at 01:58, Chris Packham 
mailto:chris.pack...@alliedtelesis.co.nz>> 
wrote:


On 1/09/20 6:14 pm, Nicholas Piggin wrote:

> Excerpts from Chris Packham's message of September 1, 2020 11:25 am:

>> On 1/09/20 12:33 am, Heiner Kallweit wrote:

>>> On 30.08.2020 23:59, Chris Packham wrote:

>>>> On 31/08/20 9:41 am, Heiner Kallweit wrote:

>>>>> On 30.08.2020 23:00, Chris Packham wrote:

>>>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:

>>>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:

>>>>>> 

>>>>>>

>>>>>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB

>>>>>>>>>>>>

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!

>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32

>>>>>>>>>>>>

>>>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems

>>>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in 
>>>>>>>>>>>> the

>>>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.

>>>>>>>>>>>>

>>>>>>>>>>>> fsl_espi ffe11.spi: tx 70

>>>>>>>>>>>> fsl_espi ffe11.spi: rx 03

>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!

>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32

>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05

>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00

>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!

>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32

>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05

>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00

>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03

>>>>>>>>>>>>

>>>>>>>>>>>>  From all the Micron SPI-NOR datasheets I've got access to it 
>>>>>>>>>>>> is

>>>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it

>>>>>>>>>>>> happens some times and not others.

>>>>>>>>>>> So I think I've got a reproduction and I think I've bisected the 
>>>>>>>>>>> problem

>>>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt 
>>>>>>>>>>> replay in

>>>>>>>>>>> C"). My day is just finishing now so I haven't applied too much 
>>>>>>>>>>> scrutiny

>>>>>>>>>>> to this result. Given the various rabbit holes I've been down on 
>>>>>>>>>>> this

>>>>>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>>>>>> skepticism.

>>>>>>>>>>>

>>>>>>>>>> OK, so an easy test should be to re-t

Re: fsl_espi errors on v5.7.15

2020-09-06 Thread Chris Packham
(resend as something decided to insert html, some context stripped)

On 5/09/20 5:23 am, Heiner Kallweit wrote:
> On Fri 4. Sep 2020 at 01:58, Chris Packham 
>  <mailto:chris.pack...@alliedtelesis.co.nz>> wrote:
>
>

>
>
> I tried ftrace but I really wasn't sure what I was looking for.
>
> Capturing a "bad" case was pretty tricky. But I think I've
> identified a
>
> fix (I'll send it as a proper patch shortly). The gist is
>
>
>
> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
>
> index 7e7c92cafdbb..cb120b68c0e2 100644
>
> --- a/drivers/spi/spi-fsl-espi.c
>
> +++ b/drivers/spi/spi-fsl-espi.c
>
> @@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi
>
> *espi, u32 events)
>
>   static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
>
>   {
>
>  struct fsl_espi *espi = context_data;
>
> -   u32 events;
>
> +   u32 events, mask;
>
>
>
>  spin_lock(>lock);
>
>
>
>  /* Get interrupt events(tx/rx) */
>
>  events = fsl_espi_read_reg(espi, ESPI_SPIE);
>
> -   if (!events) {
>
> +   mask = fsl_espi_read_reg(espi, ESPI_SPIM);
>
> +   if (!(events & mask)) {
>
>  spin_unlock(>lock);
>
>  return IRQ_NONE;
>
>  }
>
>
>
> The SPIE register contains the TXCNT so events is pretty much always
>
> going to have something set. By checking events against what we've
>
> actually requested interrupts for we don't see any spurious events.
>
>
> Usually we shouldn’t receive interrupts we’re not interested in, 
> except the interrupt is shared.
My theory is that we get an interrupt to the core for RXT and another 
for DON. With the changes to the powerpc interrupt handling and the fact 
that fsl_espi_cpu_irq() doesn't actually look at the specific event bits 
means that sometimes both events are handled in the processing of the 
first interrupt and the second one trips the SPI_DON not set error.

There's an old comment "SPI bus sometimes got lost interrupts..." which 
makes me wonder if the interrupt handling change has fixed this original 
problem.

I still think the change makes sense regardless because the SPIE 
register has some counter fields in it.

> This leads to the question: is the SPI interrupt shared with another 
> device on your system?
It's not shared on either the custom board or the T2080RDB.
> Do you see spurious interrupts with the patch under 
> /proc/irq/(irq)/spurious?

Yes it looks like it

[root@linuxbox ~]# cat /proc/irq/53/spurious
count 3126
unhandled 0
last_unhandled 0 ms

[root@linuxbox ~]# /flash/dd_test.sh

[root@linuxbox ~]# cat /proc/irq/53/spurious
count 1016
unhandled 0
last_unhandled 4294746100 ms

[root@linuxbox ~]# /flash/dd_test.sh

[root@linuxbox ~]# cat /proc/irq/53/spurious
count 88391
unhandled 0
last_unhandled 4294746100 ms

[root@linuxbox ~]# /flash/dd_test.sh

[root@linuxbox ~]# cat /proc/irq/53/spurious
count 72459
unhandled 2
last_unhandled 4294758632 ms
>
>
>
> I've tested this on the T2080RDB and on our custom hardware and it
> seems
>
> to resolve the problem.
>
>
>

[PATCH] spi: fsl-espi: Only process interrupts for expected events

2020-09-03 Thread Chris Packham
The SPIE register contains counts for the TX FIFO so any time the irq
handler was invoked we would attempt to process the RX/TX fifos. Use the
SPIM value to mask the events so that we only process interrupts that
were expected.

This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
Implement soft interrupt replay in C").

Signed-off-by: Chris Packham 
Cc: sta...@vger.kernel.org
---

Notes:
I've tested this on a T2080RDB and a custom board using the T2081 SoC. With
this change I don't see any spurious instances of the "Transfer done but
SPIE_DON isn't set!" or "Transfer done but rx/tx fifo's aren't empty!" 
messages
and the updates to spi flash are successful.

I think this should go into the stable trees that contain 3282a3da25bd but I
haven't added a Fixes: tag because I think 3282a3da25bd exposed the issue as
opposed to causing it.

 drivers/spi/spi-fsl-espi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
index 7e7c92cafdbb..cb120b68c0e2 100644
--- a/drivers/spi/spi-fsl-espi.c
+++ b/drivers/spi/spi-fsl-espi.c
@@ -574,13 +574,14 @@ static void fsl_espi_cpu_irq(struct fsl_espi *espi, u32 
events)
 static irqreturn_t fsl_espi_irq(s32 irq, void *context_data)
 {
struct fsl_espi *espi = context_data;
-   u32 events;
+   u32 events, mask;
 
spin_lock(>lock);
 
/* Get interrupt events(tx/rx) */
events = fsl_espi_read_reg(espi, ESPI_SPIE);
-   if (!events) {
+   mask = fsl_espi_read_reg(espi, ESPI_SPIM);
+   if (!(events & mask)) {
spin_unlock(>lock);
return IRQ_NONE;
}
-- 
2.28.0



Re: fsl_espi errors on v5.7.15

2020-09-03 Thread Chris Packham

On 1/09/20 6:14 pm, Nicholas Piggin wrote:
> Excerpts from Chris Packham's message of September 1, 2020 11:25 am:
>> On 1/09/20 12:33 am, Heiner Kallweit wrote:
>>> On 30.08.2020 23:59, Chris Packham wrote:
>>>> On 31/08/20 9:41 am, Heiner Kallweit wrote:
>>>>> On 30.08.2020 23:00, Chris Packham wrote:
>>>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>>>>>> 
>>>>>>
>>>>>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>>>>>>>>>
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>>
>>>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in 
>>>>>>>>>>>> the
>>>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>>>>>>>>>
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>>
>>>>>>>>>>>>  From all the Micron SPI-NOR datasheets I've got access to it 
>>>>>>>>>>>> is
>>>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>>>>>>>>>> happens some times and not others.
>>>>>>>>>>> So I think I've got a reproduction and I think I've bisected the 
>>>>>>>>>>> problem
>>>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt 
>>>>>>>>>>> replay in
>>>>>>>>>>> C"). My day is just finishing now so I haven't applied too much 
>>>>>>>>>>> scrutiny
>>>>>>>>>>> to this result. Given the various rabbit holes I've been down on 
>>>>>>>>>>> this
>>>>>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>>>>>> skepticism.
>>>>>>>>>>>
>>>>>>>>>> OK, so an easy test should be to re-test with a 5.4 kernel.
>>>>>>>>>> It doesn't have yet the change you're referring to, and the fsl-espi 
>>>>>>>>>> driver
>>>>

Re: fsl_espi errors on v5.7.15

2020-09-01 Thread Chris Packham

On 2/09/20 11:29 am, Chris Packham wrote:
>
> On 1/09/20 6:14 pm, Nicholas Piggin wrote:
>> Excerpts from Chris Packham's message of September 1, 2020 11:25 am:
>>> On 1/09/20 12:33 am, Heiner Kallweit wrote:
>>>> On 30.08.2020 23:59, Chris Packham wrote:
>>>>> On 31/08/20 9:41 am, Heiner Kallweit wrote:
>>>>>> On 30.08.2020 23:00, Chris Packham wrote:
>>>>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>>>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>>>>>>> 
>>>>>>>
>>>>>>>>>>>>> I've also now seen the RX FIFO not empty error on the 
>>>>>>>>>>>>> T2080RDB
>>>>>>>>>>>>>
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's 
>>>>>>>>>>>>> aren't empty!
>>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>>>
>>>>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra 
>>>>>>>>>>>>> byte in the
>>>>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a 
>>>>>>>>>>>>> READ_FSR.
>>>>>>>>>>>>>
>>>>>>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's 
>>>>>>>>>>>>> aren't empty!
>>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's 
>>>>>>>>>>>>> aren't empty!
>>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>>>
>>>>>>>>>>>>>  From all the Micron SPI-NOR datasheets I've got 
>>>>>>>>>>>>> access to it is
>>>>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea 
>>>>>>>>>>>>> why it
>>>>>>>>>>>>> happens some times and not others.
>>>>>>>>>>>> So I think I've got a reproduction and I think I've 
>>>>>>>>>>>> bisected the problem
>>>>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft 
>>>>>>>>>>>> interrupt replay in
>>>>>>>>>>>> C"). My day is just finishing now so I haven't applied too 
>>>>>>>>>>>> much scrutiny
>>>>>>>>>>>> to this result. Given the various rabbit holes I've been 
>>>>>>>>>>>> down on this
>>>>>>>>>>

Re: fsl_espi errors on v5.7.15

2020-09-01 Thread Chris Packham

On 1/09/20 6:14 pm, Nicholas Piggin wrote:
> Excerpts from Chris Packham's message of September 1, 2020 11:25 am:
>> On 1/09/20 12:33 am, Heiner Kallweit wrote:
>>> On 30.08.2020 23:59, Chris Packham wrote:
>>>> On 31/08/20 9:41 am, Heiner Kallweit wrote:
>>>>> On 30.08.2020 23:00, Chris Packham wrote:
>>>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>>>>>> 
>>>>>>
>>>>>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>>>>>>>>>
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>>
>>>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in 
>>>>>>>>>>>> the
>>>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>>>>>>>>>
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>>>>>>> empty!
>>>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>>>
>>>>>>>>>>>>  From all the Micron SPI-NOR datasheets I've got access to it 
>>>>>>>>>>>> is
>>>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>>>>>>>>>> happens some times and not others.
>>>>>>>>>>> So I think I've got a reproduction and I think I've bisected the 
>>>>>>>>>>> problem
>>>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt 
>>>>>>>>>>> replay in
>>>>>>>>>>> C"). My day is just finishing now so I haven't applied too much 
>>>>>>>>>>> scrutiny
>>>>>>>>>>> to this result. Given the various rabbit holes I've been down on 
>>>>>>>>>>> this
>>>>>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>>>>>> skepticism.
>>>>>>>>>>>
>>>>>>>>>> OK, so an easy test should be to re-test with a 5.4 kernel.
>>>>>>>>>> It doesn't have yet the change you're referring to, and the fsl-espi 
>>>>>>>>>> driver
>>>>

Re: fsl_espi errors on v5.7.15

2020-08-31 Thread Chris Packham

On 1/09/20 12:33 am, Heiner Kallweit wrote:
> On 30.08.2020 23:59, Chris Packham wrote:
>> On 31/08/20 9:41 am, Heiner Kallweit wrote:
>>> On 30.08.2020 23:00, Chris Packham wrote:
>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>>>> 
>>>>
>>>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>>>>>>>
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>
>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in 
>>>>>>>>>> the
>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>>>>>>>
>>>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>
>>>>>>>>>> From all the Micron SPI-NOR datasheets I've got access to it is
>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>>>>>>>> happens some times and not others.
>>>>>>>>> So I think I've got a reproduction and I think I've bisected the 
>>>>>>>>> problem
>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay 
>>>>>>>>> in
>>>>>>>>> C"). My day is just finishing now so I haven't applied too much 
>>>>>>>>> scrutiny
>>>>>>>>> to this result. Given the various rabbit holes I've been down on this
>>>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>>>> skepticism.
>>>>>>>>>
>>>>>>>> OK, so an easy test should be to re-test with a 5.4 kernel.
>>>>>>>> It doesn't have yet the change you're referring to, and the fsl-espi 
>>>>>>>> driver
>>>>>>>> is basically the same as in 5.7 (just two small changes in 5.7).
>>>>>>> There's 6cc0c16d82f88 and maybe also other interrupt related patches
>>>>>>> around this time that could affect book E, so it's good if that exact
>>>>>>> patch is confirmed.
>>>>>> My confirmation is basically that I can induce the issue in a 5.4 kernel
>>>>>> by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue in
>>>>>> 5.9-rc2 by reverting that one commit.
>>>>>>
>>>>>> I both cases it's not exactly a clean cherry-pick/revert so I also
>>>>>> confirmed the bisection result by building at 3282a3da25bd 

Re: fsl_espi errors on v5.7.15

2020-08-31 Thread Chris Packham

On 1/09/20 12:33 am, Heiner Kallweit wrote:
> On 30.08.2020 23:59, Chris Packham wrote:
>> On 31/08/20 9:41 am, Heiner Kallweit wrote:
>>> On 30.08.2020 23:00, Chris Packham wrote:
>>>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>>>> 
>>>>
>>>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>>>>>>>
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>>
>>>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in 
>>>>>>>>>> the
>>>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>>>>>>>
>>>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>>>
>>>>>>>>>> From all the Micron SPI-NOR datasheets I've got access to it is
>>>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>>>>>>>> happens some times and not others.
>>>>>>>>> So I think I've got a reproduction and I think I've bisected the 
>>>>>>>>> problem
>>>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay 
>>>>>>>>> in
>>>>>>>>> C"). My day is just finishing now so I haven't applied too much 
>>>>>>>>> scrutiny
>>>>>>>>> to this result. Given the various rabbit holes I've been down on this
>>>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>>>> skepticism.
>>>>>>>>>
>>>>>>>> OK, so an easy test should be to re-test with a 5.4 kernel.
>>>>>>>> It doesn't have yet the change you're referring to, and the fsl-espi 
>>>>>>>> driver
>>>>>>>> is basically the same as in 5.7 (just two small changes in 5.7).
>>>>>>> There's 6cc0c16d82f88 and maybe also other interrupt related patches
>>>>>>> around this time that could affect book E, so it's good if that exact
>>>>>>> patch is confirmed.
>>>>>> My confirmation is basically that I can induce the issue in a 5.4 kernel
>>>>>> by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue in
>>>>>> 5.9-rc2 by reverting that one commit.
>>>>>>
>>>>>> I both cases it's not exactly a clean cherry-pick/revert so I also
>>>>>> confirmed the bisection result by building at 3282a3da25bd 

Re: fsl_espi errors on v5.7.15

2020-08-30 Thread Chris Packham

On 31/08/20 9:41 am, Heiner Kallweit wrote:
> On 30.08.2020 23:00, Chris Packham wrote:
>> On 31/08/20 12:30 am, Nicholas Piggin wrote:
>>> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:
>> 
>>
>>>>>>>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>>>>>>>
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>>
>>>>>>>> With my current workaround of emptying the RX FIFO. It seems
>>>>>>>> survivable. Interestingly it only ever seems to be 1 extra byte in the
>>>>>>>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>>>>>>>
>>>>>>>> fsl_espi ffe11.spi: tx 70
>>>>>>>> fsl_espi ffe11.spi: rx 03
>>>>>>>> fsl_espi ffe11.spi: Extra RX 00
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>>> fsl_espi ffe11.spi: tx 05
>>>>>>>> fsl_espi ffe11.spi: rx 00
>>>>>>>> fsl_espi ffe11.spi: Extra RX 03
>>>>>>>>
>>>>>>>>From all the Micron SPI-NOR datasheets I've got access to it is
>>>>>>>> possible to continually read the SR/FSR. But I've no idea why it
>>>>>>>> happens some times and not others.
>>>>>>> So I think I've got a reproduction and I think I've bisected the problem
>>>>>>> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in
>>>>>>> C"). My day is just finishing now so I haven't applied too much scrutiny
>>>>>>> to this result. Given the various rabbit holes I've been down on this
>>>>>>> issue already I'd take this information with a good degree of 
>>>>>>> skepticism.
>>>>>>>
>>>>>> OK, so an easy test should be to re-test with a 5.4 kernel.
>>>>>> It doesn't have yet the change you're referring to, and the fsl-espi 
>>>>>> driver
>>>>>> is basically the same as in 5.7 (just two small changes in 5.7).
>>>>> There's 6cc0c16d82f88 and maybe also other interrupt related patches
>>>>> around this time that could affect book E, so it's good if that exact
>>>>> patch is confirmed.
>>>> My confirmation is basically that I can induce the issue in a 5.4 kernel
>>>> by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue in
>>>> 5.9-rc2 by reverting that one commit.
>>>>
>>>> I both cases it's not exactly a clean cherry-pick/revert so I also
>>>> confirmed the bisection result by building at 3282a3da25bd (which sees
>>>> the issue) and the commit just before (which does not).
>>> Thanks for testing, that confirms it well.
>>>
>>> [snip patch]
>>>
>>>> I still saw the issue with this change applied. PPC_IRQ_SOFT_MASK_DEBUG
>>>> didn't report anything (either with or without the change above).
>>> Okay, it was a bit of a shot in the dark. I still can't see what
>>> else has changed.
>>>
>>> What would cause this, a lost interrupt? A spurious interrupt? Or
>>> higher interrupt latency?
>>>
>>> I don't think the patch should cause significantly worse latency,
>>> (it's 

Re: fsl_espi errors on v5.7.15

2020-08-30 Thread Chris Packham

On 31/08/20 12:30 am, Nicholas Piggin wrote:
> Excerpts from Chris Packham's message of August 28, 2020 8:07 am:



>> I've also now seen the RX FIFO not empty error on the T2080RDB
>>
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>
>> With my current workaround of emptying the RX FIFO. It seems
>> survivable. Interestingly it only ever seems to be 1 extra byte in the
>> RX FIFO and it seems to be after either a READ_SR or a READ_FSR.
>>
>> fsl_espi ffe11.spi: tx 70
>> fsl_espi ffe11.spi: rx 03
>> fsl_espi ffe11.spi: Extra RX 00
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>> fsl_espi ffe11.spi: tx 05
>> fsl_espi ffe11.spi: rx 00
>> fsl_espi ffe11.spi: Extra RX 03
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>> fsl_espi ffe11.spi: tx 05
>> fsl_espi ffe11.spi: rx 00
>> fsl_espi ffe11.spi: Extra RX 03
>>
>>   From all the Micron SPI-NOR datasheets I've got access to it is
>> possible to continually read the SR/FSR. But I've no idea why it
>> happens some times and not others.
> So I think I've got a reproduction and I think I've bisected the problem
> to commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in
> C"). My day is just finishing now so I haven't applied too much scrutiny
> to this result. Given the various rabbit holes I've been down on this
> issue already I'd take this information with a good degree of skepticism.
>
 OK, so an easy test should be to re-test with a 5.4 kernel.
 It doesn't have yet the change you're referring to, and the fsl-espi driver
 is basically the same as in 5.7 (just two small changes in 5.7).
>>> There's 6cc0c16d82f88 and maybe also other interrupt related patches
>>> around this time that could affect book E, so it's good if that exact
>>> patch is confirmed.
>> My confirmation is basically that I can induce the issue in a 5.4 kernel
>> by cherry-picking 3282a3da25bd. I'm also able to "fix" the issue in
>> 5.9-rc2 by reverting that one commit.
>>
>> I both cases it's not exactly a clean cherry-pick/revert so I also
>> confirmed the bisection result by building at 3282a3da25bd (which sees
>> the issue) and the commit just before (which does not).
> Thanks for testing, that confirms it well.
>
> [snip patch]
>
>> I still saw the issue with this change applied. PPC_IRQ_SOFT_MASK_DEBUG
>> didn't report anything (either with or without the change above).
> Okay, it was a bit of a shot in the dark. I still can't see what
> else has changed.
>
> What would cause this, a lost interrupt? A spurious interrupt? Or
> higher interrupt latency?
>
> I don't think the patch should cause significantly worse latency,
> (it's supposed to be a bit better if anything because it doesn't set
> up the full interrupt frame). But it's possible.

My working theory is that the SPI_DON indication is all about the TX 
direction an now that the interrupts are faster we're hitting an error 
because there is still RX activity going on. Heiner disagrees with my 
interpretation of the SPI_DON indication and the fact that it doesn't 
happen every time does throw doubt on it.

I can't really explain the extra RX byte in the fifo. We know how many 
bytes to expect and we pull that many from the fifo so it's not as if 
we're missing an interrupt causing us to skip the last byte. I've been 
looking for some kind of off-by-one calculation but again if it were 
something like that it'd happen all the time.



Re: fsl_espi errors on v5.7.15

2020-08-27 Thread Chris Packham
On 27/08/20 7:12 pm, Nicholas Piggin wrote:
> Excerpts from Heiner Kallweit's message of August 26, 2020 4:38 pm:
>> On 26.08.2020 08:07, Chris Packham wrote:
>>> On 26/08/20 1:48 pm, Chris Packham wrote:
>>>> On 26/08/20 10:22 am, Chris Packham wrote:
>>>>> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>>>>>
>>>>> 
>>>>>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>>>>>> identified a couple of deficiencies that may or may not be related
>>>>>>> to my
>>>>>>> issue.
>>>>>>>
>>>>>>> First I think the 'Transfer done but SPIE_DON isn't set' message
>>>>>>> can be
>>>>>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE
>>>>>>> register.
>>>>>>> We also write back to it to clear the current events. We re-read it in
>>>>>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>>>>>> naturally end up in that situation if we're doing a large read.
>>>>>>> Consider
>>>>>>> the messages for reading a block of data from a spi-nor chip
>>>>>>>
>>>>>>>     tx = READ_OP + ADDR
>>>>>>>     rx = data
>>>>>>>
>>>>>>> We setup the transfer and pump out the tx_buf. The first interrupt
>>>>>>> goes
>>>>>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>>>>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>>>>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>>>>>> continues until we've received all the data and we finish with
>>>>>>> ESPI_SPIE
>>>>>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>>>>>> isn't set.
>>>>>>>
>>>>>>> The other deficiency is that we only get an interrupt when the
>>>>>>> amount of
>>>>>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>>>>>> FSL_ESPI_RXTHR left to be received we will never pull them out of
>>>>>>> the fifo.
>>>>>>>
>>>>>> SPIM_DON will trigger an interrupt once the last characters have been
>>>>>> transferred, and read the remaining characters from the FIFO.
>>>>> The T2080RM that I have says the following about the DON bit
>>>>>
>>>>> "Last character was transmitted. The last character was transmitted
>>>>> and a new command can be written for the next frame."
>>>>>
>>>>> That does at least seem to fit with my assertion that it's all about
>>>>> the TX direction. But the fact that it doesn't happen all the time
>>>>> throws some doubt on it.
>>>>>
>>>>>> I think the reason I'm seeing some variability is because of how fast
>>>>>>> (or slow) the interrupts get processed and how fast the spi-nor
>>>>>>> chip can
>>>>>>> fill the CPUs rx fifo.
>>>>>>>
>>>>>> To rule out timing issues at high bus frequencies I initially asked
>>>>>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>>>>>> or even less, then timing shouldn't be an issue.
>>>>> Yes I've currently got spi-max-frequency = <100>; in my dts. I
>>>>> would also expect a slower frequency would fit my "DON is for TX"
>>>>> narrative.
>>>>>> Last relevant functional changes have been done almost 4 years ago.
>>>>>> And yours is the first such report I see. So question is what could
>>>>>> be so
>>>>>> special with your setup that it seems you're the only one being
>>>>>> affected.
>>>>>> The scenarios you describe are standard, therefore much more people
>>>>>> should be affected in case of a driver bug.
>>>>> Agreed. But even on my hardware (which may have a latent issue
>>>>> despite being in the field for going on 5 years) the issue only
>>>>> triggers under some fairly specific circumstances.
>>>>>> You said that kernel config impa

Re: fsl_espi errors on v5.7.15

2020-08-26 Thread Chris Packham
(adding Nicholas)

On 26/08/20 6:38 pm, Heiner Kallweit wrote:
> On 26.08.2020 08:07, Chris Packham wrote:
>> On 26/08/20 1:48 pm, Chris Packham wrote:
>>> On 26/08/20 10:22 am, Chris Packham wrote:
>>>> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>>>>
>>>> 
>>>>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>>>>> identified a couple of deficiencies that may or may not be related
>>>>>> to my
>>>>>> issue.
>>>>>>
>>>>>> First I think the 'Transfer done but SPIE_DON isn't set' message
>>>>>> can be
>>>>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE
>>>>>> register.
>>>>>> We also write back to it to clear the current events. We re-read it in
>>>>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>>>>> naturally end up in that situation if we're doing a large read.
>>>>>> Consider
>>>>>> the messages for reading a block of data from a spi-nor chip
>>>>>>
>>>>>>     tx = READ_OP + ADDR
>>>>>>     rx = data
>>>>>>
>>>>>> We setup the transfer and pump out the tx_buf. The first interrupt
>>>>>> goes
>>>>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>>>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>>>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>>>>> continues until we've received all the data and we finish with
>>>>>> ESPI_SPIE
>>>>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>>>>> isn't set.
>>>>>>
>>>>>> The other deficiency is that we only get an interrupt when the
>>>>>> amount of
>>>>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>>>>> FSL_ESPI_RXTHR left to be received we will never pull them out of
>>>>>> the fifo.
>>>>>>
>>>>> SPIM_DON will trigger an interrupt once the last characters have been
>>>>> transferred, and read the remaining characters from the FIFO.
>>>> The T2080RM that I have says the following about the DON bit
>>>>
>>>> "Last character was transmitted. The last character was transmitted
>>>> and a new command can be written for the next frame."
>>>>
>>>> That does at least seem to fit with my assertion that it's all about
>>>> the TX direction. But the fact that it doesn't happen all the time
>>>> throws some doubt on it.
>>>>
>>>>> I think the reason I'm seeing some variability is because of how fast
>>>>>> (or slow) the interrupts get processed and how fast the spi-nor
>>>>>> chip can
>>>>>> fill the CPUs rx fifo.
>>>>>>
>>>>> To rule out timing issues at high bus frequencies I initially asked
>>>>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>>>>> or even less, then timing shouldn't be an issue.
>>>> Yes I've currently got spi-max-frequency = <100>; in my dts. I
>>>> would also expect a slower frequency would fit my "DON is for TX"
>>>> narrative.
>>>>> Last relevant functional changes have been done almost 4 years ago.
>>>>> And yours is the first such report I see. So question is what could
>>>>> be so
>>>>> special with your setup that it seems you're the only one being
>>>>> affected.
>>>>> The scenarios you describe are standard, therefore much more people
>>>>> should be affected in case of a driver bug.
>>>> Agreed. But even on my hardware (which may have a latent issue
>>>> despite being in the field for going on 5 years) the issue only
>>>> triggers under some fairly specific circumstances.
>>>>> You said that kernel config impacts how frequently the issue happens.
>>>>> Therefore question is what's the diff in kernel config, and how could
>>>>> the differences be related to SPI.
>>>> It did seem to be somewhat random. Things like CONFIG_PREEMPT have an
>>>> impact but every time I found something that seemed to be having an
>

Re: fsl_espi errors on v5.7.15

2020-08-26 Thread Chris Packham

On 26/08/20 1:48 pm, Chris Packham wrote:
>
> On 26/08/20 10:22 am, Chris Packham wrote:
>> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>>
>> 
>>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>>> identified a couple of deficiencies that may or may not be related 
>>>> to my
>>>> issue.
>>>>
>>>> First I think the 'Transfer done but SPIE_DON isn't set' message 
>>>> can be
>>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE 
>>>> register.
>>>> We also write back to it to clear the current events. We re-read it in
>>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>>> naturally end up in that situation if we're doing a large read. 
>>>> Consider
>>>> the messages for reading a block of data from a spi-nor chip
>>>>
>>>>    tx = READ_OP + ADDR
>>>>    rx = data
>>>>
>>>> We setup the transfer and pump out the tx_buf. The first interrupt 
>>>> goes
>>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>>> continues until we've received all the data and we finish with 
>>>> ESPI_SPIE
>>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>>> isn't set.
>>>>
>>>> The other deficiency is that we only get an interrupt when the 
>>>> amount of
>>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>>> FSL_ESPI_RXTHR left to be received we will never pull them out of 
>>>> the fifo.
>>>>
>>> SPIM_DON will trigger an interrupt once the last characters have been
>>> transferred, and read the remaining characters from the FIFO.
>>
>> The T2080RM that I have says the following about the DON bit
>>
>> "Last character was transmitted. The last character was transmitted 
>> and a new command can be written for the next frame."
>>
>> That does at least seem to fit with my assertion that it's all about 
>> the TX direction. But the fact that it doesn't happen all the time 
>> throws some doubt on it.
>>
>>> I think the reason I'm seeing some variability is because of how fast
>>>> (or slow) the interrupts get processed and how fast the spi-nor 
>>>> chip can
>>>> fill the CPUs rx fifo.
>>>>
>>> To rule out timing issues at high bus frequencies I initially asked
>>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>>> or even less, then timing shouldn't be an issue.
>> Yes I've currently got spi-max-frequency = <100>; in my dts. I 
>> would also expect a slower frequency would fit my "DON is for TX" 
>> narrative.
>>> Last relevant functional changes have been done almost 4 years ago.
>>> And yours is the first such report I see. So question is what could 
>>> be so
>>> special with your setup that it seems you're the only one being 
>>> affected.
>>> The scenarios you describe are standard, therefore much more people
>>> should be affected in case of a driver bug.
>> Agreed. But even on my hardware (which may have a latent issue 
>> despite being in the field for going on 5 years) the issue only 
>> triggers under some fairly specific circumstances.
>>> You said that kernel config impacts how frequently the issue happens.
>>> Therefore question is what's the diff in kernel config, and how could
>>> the differences be related to SPI.
>>
>> It did seem to be somewhat random. Things like CONFIG_PREEMPT have an 
>> impact but every time I found something that seemed to be having an 
>> impact I've been able to disprove it. I actually think its about how 
>> busy the system is which may or may not affect when we get round to 
>> processing the interrupts.
>>
>> I have managed to get the 'Transfer done but SPIE_DON isn't set!' to 
>> occur on the T2080RDB.
>>
>> I've had to add the following to expose the environment as a mtd 
>> partition
>>
>> diff --git a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi 
>> b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>> index ff87e67c70da..fbf95fc1fd68 100644
>> --- a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>> +++ b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
>> @@ -1

Re: fsl_espi errors on v5.7.15

2020-08-25 Thread Chris Packham

On 26/08/20 10:22 am, Chris Packham wrote:
> On 25/08/20 7:22 pm, Heiner Kallweit wrote:
>
> 
>> I've been staring at spi-fsl-espi.c for while now and I think I've
>>> identified a couple of deficiencies that may or may not be related 
>>> to my
>>> issue.
>>>
>>> First I think the 'Transfer done but SPIE_DON isn't set' message can be
>>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE register.
>>> We also write back to it to clear the current events. We re-read it in
>>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>>> naturally end up in that situation if we're doing a large read. 
>>> Consider
>>> the messages for reading a block of data from a spi-nor chip
>>>
>>>    tx = READ_OP + ADDR
>>>    rx = data
>>>
>>> We setup the transfer and pump out the tx_buf. The first interrupt goes
>>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>>> continues until we've received all the data and we finish with 
>>> ESPI_SPIE
>>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>>> isn't set.
>>>
>>> The other deficiency is that we only get an interrupt when the 
>>> amount of
>>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>>> FSL_ESPI_RXTHR left to be received we will never pull them out of 
>>> the fifo.
>>>
>> SPIM_DON will trigger an interrupt once the last characters have been
>> transferred, and read the remaining characters from the FIFO.
>
> The T2080RM that I have says the following about the DON bit
>
> "Last character was transmitted. The last character was transmitted 
> and a new command can be written for the next frame."
>
> That does at least seem to fit with my assertion that it's all about 
> the TX direction. But the fact that it doesn't happen all the time 
> throws some doubt on it.
>
>> I think the reason I'm seeing some variability is because of how fast
>>> (or slow) the interrupts get processed and how fast the spi-nor chip 
>>> can
>>> fill the CPUs rx fifo.
>>>
>> To rule out timing issues at high bus frequencies I initially asked
>> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
>> or even less, then timing shouldn't be an issue.
> Yes I've currently got spi-max-frequency = <100>; in my dts. I 
> would also expect a slower frequency would fit my "DON is for TX" 
> narrative.
>> Last relevant functional changes have been done almost 4 years ago.
>> And yours is the first such report I see. So question is what could 
>> be so
>> special with your setup that it seems you're the only one being 
>> affected.
>> The scenarios you describe are standard, therefore much more people
>> should be affected in case of a driver bug.
> Agreed. But even on my hardware (which may have a latent issue despite 
> being in the field for going on 5 years) the issue only triggers under 
> some fairly specific circumstances.
>> You said that kernel config impacts how frequently the issue happens.
>> Therefore question is what's the diff in kernel config, and how could
>> the differences be related to SPI.
>
> It did seem to be somewhat random. Things like CONFIG_PREEMPT have an 
> impact but every time I found something that seemed to be having an 
> impact I've been able to disprove it. I actually think its about how 
> busy the system is which may or may not affect when we get round to 
> processing the interrupts.
>
> I have managed to get the 'Transfer done but SPIE_DON isn't set!' to 
> occur on the T2080RDB.
>
> I've had to add the following to expose the environment as a mtd 
> partition
>
> diff --git a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi 
> b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
> index ff87e67c70da..fbf95fc1fd68 100644
> --- a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
> +++ b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
> @@ -116,6 +116,15 @@ flash@0 {
>     compatible = "micron,n25q512ax3", 
> "jedec,spi-nor";
>     reg = <0>;
>     spi-max-frequency = <1000>; /* 
> input clock */
> +
> +   partition@u-boot {
> +    reg = <0x 0x0010>;
> +   

Re: fsl_espi errors on v5.7.15

2020-08-25 Thread Chris Packham
On 25/08/20 7:22 pm, Heiner Kallweit wrote:


> I've been staring at spi-fsl-espi.c for while now and I think I've
>> identified a couple of deficiencies that may or may not be related to my
>> issue.
>>
>> First I think the 'Transfer done but SPIE_DON isn't set' message can be
>> generated spuriously. In fsl_espi_irq() we read the ESPI_SPIE register.
>> We also write back to it to clear the current events. We re-read it in
>> fsl_espi_cpu_irq() and complain when SPIE_DON is not set. But we can
>> naturally end up in that situation if we're doing a large read. Consider
>> the messages for reading a block of data from a spi-nor chip
>>
>>    tx = READ_OP + ADDR
>>    rx = data
>>
>> We setup the transfer and pump out the tx_buf. The first interrupt goes
>> off and ESPI_SPIE has SPIM_DON and SPIM_RXT set. We empty the rx fifo,
>> clear ESPI_SPIE and wait for the next interrupt. The next interrupt
>> fires and this time we have ESPI_SPIE with just SPIM_RXT set. This
>> continues until we've received all the data and we finish with ESPI_SPIE
>> having only SPIM_RXT set. When we re-read it we complain that SPIE_DON
>> isn't set.
>>
>> The other deficiency is that we only get an interrupt when the amount of
>> data in the rx fifo is above FSL_ESPI_RXTHR. If there are fewer than
>> FSL_ESPI_RXTHR left to be received we will never pull them out of the fifo.
>>
> SPIM_DON will trigger an interrupt once the last characters have been
> transferred, and read the remaining characters from the FIFO.

The T2080RM that I have says the following about the DON bit

"Last character was transmitted. The last character was transmitted and 
a new command can be written for the next frame."

That does at least seem to fit with my assertion that it's all about the 
TX direction. But the fact that it doesn't happen all the time throws 
some doubt on it.

> I think the reason I'm seeing some variability is because of how fast
>> (or slow) the interrupts get processed and how fast the spi-nor chip can
>> fill the CPUs rx fifo.
>>
> To rule out timing issues at high bus frequencies I initially asked
> for re-testing at lower frequencies. If you e.g. limit the bus to 1 MHz
> or even less, then timing shouldn't be an issue.
Yes I've currently got spi-max-frequency = <100>; in my dts. I would 
also expect a slower frequency would fit my "DON is for TX" narrative.
> Last relevant functional changes have been done almost 4 years ago.
> And yours is the first such report I see. So question is what could be so
> special with your setup that it seems you're the only one being affected.
> The scenarios you describe are standard, therefore much more people
> should be affected in case of a driver bug.
Agreed. But even on my hardware (which may have a latent issue despite 
being in the field for going on 5 years) the issue only triggers under 
some fairly specific circumstances.
> You said that kernel config impacts how frequently the issue happens.
> Therefore question is what's the diff in kernel config, and how could
> the differences be related to SPI.

It did seem to be somewhat random. Things like CONFIG_PREEMPT have an 
impact but every time I found something that seemed to be having an 
impact I've been able to disprove it. I actually think its about how 
busy the system is which may or may not affect when we get round to 
processing the interrupts.

I have managed to get the 'Transfer done but SPIE_DON isn't set!' to 
occur on the T2080RDB.

I've had to add the following to expose the environment as a mtd partition

diff --git a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi 
b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
index ff87e67c70da..fbf95fc1fd68 100644
--- a/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t208xrdb.dtsi
@@ -116,6 +116,15 @@ flash@0 {
     compatible = "micron,n25q512ax3", 
"jedec,spi-nor";
     reg = <0>;
     spi-max-frequency = <1000>; /* 
input clock */
+
+   partition@u-boot {
+    reg = <0x 0x0010>;
+    label = "u-boot";
+    };
+    partition@u-boot-env {
+    reg = <0x0010 0x0001>;
+    label = "u-boot-env";
+    };
     };
     };

And I'm using the following script to poke at the environment (warning 
if anyone does try this and the bug hits it can render your u-boot 
environment invalid).

cat flash/fw_env_test.sh
#!/bin/sh

generate_fw_env_config()
{
   cat /proc/mtd | sed 's/[:"]//g' | while read dev size erasesize name ; do
  echo "$dev $size $erasesize $name"
  [ "$name" = "u-boot-env" ] && echo "/dev/$dev 0x 0x2000 
$erasesize" >/flash/fw_env.config
   done
}

cycles=10
[ $# -ge 

Re: fsl_espi errors on v5.7.15

2020-08-24 Thread Chris Packham

On 25/08/20 10:04 am, Chris Packham wrote:
>
> On 20/08/20 9:08 am, Chris Packham wrote:
>>
>> On 19/08/20 6:15 pm, Heiner Kallweit wrote:
>>> On 19.08.2020 00:44, Chris Packham wrote:
>>>> Hi Again,
>>>>
>>>> On 17/08/20 9:09 am, Chris Packham wrote:
>>>>
>>>>> On 14/08/20 6:19 pm, Heiner Kallweit wrote:
>>>>>> On 14.08.2020 04:48, Chris Packham wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>>>>>>> based system to linux v5.7.15
>>>>>>>
>>>>>>> For this board u-boot and the u-boot environment live on spi-nor.
>>>>>>>
>>>>>>> When I use fw_setenv from userspace I get the following kernel logs
>>>>>>>
>>>>>>> # fw_setenv foo=1
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>> empty!
>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>> empty!
>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>>>>>> empty!
>>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>>> ...
>>>>>>>
>>>>>> This error reporting doesn't exist yet in 4.4. So you may have an 
>>>>>> issue
>>>>>> under 4.4 too, it's just not reported.
>>>>>> Did you verify that under 4.4 fw_setenv actually has an effect?
>>>>> Just double checked and yes under 4.4 the setting does get saved.
>>>>>>> If I run fw_printenv (before getting it into a bad state) it is 
>>>>>>> able to
>>>>>>> display the content of the boards u-boot environment.
>>>>>>>
>>>>>> This might indicate an issue with spi being locked. I've seen 
>>>>>> related
>>>>>> questions, just use the search engine of your choice and check for
>>>>>> fw_setenv and locked.
>>>>> I'm running a version of fw_setenv which includes
>>>>> https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it 
>>>>> shouldn't
>>>>> be locking things unnecessarily.
>>>>>>> If been unsuccessful in producing a setup for bisecting the 
>>>>>>> issue. I do
>>>>>>> know the issue doesn't occur on the old 4.4.x based kernel but 
>>>>>>> that's
>>>>>>> probably not much help.
>>>>>>>
>>>>>>> Any pointers on what the issue (and/or solution) might be.
>>>> I finally managed to get our board running with a vanilla kernel. With
>>>> corenet64_smp_defconfig I occasionally see
>>>>
>>>>     fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>
>>>> other than the message things seem to be working.
>>>>
>>>> With a custom defconfig I see
>>>>
>>>>     fsl_espi ffe11.spi: Transfer done but 

Re: fsl_espi errors on v5.7.15

2020-08-24 Thread Chris Packham

On 20/08/20 9:08 am, Chris Packham wrote:
>
> On 19/08/20 6:15 pm, Heiner Kallweit wrote:
>> On 19.08.2020 00:44, Chris Packham wrote:
>>> Hi Again,
>>>
>>> On 17/08/20 9:09 am, Chris Packham wrote:
>>>
>>>> On 14/08/20 6:19 pm, Heiner Kallweit wrote:
>>>>> On 14.08.2020 04:48, Chris Packham wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>>>>>> based system to linux v5.7.15
>>>>>>
>>>>>> For this board u-boot and the u-boot environment live on spi-nor.
>>>>>>
>>>>>> When I use fw_setenv from userspace I get the following kernel logs
>>>>>>
>>>>>> # fw_setenv foo=1
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>>> ...
>>>>>>
>>>>> This error reporting doesn't exist yet in 4.4. So you may have an 
>>>>> issue
>>>>> under 4.4 too, it's just not reported.
>>>>> Did you verify that under 4.4 fw_setenv actually has an effect?
>>>> Just double checked and yes under 4.4 the setting does get saved.
>>>>>> If I run fw_printenv (before getting it into a bad state) it is 
>>>>>> able to
>>>>>> display the content of the boards u-boot environment.
>>>>>>
>>>>> This might indicate an issue with spi being locked. I've seen related
>>>>> questions, just use the search engine of your choice and check for
>>>>> fw_setenv and locked.
>>>> I'm running a version of fw_setenv which includes
>>>> https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it shouldn't
>>>> be locking things unnecessarily.
>>>>>> If been unsuccessful in producing a setup for bisecting the 
>>>>>> issue. I do
>>>>>> know the issue doesn't occur on the old 4.4.x based kernel but 
>>>>>> that's
>>>>>> probably not much help.
>>>>>>
>>>>>> Any pointers on what the issue (and/or solution) might be.
>>> I finally managed to get our board running with a vanilla kernel. With
>>> corenet64_smp_defconfig I occasionally see
>>>
>>>     fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>
>>> other than the message things seem to be working.
>>>
>>> With a custom defconfig I see
>>>
>>>     fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>     fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't 
>>> empty!
>>>     fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>     ...
>>>
>>> and access to the spi-nor does not work until the board is reset.
>>>
>>> I'll try and pick apart the differences between the two defconfigs.
>
> I now think my earlier testing is invalid. I have seen the problem 
> wi

Re: fsl_espi errors on v5.7.15

2020-08-19 Thread Chris Packham

On 19/08/20 6:15 pm, Heiner Kallweit wrote:
> On 19.08.2020 00:44, Chris Packham wrote:
>> Hi Again,
>>
>> On 17/08/20 9:09 am, Chris Packham wrote:
>>
>>> On 14/08/20 6:19 pm, Heiner Kallweit wrote:
>>>> On 14.08.2020 04:48, Chris Packham wrote:
>>>>> Hi,
>>>>>
>>>>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>>>>> based system to linux v5.7.15
>>>>>
>>>>> For this board u-boot and the u-boot environment live on spi-nor.
>>>>>
>>>>> When I use fw_setenv from userspace I get the following kernel logs
>>>>>
>>>>> # fw_setenv foo=1
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>>>> ...
>>>>>
>>>> This error reporting doesn't exist yet in 4.4. So you may have an issue
>>>> under 4.4 too, it's just not reported.
>>>> Did you verify that under 4.4 fw_setenv actually has an effect?
>>> Just double checked and yes under 4.4 the setting does get saved.
>>>>> If I run fw_printenv (before getting it into a bad state) it is able to
>>>>> display the content of the boards u-boot environment.
>>>>>
>>>> This might indicate an issue with spi being locked. I've seen related
>>>> questions, just use the search engine of your choice and check for
>>>> fw_setenv and locked.
>>> I'm running a version of fw_setenv which includes
>>> https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it shouldn't
>>> be locking things unnecessarily.
>>>>> If been unsuccessful in producing a setup for bisecting the issue. I do
>>>>> know the issue doesn't occur on the old 4.4.x based kernel but that's
>>>>> probably not much help.
>>>>>
>>>>> Any pointers on what the issue (and/or solution) might be.
>> I finally managed to get our board running with a vanilla kernel. With
>> corenet64_smp_defconfig I occasionally see
>>
>>     fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>
>> other than the message things seem to be working.
>>
>> With a custom defconfig I see
>>
>>     fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>     fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>     fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>     ...
>>
>> and access to the spi-nor does not work until the board is reset.
>>
>> I'll try and pick apart the differences between the two defconfigs.

I now think my earlier testing is invalid. I have seen the problem with 
either defconfig if I try hard enough. I had convinced myself that the 
problem was CONFIG_PREEMPT but that was before I found boot-to-boot 
differences with the same kernel.

It's possible that I'm chasing multiple issues with the same symptom.

The error I'm most concerned with is in the sequence
1. boot with old image
2. write environment
3. boot with new image
4. write environment
5. write fails and environment is corrupted

After I recover the system things sometimes seem fine. Until I repeat 
the sequence above.

> Also relevant may be:
> - Which dts are you using?
Custom but based heavily on the t2080rdb.
> - What's the spi-nor type, and at which frequency are you operating it?
The board has several alternate parts for the spi-nor so the dts just 
specifies compatible = "jedec,spi-nor" the actual chip detected on the 
board I have is "n25q032a (4096 Kbytes)". The dts sets spi-max-frequency 
= <1000> I haven't measured the actual frequency on the bus.
> - Does the issue still happen if you lower the frequency?
I did play around with the frequency initially but I should probably 
give that another go now that I have a better reproduction method.

Re: fsl_espi errors on v5.7.15

2020-08-18 Thread Chris Packham
Hi Again,

On 17/08/20 9:09 am, Chris Packham wrote:

>
> On 14/08/20 6:19 pm, Heiner Kallweit wrote:
>> On 14.08.2020 04:48, Chris Packham wrote:
>>> Hi,
>>>
>>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>>> based system to linux v5.7.15
>>>
>>> For this board u-boot and the u-boot environment live on spi-nor.
>>>
>>> When I use fw_setenv from userspace I get the following kernel logs
>>>
>>> # fw_setenv foo=1
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>>> ...
>>>
>> This error reporting doesn't exist yet in 4.4. So you may have an issue
>> under 4.4 too, it's just not reported.
>> Did you verify that under 4.4 fw_setenv actually has an effect?
> Just double checked and yes under 4.4 the setting does get saved.
>>> If I run fw_printenv (before getting it into a bad state) it is able to
>>> display the content of the boards u-boot environment.
>>>
>> This might indicate an issue with spi being locked. I've seen related
>> questions, just use the search engine of your choice and check for
>> fw_setenv and locked.
> I'm running a version of fw_setenv which includes 
> https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it shouldn't 
> be locking things unnecessarily.
>>> If been unsuccessful in producing a setup for bisecting the issue. I do
>>> know the issue doesn't occur on the old 4.4.x based kernel but that's
>>> probably not much help.
>>>
>>> Any pointers on what the issue (and/or solution) might be.

I finally managed to get our board running with a vanilla kernel. With 
corenet64_smp_defconfig I occasionally see

   fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!

other than the message things seem to be working.

With a custom defconfig I see

   fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
   fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
   fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
   ...

and access to the spi-nor does not work until the board is reset.

I'll try and pick apart the differences between the two defconfigs.


Re: fsl_espi errors on v5.7.15

2020-08-16 Thread Chris Packham

On 14/08/20 6:19 pm, Heiner Kallweit wrote:
> On 14.08.2020 04:48, Chris Packham wrote:
>> Hi,
>>
>> I'm seeing a problem with accessing spi-nor after upgrading a T2081
>> based system to linux v5.7.15
>>
>> For this board u-boot and the u-boot environment live on spi-nor.
>>
>> When I use fw_setenv from userspace I get the following kernel logs
>>
>> # fw_setenv foo=1
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>> fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
>> fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
>> ...
>>
> This error reporting doesn't exist yet in 4.4. So you may have an issue
> under 4.4 too, it's just not reported.
> Did you verify that under 4.4 fw_setenv actually has an effect?
Just double checked and yes under 4.4 the setting does get saved.
>> If I run fw_printenv (before getting it into a bad state) it is able to
>> display the content of the boards u-boot environment.
>>
> This might indicate an issue with spi being locked. I've seen related
> questions, just use the search engine of your choice and check for
> fw_setenv and locked.
I'm running a version of fw_setenv which includes 
https://gitlab.denx.de/u-boot/u-boot/-/commit/db820159 so it shouldn't 
be locking things unnecessarily.
>> If been unsuccessful in producing a setup for bisecting the issue. I do
>> know the issue doesn't occur on the old 4.4.x based kernel but that's
>> probably not much help.
>>
>> Any pointers on what the issue (and/or solution) might be.
>>
>> Thanks,
>> Chris
>>
> Heiner

fsl_espi errors on v5.7.15

2020-08-13 Thread Chris Packham
Hi,

I'm seeing a problem with accessing spi-nor after upgrading a T2081 
based system to linux v5.7.15

For this board u-boot and the u-boot environment live on spi-nor.

When I use fw_setenv from userspace I get the following kernel logs

# fw_setenv foo=1
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but SPIE_DON isn't set!
fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
fsl_espi ffe11.spi: Transfer done but rx/tx fifo's aren't empty!
fsl_espi ffe11.spi: SPIE_RXCNT = 1, SPIE_TXCNT = 32
...

If I run fw_printenv (before getting it into a bad state) it is able to 
display the content of the boards u-boot environment.

If been unsuccessful in producing a setup for bisecting the issue. I do 
know the issue doesn't occur on the old 4.4.x based kernel but that's 
probably not much help.

Any pointers on what the issue (and/or solution) might be.

Thanks,
Chris


Re: [PATCH] powerpc/legacy_serial: Use early_ioremap()

2020-08-09 Thread Chris Packham

On 24/03/20 10:54 am, Chris Packham wrote:
> Hi Christophe,
>
> On Wed, 2020-02-05 at 12:03 +, Christophe Leroy wrote:
>> [0.00] ioremap() called early from
>> find_legacy_serial_ports+0x3cc/0x474. Use early_ioremap() instead
>>
> I was just about to dig into this error message and found you patch. I
> applied it to a v5.5 base.
>
>> find_legacy_serial_ports() is called early from setup_arch(), before
>> paging_init(). vmalloc is not available yet, ioremap shouldn't be
>> used that early.
>>
>> Use early_ioremap() and switch to a regular ioremap() later.
>>
>> Signed-off-by: Christophe Leroy 
> On my system (Freescale T2080 SOC) this seems to cause a crash/hang in
> early boot. Unfortunately because this is affecting the boot console I
> don't get any earlyprintk output.

I've been doing a bit more digging into why Christophe's patch didn't 
work for me. I noticed the powerpc specific early_ioremap_range() 
returns addresses around ioremap_bot. Yet the generic early_ioremap() 
uses addresses around FIXADDR_TOP. If I try the following hack I can 
make Christophe's patch work

diff --git a/arch/powerpc/include/asm/fixmap.h 
b/arch/powerpc/include/asm/fixmap.h
index 2ef155a3c821..7bc2f3f73c8b 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -27,7 +27,7 @@
  #include 
  #define FIXADDR_TOP    (KASAN_SHADOW_START - PAGE_SIZE)
  #else
-#define FIXADDR_TOP    ((unsigned long)(-PAGE_SIZE))
+#define FIXADDR_TOP    (IOREMAP_END - PAGE_SIZE)
  #endif

  /*

I'll admit to being out of my depth. It seems that the generic 
early_ioremap() is not quite correctly plumbed in for powerpc.

>> ---
>>   arch/powerpc/kernel/legacy_serial.c | 33 +
>> 
>>   1 file changed, 29 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/legacy_serial.c
>> b/arch/powerpc/kernel/legacy_serial.c
>> index f061e06e9f51..8b2c1a8553a0 100644
>> --- a/arch/powerpc/kernel/legacy_serial.c
>> +++ b/arch/powerpc/kernel/legacy_serial.c
>> @@ -15,6 +15,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>   
>>   #undef DEBUG
>>   
>> @@ -34,6 +35,7 @@ static struct legacy_serial_info {
>>  unsigned intclock;
>>  int irq_check_parent;
>>  phys_addr_t taddr;
>> +void __iomem*early_addr;
>>   } legacy_serial_infos[MAX_LEGACY_SERIAL_PORTS];
>>   
>>   static const struct of_device_id legacy_serial_parents[] __initconst
>> = {
>> @@ -325,17 +327,16 @@ static void __init
>> setup_legacy_serial_console(int console)
>>   {
>>  struct legacy_serial_info *info =
>> _serial_infos[console];
>>  struct plat_serial8250_port *port =
>> _serial_ports[console];
>> -void __iomem *addr;
>>  unsigned int stride;
>>   
>>  stride = 1 << port->regshift;
>>   
>>  /* Check if a translated MMIO address has been found */
>>  if (info->taddr) {
>> -addr = ioremap(info->taddr, 0x1000);
>> -if (addr == NULL)
>> +info->early_addr = early_ioremap(info->taddr, 0x1000);
>> +if (info->early_addr == NULL)
>>  return;
>> -udbg_uart_init_mmio(addr, stride);
>> +udbg_uart_init_mmio(info->early_addr, stride);
>>  } else {
>>  /* Check if it's PIO and we support untranslated PIO */
>>  if (port->iotype == UPIO_PORT && isa_io_special)
>> @@ -353,6 +354,30 @@ static void __init
>> setup_legacy_serial_console(int console)
>>  udbg_uart_setup(info->speed, info->clock);
>>   }
>>   
>> +static int __init ioremap_legacy_serial_console(void)
>> +{
>> +struct legacy_serial_info *info =
>> _serial_infos[legacy_serial_console];
>> +struct plat_serial8250_port *port =
>> _serial_ports[legacy_serial_console];
>> +void __iomem *vaddr;
>> +
>> +if (legacy_serial_console < 0)
>> +return 0;
>> +
>> +if (!info->early_addr)
>> +return 0;
>> +
>> +vaddr = ioremap(info->taddr, 0x1000);
>> +if (WARN_ON(!vaddr))
>> +return -ENOMEM;
>> +
>> +udbg_uart_init_mmio(vaddr, 1 << port->regshift);
>> +early_iounmap(info->early_addr, 0x1000);
>> +info->early_addr = NULL;
>> +
>> +return 0;
>> +}
>> +early_initcall(ioremap_legacy_serial_console);
>> +
>>   /*
>>* This is called very early, as part of setup_system() or
>> eventually
>>* setup_arch(), basically before anything else in this file. This
>> function

Re: OF: Can't handle multiple dma-ranges with different offsets

2020-07-30 Thread Chris Packham

On 23/07/20 10:11 am, Chris Packham wrote:
>
> On 22/07/20 4:19 pm, Chris Packham wrote:
>> Hi,
>>
>> I've just fired up linux kernel v5.7 on a p2040 based system and I'm 
>> getting the following new warning
>>
>> OF: Can't handle multiple dma-ranges with different offsets on 
>> node(/pcie@ffe202000)
>> OF: Can't handle multiple dma-ranges with different offsets on 
>> node(/pcie@ffe202000)
>>
>> The warning itself was added in commit 9d55bebd9816 ("of/address: 
>> Support multiple 'dma-ranges' entries") but I gather it's pointing 
>> out something about the dts. My boards dts is based heavily on 
>> p2041rdb.dts and the relevant pci2 section is identical (reproduced 
>> below for reference).
>>
>>     pci2: pcie@ffe202000 {
>>         reg = <0xf 0xfe202000 0 0x1000>;
>>         ranges = <0x0200 0 0xe000 0xc 0x4000 0 0x2000
>>               0x0100 0 0x 0xf 0xf802 0 0x0001>;
>>         pcie@0 {
>>             ranges = <0x0200 0 0xe000
>>                   0x0200 0 0xe000
>>                   0 0x2000
>>
>>                   0x0100 0 0x
>>                   0x0100 0 0x
>>                   0 0x0001>;
>>         };
>>     };
>>
>> I haven't noticed any ill effect (aside from the scary message). I'm 
>> not sure if there's something missing in the dts or in the code that 
>> checks the ranges. Any guidance would be appreciated.
>
> I've also just checked the T2080RDB on v5.7.9 which shows a similar issue
>
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe25)
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe25)
> pcieport :00:00.0: Invalid size 0xf9 for dma-range
> pcieport :00:00.0: AER: enabled with IRQ 21
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe27)
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe27)
> pcieport 0001:00:00.0: Invalid size 0xf9 for dma-range
> pcieport 0001:00:00.0: AER: enabled with IRQ 23

I've been doing a bit more digging. The dma-ranges property is not in 
the dts/dtb. It's actually inserted by u-boot via ft_fsl_pci_setup().

Here's some output from my T2080RDB

root@linuxbox ~]# xxd -g4 
/sys/firmware/devicetree/base/pcie@ffe24/dma-ranges
000: 0200  df07 000f  
010: fe00  00f9 4200  B...
020:      
030:  df07 4300 0010  C...
040:    0001  
050:  

I'm still wondering how best to deal with this. Hopefully without 
needing to deploy a u-boot update.


Re: OF: Can't handle multiple dma-ranges with different offsets

2020-07-22 Thread Chris Packham

On 22/07/20 4:19 pm, Chris Packham wrote:
> Hi,
>
> I've just fired up linux kernel v5.7 on a p2040 based system and I'm 
> getting the following new warning
>
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe202000)
> OF: Can't handle multiple dma-ranges with different offsets on 
> node(/pcie@ffe202000)
>
> The warning itself was added in commit 9d55bebd9816 ("of/address: 
> Support multiple 'dma-ranges' entries") but I gather it's pointing out 
> something about the dts. My boards dts is based heavily on 
> p2041rdb.dts and the relevant pci2 section is identical (reproduced 
> below for reference).
>
>     pci2: pcie@ffe202000 {
>         reg = <0xf 0xfe202000 0 0x1000>;
>         ranges = <0x0200 0 0xe000 0xc 0x4000 0 0x2000
>               0x0100 0 0x 0xf 0xf802 0 0x0001>;
>         pcie@0 {
>             ranges = <0x0200 0 0xe000
>                   0x0200 0 0xe000
>                   0 0x2000
>
>                   0x0100 0 0x
>                   0x0100 0 0x
>                   0 0x0001>;
>         };
>     };
>
> I haven't noticed any ill effect (aside from the scary message). I'm 
> not sure if there's something missing in the dts or in the code that 
> checks the ranges. Any guidance would be appreciated.

I've also just checked the T2080RDB on v5.7.9 which shows a similar issue

OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe25)
OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe25)
pcieport :00:00.0: Invalid size 0xf9 for dma-range
pcieport :00:00.0: AER: enabled with IRQ 21
OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe27)
OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe27)
pcieport 0001:00:00.0: Invalid size 0xf9 for dma-range
pcieport 0001:00:00.0: AER: enabled with IRQ 23




OF: Can't handle multiple dma-ranges with different offsets

2020-07-21 Thread Chris Packham
Hi,

I've just fired up linux kernel v5.7 on a p2040 based system and I'm 
getting the following new warning

OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe202000)
OF: Can't handle multiple dma-ranges with different offsets on 
node(/pcie@ffe202000)

The warning itself was added in commit 9d55bebd9816 ("of/address: 
Support multiple 'dma-ranges' entries") but I gather it's pointing out 
something about the dts. My boards dts is based heavily on p2041rdb.dts 
and the relevant pci2 section is identical (reproduced below for reference).

     pci2: pcie@ffe202000 {
         reg = <0xf 0xfe202000 0 0x1000>;
         ranges = <0x0200 0 0xe000 0xc 0x4000 0 0x2000
               0x0100 0 0x 0xf 0xf802 0 0x0001>;
         pcie@0 {
             ranges = <0x0200 0 0xe000
                   0x0200 0 0xe000
                   0 0x2000

                   0x0100 0 0x
                   0x0100 0 0x
                   0 0x0001>;
         };
     };

I haven't noticed any ill effect (aside from the scary message). I'm not 
sure if there's something missing in the dts or in the code that checks 
the ranges. Any guidance would be appreciated.

Thanks,
Chris


[PATCH v3 2/2] powerpc: configs: remove CMDLINE_BOOL

2020-06-11 Thread Chris Packham
Regenerate defconfigs to remove CONFIG_CMDLINE_BOOL and the default
CONFIG_CMDLINE where applicable.

Signed-off-by: Chris Packham 
---
Changes in v3:
- new

 arch/powerpc/configs/44x/akebono_defconfig | 2 --
 arch/powerpc/configs/44x/arches_defconfig  | 2 --
 arch/powerpc/configs/44x/bamboo_defconfig  | 2 --
 arch/powerpc/configs/44x/bluestone_defconfig   | 2 --
 arch/powerpc/configs/44x/canyonlands_defconfig | 2 --
 arch/powerpc/configs/44x/currituck_defconfig   | 2 --
 arch/powerpc/configs/44x/eiger_defconfig   | 2 --
 arch/powerpc/configs/44x/fsp2_defconfig| 1 -
 arch/powerpc/configs/44x/icon_defconfig| 2 --
 arch/powerpc/configs/44x/iss476-smp_defconfig  | 1 -
 arch/powerpc/configs/44x/katmai_defconfig  | 2 --
 arch/powerpc/configs/44x/rainier_defconfig | 2 --
 arch/powerpc/configs/44x/redwood_defconfig | 2 --
 arch/powerpc/configs/44x/sam440ep_defconfig| 2 --
 arch/powerpc/configs/44x/sequoia_defconfig | 2 --
 arch/powerpc/configs/44x/taishan_defconfig | 2 --
 arch/powerpc/configs/44x/warp_defconfig| 1 -
 arch/powerpc/configs/holly_defconfig   | 1 -
 arch/powerpc/configs/mvme5100_defconfig| 3 +--
 arch/powerpc/configs/ps3_defconfig | 2 --
 arch/powerpc/configs/skiroot_defconfig | 1 -
 arch/powerpc/configs/storcenter_defconfig  | 1 -
 22 files changed, 1 insertion(+), 38 deletions(-)

diff --git a/arch/powerpc/configs/44x/akebono_defconfig 
b/arch/powerpc/configs/44x/akebono_defconfig
index 7705a5c3f4ea..60d5fa2c3b93 100644
--- a/arch/powerpc/configs/44x/akebono_defconfig
+++ b/arch/powerpc/configs/44x/akebono_defconfig
@@ -19,8 +19,6 @@ CONFIG_HIGHMEM=y
 CONFIG_HZ_100=y
 CONFIG_IRQ_ALL_CPUS=y
 # CONFIG_COMPACTION is not set
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 # CONFIG_SUSPEND is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
diff --git a/arch/powerpc/configs/44x/arches_defconfig 
b/arch/powerpc/configs/44x/arches_defconfig
index 82c6f49b8dcb..41d04e70d4fb 100644
--- a/arch/powerpc/configs/44x/arches_defconfig
+++ b/arch/powerpc/configs/44x/arches_defconfig
@@ -11,8 +11,6 @@ CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_EBONY is not set
 CONFIG_ARCHES=y
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/configs/44x/bamboo_defconfig 
b/arch/powerpc/configs/44x/bamboo_defconfig
index 679213214a75..acbce718eaa8 100644
--- a/arch/powerpc/configs/44x/bamboo_defconfig
+++ b/arch/powerpc/configs/44x/bamboo_defconfig
@@ -9,8 +9,6 @@ CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_BAMBOO=y
 # CONFIG_EBONY is not set
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/configs/44x/bluestone_defconfig 
b/arch/powerpc/configs/44x/bluestone_defconfig
index 8006a5728afd..37088f250c9e 100644
--- a/arch/powerpc/configs/44x/bluestone_defconfig
+++ b/arch/powerpc/configs/44x/bluestone_defconfig
@@ -11,8 +11,6 @@ CONFIG_EXPERT=y
 # CONFIG_COMPAT_BRK is not set
 CONFIG_BLUESTONE=y
 # CONFIG_EBONY is not set
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/configs/44x/canyonlands_defconfig 
b/arch/powerpc/configs/44x/canyonlands_defconfig
index ccc14eb7a2f1..61776ade572b 100644
--- a/arch/powerpc/configs/44x/canyonlands_defconfig
+++ b/arch/powerpc/configs/44x/canyonlands_defconfig
@@ -11,8 +11,6 @@ CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_EBONY is not set
 CONFIG_CANYONLANDS=y
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 CONFIG_NET=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/configs/44x/currituck_defconfig 
b/arch/powerpc/configs/44x/currituck_defconfig
index be76e066df01..34c86b3abecb 100644
--- a/arch/powerpc/configs/44x/currituck_defconfig
+++ b/arch/powerpc/configs/44x/currituck_defconfig
@@ -17,8 +17,6 @@ CONFIG_HIGHMEM=y
 CONFIG_HZ_100=y
 CONFIG_MATH_EMULATION=y
 CONFIG_IRQ_ALL_CPUS=y
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 # CONFIG_SUSPEND is not set
 CONFIG_NET=y
 CONFIG_PACKET=y
diff --git a/arch/powerpc/configs/44x/eiger_defconfig 
b/arch/powerpc/configs/44x/eiger_defconfig
index 1abaa63e067f..509300f400e2 100644
--- a/arch/powerpc/configs/44x/eiger_defconfig
+++ b/arch/powerpc/configs/44x/eiger_defconfig
@@ -10,8 +10,6 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_EBONY is not set
 CONFIG_EIGER=y
-CONFIG_CMDLINE_BOOL=y
-CONFIG_CMDLINE=""
 CONFIG_PCIEPORTBUS=y
 # CONFIG_PCIEASPM is not set
 CONFIG_NET=y
diff --git a/arch/powerpc/configs/44x/fsp2_defconfig 
b/arch/powerpc/configs/44x/fsp2_defconfig
index e67fc041ca3e..30845ce0885a 100644
--- a/arch/powerpc/configs/44x/fsp2_defconfig
+++ b/arch/powerpc/configs/44x/fsp2_defconfig
@@ -28,7 +28,6 @@ CONFIG_476FPE_ERR46=y
 CONFIG_SWIOTLB=y
 CONFIG_KEXEC=y
 CONFIG_CRASH_DUMP=y
-CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE="ip=on 

[PATCH v3 1/2] powerpc: Remove inaccessible CMDLINE default

2020-06-11 Thread Chris Packham
Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef mess")
CONFIG_CMDLINE has always had a value regardless of CONFIG_CMDLINE_BOOL.

For example:

 $ make ARCH=powerpc defconfig
 $ cat .config
 # CONFIG_CMDLINE_BOOL is not set
 CONFIG_CMDLINE=""

When enabling CONFIG_CMDLINE_BOOL this value is kept making the 'default
"..." if CONFIG_CMDLINE_BOOL' ineffective.

 $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
 $ cat .config
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE=""

Remove CONFIG_CMDLINE_BOOL and the inaccessible default.

Signed-off-by: Chris Packham 
Reviewed-by: Christophe Leroy 
---

Changes in v3:
- none

Changes in v2:
- Rebase on top of Linus's tree
- Fix some typos in commit message
- Add review from Christophe
- Remove CONFIG_CMDLINE_BOOL

 arch/powerpc/Kconfig | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fa23eb320ff..51abc59c3334 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -859,12 +859,8 @@ config PPC_DENORMALISATION
  Add support for handling denormalisation of single precision
  values.  Useful for bare metal only.  If unsure say Y here.
 
-config CMDLINE_BOOL
-   bool "Default bootloader kernel arguments"
-
 config CMDLINE
-   string "Initial kernel command string" if CMDLINE_BOOL
-   default "console=ttyS0,9600 console=tty0 root=/dev/sda2" if CMDLINE_BOOL
+   string "Initial kernel command string"
default ""
help
  On some platforms, there is currently no way for the boot loader to
-- 
2.27.0



[PATCH v3 0/2] powerpc: CMDLINE config cleanup

2020-06-11 Thread Chris Packham
This series cleans up the config options related to the boot command line.

Chris Packham (2):
  powerpc: Remove inaccessible CMDLINE default
  powerpc: configs: remove CMDLINE_BOOL

 arch/powerpc/Kconfig   | 6 +-
 arch/powerpc/configs/44x/akebono_defconfig | 2 --
 arch/powerpc/configs/44x/arches_defconfig  | 2 --
 arch/powerpc/configs/44x/bamboo_defconfig  | 2 --
 arch/powerpc/configs/44x/bluestone_defconfig   | 2 --
 arch/powerpc/configs/44x/canyonlands_defconfig | 2 --
 arch/powerpc/configs/44x/currituck_defconfig   | 2 --
 arch/powerpc/configs/44x/eiger_defconfig   | 2 --
 arch/powerpc/configs/44x/fsp2_defconfig| 1 -
 arch/powerpc/configs/44x/icon_defconfig| 2 --
 arch/powerpc/configs/44x/iss476-smp_defconfig  | 1 -
 arch/powerpc/configs/44x/katmai_defconfig  | 2 --
 arch/powerpc/configs/44x/rainier_defconfig | 2 --
 arch/powerpc/configs/44x/redwood_defconfig | 2 --
 arch/powerpc/configs/44x/sam440ep_defconfig| 2 --
 arch/powerpc/configs/44x/sequoia_defconfig | 2 --
 arch/powerpc/configs/44x/taishan_defconfig | 2 --
 arch/powerpc/configs/44x/warp_defconfig| 1 -
 arch/powerpc/configs/holly_defconfig   | 1 -
 arch/powerpc/configs/mvme5100_defconfig| 3 +--
 arch/powerpc/configs/ps3_defconfig | 2 --
 arch/powerpc/configs/skiroot_defconfig | 1 -
 arch/powerpc/configs/storcenter_defconfig  | 1 -
 23 files changed, 2 insertions(+), 43 deletions(-)

-- 
2.27.0



Re: [PATCH v2] powerpc: Remove inaccessible CMDLINE default

2020-06-11 Thread Chris Packham

On 11/06/20 5:46 pm, Christophe Leroy wrote:
>
>
> Le 11/06/2020 à 05:41, Chris Packham a écrit :
>> Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef mess")
>> CONFIG_CMDLINE has always had a value regardless of CONFIG_CMDLINE_BOOL.
>>
>> For example:
>>
>>   $ make ARCH=powerpc defconfig
>>   $ cat .config
>>   # CONFIG_CMDLINE_BOOL is not set
>>   CONFIG_CMDLINE=""
>>
>> When enabling CONFIG_CMDLINE_BOOL this value is kept making the 'default
>> "..." if CONFIG_CMDLINE_BOOL' ineffective.
>>
>>   $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
>>   $ cat .config
>>   CONFIG_CMDLINE_BOOL=y
>>   CONFIG_CMDLINE=""
>>
>> Remove CONFIG_CMDLINE_BOOL and the inaccessible default.
>
> You also have to remove all CONFIG_CMDLINE_BOOL from the defconfigs

OK. I'll do so as a follow-up patch and send a v3.

>
> Christophe
>
>>
>> Signed-off-by: Chris Packham 
>> Reviewed-by: Christophe Leroy 
>> ---
>> It took me a while to get round to sending a v2, for a refresher v1 
>> can be found here:
>>
>> http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20190802050232.22978-1-chris.pack...@alliedtelesis.co.nz/
>>  
>>
>>
>> Changes in v2:
>> - Rebase on top of Linus's tree
>> - Fix some typos in commit message
>> - Add review from Christophe
>> - Remove CONFIG_CMDLINE_BOOL
>>
>>   arch/powerpc/Kconfig | 6 +-
>>   1 file changed, 1 insertion(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 9fa23eb320ff..51abc59c3334 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -859,12 +859,8 @@ config PPC_DENORMALISATION
>>     Add support for handling denormalisation of single precision
>>     values.  Useful for bare metal only.  If unsure say Y here.
>>   -config CMDLINE_BOOL
>> -    bool "Default bootloader kernel arguments"
>> -
>>   config CMDLINE
>> -    string "Initial kernel command string" if CMDLINE_BOOL
>> -    default "console=ttyS0,9600 console=tty0 root=/dev/sda2" if 
>> CMDLINE_BOOL
>> +    string "Initial kernel command string"
>>   default ""
>>   help
>>     On some platforms, there is currently no way for the boot 
>> loader to
>>

[PATCH v2] powerpc: Remove inaccessible CMDLINE default

2020-06-10 Thread Chris Packham
Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef mess")
CONFIG_CMDLINE has always had a value regardless of CONFIG_CMDLINE_BOOL.

For example:

 $ make ARCH=powerpc defconfig
 $ cat .config
 # CONFIG_CMDLINE_BOOL is not set
 CONFIG_CMDLINE=""

When enabling CONFIG_CMDLINE_BOOL this value is kept making the 'default
"..." if CONFIG_CMDLINE_BOOL' ineffective.

 $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
 $ cat .config
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE=""

Remove CONFIG_CMDLINE_BOOL and the inaccessible default.

Signed-off-by: Chris Packham 
Reviewed-by: Christophe Leroy 
---
It took me a while to get round to sending a v2, for a refresher v1 can be 
found here:

http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20190802050232.22978-1-chris.pack...@alliedtelesis.co.nz/

Changes in v2:
- Rebase on top of Linus's tree
- Fix some typos in commit message
- Add review from Christophe
- Remove CONFIG_CMDLINE_BOOL

 arch/powerpc/Kconfig | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fa23eb320ff..51abc59c3334 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -859,12 +859,8 @@ config PPC_DENORMALISATION
  Add support for handling denormalisation of single precision
  values.  Useful for bare metal only.  If unsure say Y here.
 
-config CMDLINE_BOOL
-   bool "Default bootloader kernel arguments"
-
 config CMDLINE
-   string "Initial kernel command string" if CMDLINE_BOOL
-   default "console=ttyS0,9600 console=tty0 root=/dev/sda2" if CMDLINE_BOOL
+   string "Initial kernel command string"
default ""
help
  On some platforms, there is currently no way for the boot loader to
-- 
2.27.0



[PATCH v3] powerpc/setup_64: Set cache-line-size based on cache-block-size

2020-04-16 Thread Chris Packham
If {i,d}-cache-block-size is set and {i,d}-cache-line-size is not, use
the block-size value for both. Per the devicetree spec cache-line-size
is only needed if it differs from the block size.

Originally the code would fallback from block size to line size. An
error message was printed if both properties were missing.

Later the code was refactored to use clearer names and logic but it
inadvertently made line size a required property. This caused the
default values to be used and in turn leads to Power9 systems using the
wrong size.

Fixes: bd067f83b084 ("powerpc/64: Fix naming of cache block vs. cache line")
Signed-off-by: Chris Packham 
---
It looks as though the bsizep = lsizep is not required per the spec but it's
probably safer to retain it.

Changes in v3:
- Rebase against 5.7.0-rc1
- Add Fixes tag
- Add more information to commit message

Changes in v2:
- Scott pointed out that u-boot should be filling in the cache properties
  (which it does). But it does not specify a cache-line-size because it
  provides a cache-block-size and the spec says you don't have to if they are
  the same. So the error is in the parsing not in the devicetree itself.

 arch/powerpc/kernel/setup_64.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 438a9befce41..8105010b0e76 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -534,6 +534,8 @@ static bool __init parse_cache_info(struct device_node *np,
lsizep = of_get_property(np, propnames[3], NULL);
if (bsizep == NULL)
bsizep = lsizep;
+   if (lsizep == NULL)
+   lsizep = bsizep;
if (lsizep != NULL)
lsize = be32_to_cpu(*lsizep);
if (bsizep != NULL)
-- 
2.25.1



Re: [PATCH v2] powerpc/setup_64: Set cache-line-size based on cache-block-size

2020-04-16 Thread Chris Packham
On Thu, 2020-04-16 at 21:43 +1000, Michael Ellerman wrote:
> Chris Packham  writes:
> > Hi All,
> > 
> > On Wed, 2020-03-25 at 16:18 +1300, Chris Packham wrote:
> > > If {i,d}-cache-block-size is set and {i,d}-cache-line-size is
> > > not,
> > > use
> > > the block-size value for both. Per the devicetree spec cache-
> > > line-
> > > size
> > > is only needed if it differs from the block size.
> > > 
> > > Signed-off-by: Chris Packham 
> > > ---
> > > It looks as though the bsizep = lsizep is not required per the
> > > spec
> > > but it's
> > > probably safer to retain it.
> > > 
> > > Changes in v2:
> > > - Scott pointed out that u-boot should be filling in the cache
> > > properties
> > >   (which it does). But it does not specify a cache-line-size
> > > because
> > > it
> > >   provides a cache-block-size and the spec says you don't have to
> > > if
> > > they are
> > >   the same. So the error is in the parsing not in the devicetree
> > > itself.
> > > 
> > 
> > Ping? This thread went kind of quiet.
> 
> I replied in the other thread:
> 
>   
> https://lore.kernel.org/linuxppc-dev/87369xx99u@mpe.ellerman.id.au/
> 
> But then the merge window happened which is a busy time.
> 

Yeah I figured that was the case.

> What I'd really like is a v3 that incorporates the info I wrote in
> the
> other thread and a Fixes tag.
> 
> If you feel like doing that, that would be great. Otherwise I'll do
> it
> tomorrow.

I'll rebase against Linus's tree and have a go a adding some more words
to the commit message.

> 
> cheers


Re: [PATCH v2] powerpc/setup_64: Set cache-line-size based on cache-block-size

2020-04-15 Thread Chris Packham
Hi All,

On Wed, 2020-03-25 at 16:18 +1300, Chris Packham wrote:
> If {i,d}-cache-block-size is set and {i,d}-cache-line-size is not,
> use
> the block-size value for both. Per the devicetree spec cache-line-
> size
> is only needed if it differs from the block size.
> 
> Signed-off-by: Chris Packham 
> ---
> It looks as though the bsizep = lsizep is not required per the spec
> but it's
> probably safer to retain it.
> 
> Changes in v2:
> - Scott pointed out that u-boot should be filling in the cache
> properties
>   (which it does). But it does not specify a cache-line-size because
> it
>   provides a cache-block-size and the spec says you don't have to if
> they are
>   the same. So the error is in the parsing not in the devicetree
> itself.
> 

Ping? This thread went kind of quiet.

>  arch/powerpc/kernel/setup_64.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/setup_64.c
> b/arch/powerpc/kernel/setup_64.c
> index e05e6dd67ae6..dd8a238b54b8 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -516,6 +516,8 @@ static bool __init parse_cache_info(struct
> device_node *np,
>   lsizep = of_get_property(np, propnames[3], NULL);
>   if (bsizep == NULL)
>   bsizep = lsizep;
> + if (lsizep == NULL)
> + lsizep = bsizep;
>   if (lsizep != NULL)
>   lsize = be32_to_cpu(*lsizep);
>   if (bsizep != NULL)


[PATCH v2] powerpc/setup_64: Set cache-line-size based on cache-block-size

2020-03-24 Thread Chris Packham
If {i,d}-cache-block-size is set and {i,d}-cache-line-size is not, use
the block-size value for both. Per the devicetree spec cache-line-size
is only needed if it differs from the block size.

Signed-off-by: Chris Packham 
---
It looks as though the bsizep = lsizep is not required per the spec but it's
probably safer to retain it.

Changes in v2:
- Scott pointed out that u-boot should be filling in the cache properties
  (which it does). But it does not specify a cache-line-size because it
  provides a cache-block-size and the spec says you don't have to if they are
  the same. So the error is in the parsing not in the devicetree itself.

 arch/powerpc/kernel/setup_64.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index e05e6dd67ae6..dd8a238b54b8 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -516,6 +516,8 @@ static bool __init parse_cache_info(struct device_node *np,
lsizep = of_get_property(np, propnames[3], NULL);
if (bsizep == NULL)
bsizep = lsizep;
+   if (lsizep == NULL)
+   lsizep = bsizep;
if (lsizep != NULL)
lsize = be32_to_cpu(*lsizep);
if (bsizep != NULL)
-- 
2.25.1



Re: [PATCH] powerpc/fsl: Add cache properties for T2080/T2081

2020-03-24 Thread Chris Packham
On Wed, 2020-03-25 at 15:38 +1300, Chris Packham wrote:
> On Tue, 2020-03-24 at 21:08 -0500, Scott Wood wrote:
> > On Wed, 2020-03-25 at 12:59 +1100, Michael Ellerman wrote:
> > > Chris Packham  writes:
> > > > Add the d-cache/i-cache properties for the T208x SoCs. The L1
> > > > cache on
> > > > these SoCs is 32KiB and is split into 64 byte blocks (lines).
> > > > 
> > > > Signed-off-by: Chris Packham  > > > >
> > > > ---
> > > >  arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi | 16
> > > > 
> > > >  1 file changed, 16 insertions(+)
> > > 
> > > LGTM.
> > > 
> > > I'll wait a few days to see if Scott wants to ack it.
> > > 
> > > cheers
> > > 
> > > 
> > > > diff --git a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > > b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > > index 3f745de44284..2ad27e16ac16 100644
> > > > --- a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > > +++ b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > > @@ -81,6 +81,10 @@ cpus {
> > > > cpu0: PowerPC,e6500@0 {
> > > > device_type = "cpu";
> > > > reg = <0 1>;
> > > > +   d-cache-line-size = <64>;
> > > > +   i-cache-line-size = <64>;
> > > > +   d-cache-size = <32768>;
> > > > +   i-cache-size = <32768>;
> > > > clocks = < 1 0>;
> > > > next-level-cache = <_1>;
> > > > fsl,portid-mapping = <0x8000>;
> > 
> > U-Boot should be setting d/i-cache-size and d/i-cache-block-size --
> > are you
> > using something else?
> 
> Nope it is u-boot specifically
> 
> U-Boot 2017.01-rc3-dirty
> 
> I'm pretty sure the '-dirty' is just a change to use a different
> cross-
> compiler but I can't confirm and I'm a little hesitant to try
> updating
> as I've only got remote access to the board right now.
> 
> > 
> > The line size is the same as the block size so we don't need a
> > separate d/i-
> > cache-line-size.
> > 
> 
> I'm not sure that'll work looking at the code[1]. It has logic to set
> bsizep to lsizep if no block size is set but not the other way round.
> Looking at the spec from devicetree.org this actually seems wrong.
> Perhaps that is the real source of the error.

Sure enough without my change

# ls /sys/firmware/devicetree/base/cpus/PowerPC,e6500@0/
bus-frequency   d-cache-sizename
cache-stash-id  device_type next-level-cache
clock-frequency enable-method   phandle
clocks  fsl,portid-mapping  reg
cpu-release-addri-cache-block-size  status
d-cache-block-size  i-cache-setstimebase-frequency
d-cache-setsi-cache-size

So it's the lack of handling the optional line-size. Different patch
incomming.

> 
> --
> [1] - 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/setup_64.c#n510
> 
> 


Re: [PATCH] powerpc/fsl: Add cache properties for T2080/T2081

2020-03-24 Thread Chris Packham
On Tue, 2020-03-24 at 21:08 -0500, Scott Wood wrote:
> On Wed, 2020-03-25 at 12:59 +1100, Michael Ellerman wrote:
> > Chris Packham  writes:
> > > Add the d-cache/i-cache properties for the T208x SoCs. The L1
> > > cache on
> > > these SoCs is 32KiB and is split into 64 byte blocks (lines).
> > > 
> > > Signed-off-by: Chris Packham 
> > > ---
> > >  arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi | 16 
> > >  1 file changed, 16 insertions(+)
> > 
> > LGTM.
> > 
> > I'll wait a few days to see if Scott wants to ack it.
> > 
> > cheers
> > 
> > 
> > > diff --git a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > index 3f745de44284..2ad27e16ac16 100644
> > > --- a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > +++ b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
> > > @@ -81,6 +81,10 @@ cpus {
> > >   cpu0: PowerPC,e6500@0 {
> > >   device_type = "cpu";
> > >   reg = <0 1>;
> > > + d-cache-line-size = <64>;
> > > + i-cache-line-size = <64>;
> > > + d-cache-size = <32768>;
> > > + i-cache-size = <32768>;
> > >   clocks = < 1 0>;
> > >   next-level-cache = <_1>;
> > >   fsl,portid-mapping = <0x8000>;
> 
> U-Boot should be setting d/i-cache-size and d/i-cache-block-size --
> are you
> using something else?

Nope it is u-boot specifically

U-Boot 2017.01-rc3-dirty

I'm pretty sure the '-dirty' is just a change to use a different cross-
compiler but I can't confirm and I'm a little hesitant to try updating
as I've only got remote access to the board right now.

> 
> The line size is the same as the block size so we don't need a
> separate d/i-
> cache-line-size.
> 

I'm not sure that'll work looking at the code[1]. It has logic to set
bsizep to lsizep if no block size is set but not the other way round.
Looking at the spec from devicetree.org this actually seems wrong.
Perhaps that is the real source of the error.

--
[1] - 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/setup_64.c#n510




[PATCH] powerpc/fsl: Add cache properties for T2080/T2081

2020-03-24 Thread Chris Packham
Add the d-cache/i-cache properties for the T208x SoCs. The L1 cache on
these SoCs is 32KiB and is split into 64 byte blocks (lines).

Signed-off-by: Chris Packham 
---
 arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi 
b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
index 3f745de44284..2ad27e16ac16 100644
--- a/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
+++ b/arch/powerpc/boot/dts/fsl/t208xsi-pre.dtsi
@@ -81,6 +81,10 @@ cpus {
cpu0: PowerPC,e6500@0 {
device_type = "cpu";
reg = <0 1>;
+   d-cache-line-size = <64>;
+   i-cache-line-size = <64>;
+   d-cache-size = <32768>;
+   i-cache-size = <32768>;
clocks = < 1 0>;
next-level-cache = <_1>;
fsl,portid-mapping = <0x8000>;
@@ -88,6 +92,10 @@ cpu0: PowerPC,e6500@0 {
cpu1: PowerPC,e6500@2 {
device_type = "cpu";
reg = <2 3>;
+   d-cache-line-size = <64>;
+   i-cache-line-size = <64>;
+   d-cache-size = <32768>;
+   i-cache-size = <32768>;
clocks = < 1 0>;
next-level-cache = <_1>;
fsl,portid-mapping = <0x8000>;
@@ -95,6 +103,10 @@ cpu1: PowerPC,e6500@2 {
cpu2: PowerPC,e6500@4 {
device_type = "cpu";
reg = <4 5>;
+   d-cache-line-size = <64>;
+   i-cache-line-size = <64>;
+   d-cache-size = <32768>;
+   i-cache-size = <32768>;
clocks = < 1 0>;
next-level-cache = <_1>;
fsl,portid-mapping = <0x8000>;
@@ -102,6 +114,10 @@ cpu2: PowerPC,e6500@4 {
cpu3: PowerPC,e6500@6 {
device_type = "cpu";
reg = <6 7>;
+   d-cache-line-size = <64>;
+   i-cache-line-size = <64>;
+   d-cache-size = <32768>;
+   i-cache-size = <32768>;
clocks = < 1 0>;
next-level-cache = <_1>;
fsl,portid-mapping = <0x8000>;
-- 
2.25.1



Re: Argh, can't find dcache properties !

2020-03-24 Thread Chris Packham
On Tue, 2020-03-24 at 15:47 +1100, Michael Ellerman wrote:
> Chris Packham  writes:
> > Hi All,
> > 
> > Just booting up v5.5.11 on a Freescale T2080RDB and I'm seeing the
> > following mesage.
> > 
> > kern.warning linuxbox kernel: Argh, can't find dcache properties !
> > kern.warning linuxbox kernel: Argh, can't find icache properties !
> > 
> > This was changed from DBG() to pr_warn() in commit 3b9176e9a874
> > ("powerpc/setup_64: fix -Wempty-body warnings") but the message
> > seems
> > to be much older than that. So it's probably been an issue on the
> > T2080
> > (and other QorIQ SoCs) for a while.
> 
> That's an e6500 I think? So 64-bit Book3E.
> 

Yes that's correct.

> You'll be getting the default values, which is 64 bytes so I guess
> that
> works in practice.
> 
> > Looking at the code the t208x doesn't specifiy any of the d-cache-
> > size/i-cache-size properties. Should I add them to silence the
> > warning
> > or switch it to pr_debug()/pr_info()?
> 
> Yeah ideally you'd add them to the device tree(s) for those boards.
> 

I think the info I need is in the block diagram[0]. I'll whip up
a patch.

--
[1] - 
https://www.nxp.com/products/processors-and-microcontrollers/power-architecture/qoriq-communication-processors/t-series/qoriq-t2080-and-t2081-multicore-communications-processors:T2080


Re: [PATCH] powerpc/legacy_serial: Use early_ioremap()

2020-03-23 Thread Chris Packham
Hi Christophe,

On Wed, 2020-02-05 at 12:03 +, Christophe Leroy wrote:
> [0.00] ioremap() called early from
> find_legacy_serial_ports+0x3cc/0x474. Use early_ioremap() instead
> 

I was just about to dig into this error message and found you patch. I
applied it to a v5.5 base.

> find_legacy_serial_ports() is called early from setup_arch(), before
> paging_init(). vmalloc is not available yet, ioremap shouldn't be
> used that early.
> 
> Use early_ioremap() and switch to a regular ioremap() later.
> 
> Signed-off-by: Christophe Leroy 

On my system (Freescale T2080 SOC) this seems to cause a crash/hang in
early boot. Unfortunately because this is affecting the boot console I
don't get any earlyprintk output.

> ---
>  arch/powerpc/kernel/legacy_serial.c | 33 +
> 
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/legacy_serial.c
> b/arch/powerpc/kernel/legacy_serial.c
> index f061e06e9f51..8b2c1a8553a0 100644
> --- a/arch/powerpc/kernel/legacy_serial.c
> +++ b/arch/powerpc/kernel/legacy_serial.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #undef DEBUG
>  
> @@ -34,6 +35,7 @@ static struct legacy_serial_info {
>   unsigned intclock;
>   int irq_check_parent;
>   phys_addr_t taddr;
> + void __iomem*early_addr;
>  } legacy_serial_infos[MAX_LEGACY_SERIAL_PORTS];
>  
>  static const struct of_device_id legacy_serial_parents[] __initconst
> = {
> @@ -325,17 +327,16 @@ static void __init
> setup_legacy_serial_console(int console)
>  {
>   struct legacy_serial_info *info =
> _serial_infos[console];
>   struct plat_serial8250_port *port =
> _serial_ports[console];
> - void __iomem *addr;
>   unsigned int stride;
>  
>   stride = 1 << port->regshift;
>  
>   /* Check if a translated MMIO address has been found */
>   if (info->taddr) {
> - addr = ioremap(info->taddr, 0x1000);
> - if (addr == NULL)
> + info->early_addr = early_ioremap(info->taddr, 0x1000);
> + if (info->early_addr == NULL)
>   return;
> - udbg_uart_init_mmio(addr, stride);
> + udbg_uart_init_mmio(info->early_addr, stride);
>   } else {
>   /* Check if it's PIO and we support untranslated PIO */
>   if (port->iotype == UPIO_PORT && isa_io_special)
> @@ -353,6 +354,30 @@ static void __init
> setup_legacy_serial_console(int console)
>   udbg_uart_setup(info->speed, info->clock);
>  }
>  
> +static int __init ioremap_legacy_serial_console(void)
> +{
> + struct legacy_serial_info *info =
> _serial_infos[legacy_serial_console];
> + struct plat_serial8250_port *port =
> _serial_ports[legacy_serial_console];
> + void __iomem *vaddr;
> +
> + if (legacy_serial_console < 0)
> + return 0;
> +
> + if (!info->early_addr)
> + return 0;
> +
> + vaddr = ioremap(info->taddr, 0x1000);
> + if (WARN_ON(!vaddr))
> + return -ENOMEM;
> +
> + udbg_uart_init_mmio(vaddr, 1 << port->regshift);
> + early_iounmap(info->early_addr, 0x1000);
> + info->early_addr = NULL;
> +
> + return 0;
> +}
> +early_initcall(ioremap_legacy_serial_console);
> +
>  /*
>   * This is called very early, as part of setup_system() or
> eventually
>   * setup_arch(), basically before anything else in this file. This
> function


Argh, can't find dcache properties !

2020-03-22 Thread Chris Packham
Hi All,

Just booting up v5.5.11 on a Freescale T2080RDB and I'm seeing the
following mesage.

kern.warning linuxbox kernel: Argh, can't find dcache properties !
kern.warning linuxbox kernel: Argh, can't find icache properties !

This was changed from DBG() to pr_warn() in commit 3b9176e9a874
("powerpc/setup_64: fix -Wempty-body warnings") but the message seems
to be much older than that. So it's probably been an issue on the T2080
(and other QorIQ SoCs) for a while.

Looking at the code the t208x doesn't specifiy any of the d-cache-
size/i-cache-size properties. Should I add them to silence the warning
or switch it to pr_debug()/pr_info()?

Thanks,
Chris



Re: [PATCH v3] powerpc: Support CMDLINE_EXTEND

2019-11-06 Thread Chris Packham
Hi All,

On Fri, 2019-08-02 at 06:40 +0200, Christophe Leroy wrote:
> 
> Le 02/08/2019 à 00:50, Chris Packham a écrit :
> > Bring powerpc in line with other architectures that support extending or
> > overriding the bootloader provided command line.
> > 
> > The current behaviour is most like CMDLINE_FROM_BOOTLOADER where the
> > bootloader command line is preferred but the kernel config can provide a
> > fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND can
> > be used to append the CMDLINE from the kernel config to the one provided
> > by the bootloader.
> > 
> > Signed-off-by: Chris Packham 
> 
> Reviewed-by: Christophe Leroy 

Just going over some old patches this doesn't appear to be in next. Is
there anything stopping it from being accepted?

> 
> > ---
> > Changes in v3:
> > - don't use BUG_ON in prom_strlcat
> > - rearrange things to eliminate prom_strlcpy
> > 
> > Changes in v2:
> > - incorporate ideas from Christope's patch 
> > https://patchwork.ozlabs.org/patch/1074126/
> > - support CMDLINE_FORCE
> > 
> >   arch/powerpc/Kconfig| 20 +-
> >   arch/powerpc/kernel/prom_init.c | 36 ++---
> >   2 files changed, 43 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 77f6ebf97113..d413fe1b4058 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -852,15 +852,33 @@ config CMDLINE
> >   some command-line options at build time by entering them here.  In
> >   most cases you will need to specify the root device here.
> >   
> > +choice
> > +   prompt "Kernel command line type" if CMDLINE != ""
> > +   default CMDLINE_FROM_BOOTLOADER
> > +
> > +config CMDLINE_FROM_BOOTLOADER
> > +   bool "Use bootloader kernel arguments if available"
> > +   help
> > + Uses the command-line options passed by the boot loader. If
> > + the boot loader doesn't provide any, the default kernel command
> > + string provided in CMDLINE will be used.
> > +
> > +config CMDLINE_EXTEND
> > +   bool "Extend bootloader kernel arguments"
> > +   help
> > + The command-line arguments provided by the boot loader will be
> > + appended to the default kernel command string.
> > +
> >   config CMDLINE_FORCE
> > bool "Always use the default kernel command string"
> > -   depends on CMDLINE_BOOL
> > help
> >   Always use the default kernel command string, even if the boot
> >   loader passes other arguments to the kernel.
> >   This is useful if you cannot or don't want to change the
> >   command-line options your boot loader passes to the kernel.
> >   
> > +endchoice
> > +
> >   config EXTRA_TARGETS
> > string "Additional default image types"
> > help
> > diff --git a/arch/powerpc/kernel/prom_init.c 
> > b/arch/powerpc/kernel/prom_init.c
> > index 514707ef6779..1c7010cc6ec9 100644
> > --- a/arch/powerpc/kernel/prom_init.c
> > +++ b/arch/powerpc/kernel/prom_init.c
> > @@ -298,16 +298,24 @@ static char __init *prom_strstr(const char *s1, const 
> > char *s2)
> > return NULL;
> >   }
> >   
> > -static size_t __init prom_strlcpy(char *dest, const char *src, size_t size)
> > -{
> > -   size_t ret = prom_strlen(src);
> > +static size_t __init prom_strlcat(char *dest, const char *src, size_t 
> > count)
> > +{
> > +   size_t dsize = prom_strlen(dest);
> > +   size_t len = prom_strlen(src);
> > +   size_t res = dsize + len;
> > +
> > +   /* This would be a bug */
> > +   if (dsize >= count)
> > +   return count;
> > +
> > +   dest += dsize;
> > +   count -= dsize;
> > +   if (len >= count)
> > +   len = count-1;
> > +   memcpy(dest, src, len);
> > +   dest[len] = 0;
> > +   return res;
> >   
> > -   if (size) {
> > -   size_t len = (ret >= size) ? size - 1 : ret;
> > -   memcpy(dest, src, len);
> > -   dest[len] = '\0';
> > -   }
> > -   return ret;
> >   }
> >   
> >   #ifdef CONFIG_PPC_PSERIES
> > @@ -759,10 +767,14 @@ static void __init early_cmdline_parse(void)
> >   
> > prom_cmd_line[0] = 0;
> > p = prom_cmd_line;
> > -   if ((long)prom.chosen > 0)
> > +
> > +   if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && (long)prom.chosen > 0)
> > l = prom_getprop(prom.chosen, "bootargs", p, 
> > COMMAND_LINE_SIZE-1);
> > -   if (IS_ENABLED(CONFIG_CMDLINE_BOOL) && (l <= 0 || p[0] == '\0')) /* dbl 
> > check */
> > -   prom_strlcpy(prom_cmd_line, CONFIG_CMDLINE, 
> > sizeof(prom_cmd_line));
> > +
> > +   if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) || l <= 0 || p[0] == '\0')
> > +   prom_strlcat(prom_cmd_line, " " CONFIG_CMDLINE,
> > +sizeof(prom_cmd_line));
> > +
> > prom_printf("command line: %s\n", prom_cmd_line);
> >   
> >   #ifdef CONFIG_PPC64
> > 


Re: [PATCH v3] powerpc: Support CMDLINE_EXTEND

2019-09-26 Thread Chris Packham
Hi All,

On Fri, 2019-08-02 at 06:40 +0200, Christophe Leroy wrote:
> 
> Le 02/08/2019 à 00:50, Chris Packham a écrit :
> > Bring powerpc in line with other architectures that support extending or
> > overriding the bootloader provided command line.
> > 
> > The current behaviour is most like CMDLINE_FROM_BOOTLOADER where the
> > bootloader command line is preferred but the kernel config can provide a
> > fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND can
> > be used to append the CMDLINE from the kernel config to the one provided
> > by the bootloader.
> > 
> > Signed-off-by: Chris Packham 
> 
> Reviewed-by: Christophe Leroy 

I see this hasn't hit Linus's tree is it waiting for me to do something
or just fallen off the radar?

> 
> > ---
> > Changes in v3:
> > - don't use BUG_ON in prom_strlcat
> > - rearrange things to eliminate prom_strlcpy
> > 
> > Changes in v2:
> > - incorporate ideas from Christope's patch 
> > https://patchwork.ozlabs.org/patch/1074126/
> > - support CMDLINE_FORCE
> > 
> >   arch/powerpc/Kconfig| 20 +-
> >   arch/powerpc/kernel/prom_init.c | 36 ++---
> >   2 files changed, 43 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 77f6ebf97113..d413fe1b4058 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -852,15 +852,33 @@ config CMDLINE
> >   some command-line options at build time by entering them here.  In
> >   most cases you will need to specify the root device here.
> >   
> > +choice
> > +   prompt "Kernel command line type" if CMDLINE != ""
> > +   default CMDLINE_FROM_BOOTLOADER
> > +
> > +config CMDLINE_FROM_BOOTLOADER
> > +   bool "Use bootloader kernel arguments if available"
> > +   help
> > + Uses the command-line options passed by the boot loader. If
> > + the boot loader doesn't provide any, the default kernel command
> > + string provided in CMDLINE will be used.
> > +
> > +config CMDLINE_EXTEND
> > +   bool "Extend bootloader kernel arguments"
> > +   help
> > + The command-line arguments provided by the boot loader will be
> > + appended to the default kernel command string.
> > +
> >   config CMDLINE_FORCE
> > bool "Always use the default kernel command string"
> > -   depends on CMDLINE_BOOL
> > help
> >   Always use the default kernel command string, even if the boot
> >   loader passes other arguments to the kernel.
> >   This is useful if you cannot or don't want to change the
> >   command-line options your boot loader passes to the kernel.
> >   
> > +endchoice
> > +
> >   config EXTRA_TARGETS
> > string "Additional default image types"
> > help
> > diff --git a/arch/powerpc/kernel/prom_init.c 
> > b/arch/powerpc/kernel/prom_init.c
> > index 514707ef6779..1c7010cc6ec9 100644
> > --- a/arch/powerpc/kernel/prom_init.c
> > +++ b/arch/powerpc/kernel/prom_init.c
> > @@ -298,16 +298,24 @@ static char __init *prom_strstr(const char *s1, const 
> > char *s2)
> > return NULL;
> >   }
> >   
> > -static size_t __init prom_strlcpy(char *dest, const char *src, size_t size)
> > -{
> > -   size_t ret = prom_strlen(src);
> > +static size_t __init prom_strlcat(char *dest, const char *src, size_t 
> > count)
> > +{
> > +   size_t dsize = prom_strlen(dest);
> > +   size_t len = prom_strlen(src);
> > +   size_t res = dsize + len;
> > +
> > +   /* This would be a bug */
> > +   if (dsize >= count)
> > +   return count;
> > +
> > +   dest += dsize;
> > +   count -= dsize;
> > +   if (len >= count)
> > +   len = count-1;
> > +   memcpy(dest, src, len);
> > +   dest[len] = 0;
> > +   return res;
> >   
> > -   if (size) {
> > -   size_t len = (ret >= size) ? size - 1 : ret;
> > -   memcpy(dest, src, len);
> > -   dest[len] = '\0';
> > -   }
> > -   return ret;
> >   }
> >   
> >   #ifdef CONFIG_PPC_PSERIES
> > @@ -759,10 +767,14 @@ static void __init early_cmdline_parse(void)
> >   
> > prom_cmd_line[0] = 0;
> > p = prom_cmd_line;
> > -   if ((long)prom.chosen > 0)
> > +
> > +   if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && (long)prom.chosen > 0)
> > l = prom_getprop(prom.chosen, "bootargs", p, 
> > COMMAND_LINE_SIZE-1);
> > -   if (IS_ENABLED(CONFIG_CMDLINE_BOOL) && (l <= 0 || p[0] == '\0')) /* dbl 
> > check */
> > -   prom_strlcpy(prom_cmd_line, CONFIG_CMDLINE, 
> > sizeof(prom_cmd_line));
> > +
> > +   if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) || l <= 0 || p[0] == '\0')
> > +   prom_strlcat(prom_cmd_line, " " CONFIG_CMDLINE,
> > +sizeof(prom_cmd_line));
> > +
> > prom_printf("command line: %s\n", prom_cmd_line);
> >   
> >   #ifdef CONFIG_PPC64
> > 


Re: [PATCH] powerpc/64e: drop stale call to smp_processor_id() which hangs SMP startup

2019-08-08 Thread Chris Packham
Hi Christophe,

On Thu, 2019-08-08 at 12:48 +, Christophe Leroy wrote:
> Santa commit ebb9d30a6a74 ("powerpc/mm: any thread in one core can be
> the first to setup TLB1") removed the need to know the cpu_id in
> early_init_this_mmu(), but the call to smp_processor_id() which was
> marked __maybe_used remained.
> 
> Since commit ed1cd6deb013 ("powerpc: Activate
> CONFIG_THREAD_INFO_IN_TASK") thread_info cannot be reached before mmu
> is properly set up.
> 
> Drop this stale call to smp_processor_id() which make SMP hang
> when CONFIG_PREEMPT is set.
> 
> Reported-by: Chris Packham 
> Fixes: ebb9d30a6a74 ("powerpc/mm: any thread in one core can be the
> first to setup TLB1")
> Link: https://github.com/linuxppc/issues/issues/264
> Signed-off-by: Christophe Leroy 
> Cc: sta...@vger.kernel.org

Many thanks for your help.

Tested-by: Chris Packham 

> ---
>  arch/powerpc/mm/nohash/tlb.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/nohash/tlb.c
> b/arch/powerpc/mm/nohash/tlb.c
> index d4acf6fa0596..bf60983a58c7 100644
> --- a/arch/powerpc/mm/nohash/tlb.c
> +++ b/arch/powerpc/mm/nohash/tlb.c
> @@ -630,7 +630,6 @@ static void early_init_this_mmu(void)
>  #ifdef CONFIG_PPC_FSL_BOOK3E
>   if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
>   unsigned int num_cams;
> - int __maybe_unused cpu = smp_processor_id();
>   bool map = true;
>  
>   /* use a quarter of the TLBCAM for bolted linear map
> */

Re: SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

2019-08-06 Thread Chris Packham
On Wed, 2019-08-07 at 11:13 +1000, Michael Ellerman wrote:
> Chris Packham  writes:
> > 
> > On Tue, 2019-08-06 at 21:32 +1000, Michael Ellerman wrote:
> > > 
> > > Chris Packham  writes:
> > > > 
> > > > On Mon, 2019-08-05 at 14:06 +1200, Chris Packham wrote:
> > > > > 
> > > > > 
> > > > > Hi All,
> > > > > 
> > > > > I have a custom board that uses the Freescale/NXP T2080 SoC.
> > > > > 
> > > > > The board boots fine using v4.19.60 but when I use v5.1.21 it
> > > > > locks
> > > > > up
> > > > > waiting for the other CPUs to come online (earlyprintk output
> > > > > below).
> > > > > If I set maxcpus=0 then the system boots all the way through
> > > > > to
> > > > > userland. The same thing happens with 5.3-rc2.
> > > > > 
> > > > > The defconfig I'm using is 
> > > > > https://gist.github.com/cpackham/f24d0b426f3
> > > > > de0eaaba17b82c3528a9d it was updated from the working
> > > > > v4.19.60
> > > > > defconfig using make olddefconfig.
> > > > > 
> > > > > Does this ring any bells for anyone?
> > > > > 
> > > > > I haven't dug into the differences between the working an
> > > > > non-
> > > > > working
> > > > > versions yet. I'll start looking now.
> > > > I've bisected this to the following commit
> > > Thanks that's super helpful.
> > > 
> > > > 
> > > > 
> > > > commit ed1cd6deb013a11959d17a94e35ce159197632da
> > > > Author: Christophe Leroy 
> > > > Date:   Thu Jan 31 10:08:58 2019 +
> > > > 
> > > > powerpc: Activate CONFIG_THREAD_INFO_IN_TASK
> > > > 
> > > > This patch activates CONFIG_THREAD_INFO_IN_TASK which
> > > > moves the thread_info into task_struct.
> > > > 
> > > > I'll be the first to admit this is well beyond my area of
> > > > knowledge
> > > > so
> > > > I'm unsure what about this patch is problematic but I can be
> > > > fairly
> > > > sure that a build immediately before this patch works while a
> > > > build
> > > > with this patch hangs.
> > > It makes a pretty fundamental change to the way the kernel stores
> > > some
> > > information about each task, moving it off the stack and into the
> > > task
> > > struct.
> > > 
> > > It definitely has the potential to break things, but I thought we
> > > had
> > > reasonable test coverage of the Book3E platforms, I have a
> > > p5020ds
> > > (e5500) that I boot as part of my CI.
> > > 
> > > Aha. If I take your config and try to boot it on my p5020ds I get
> > > the
> > > same behaviour, stuck at SMP bringup. So it seems it's something
> > > in
> > > your
> > > config vs corenet64_smp_defconfig that is triggering the bug.
> > > 
> > > Can you try bisecting what in the config triggers it?
> > > 
> > > To do that you checkout ed1cd6deb013a11959d17a94e35ce159197632da,
> > > then
> > > you build/boot with corenet64_smp_defconfig to confirm it works.
> > > Then
> > > you use tools/testing/ktest/config-bisect.pl to bisect the
> > > changes in
> > > the .config.
> > > 
> > The difference between a working and non working defconfig is
> > CONFIG_PREEMPT specifically CONFIG_PREEMPT=y makes my system hang
> > at
> > boot.
> > 
> > Is that now intentionally prohibited on 64-bit powerpc?
> It's not prohibitied, but it probably should be because no one really
> tests it properly. I have a handful of IBM machines where I boot a
> PREEMPT kernel but that's about it.
> 
> The corenet configs don't have PREEMPT enabled, which suggests it was
> never really supported on those machines.
> 
> But maybe someone from NXP can tell me otherwise.
> 

I think our workloads need CONFIG_PREEMPT=y because our systems have
switch ASIC drivers implemented in userland and we need to be able to
react quickly to network events in order to prevent loops. We have seen
instances of this not happening simply because some other process is in
the middle of a syscall.

One thing I am working on here is a setup with a few vendor boards and
some of our own kit that we can test the upstream kernels on. Hopefully
that'd make these kinds of reports more timely rather than just
whenever we decide to move to a new kernel version.




Re: SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

2019-08-06 Thread Chris Packham
On Tue, 2019-08-06 at 21:32 +1000, Michael Ellerman wrote:
> Chris Packham  writes:
> > 
> > On Mon, 2019-08-05 at 14:06 +1200, Chris Packham wrote:
> > > 
> > > Hi All,
> > > 
> > > I have a custom board that uses the Freescale/NXP T2080 SoC.
> > > 
> > > The board boots fine using v4.19.60 but when I use v5.1.21 it
> > > locks
> > > up
> > > waiting for the other CPUs to come online (earlyprintk output
> > > below).
> > > If I set maxcpus=0 then the system boots all the way through to
> > > userland. The same thing happens with 5.3-rc2.
> > > 
> > > The defconfig I'm using is 
> > > https://gist.github.com/cpackham/f24d0b426f3
> > > de0eaaba17b82c3528a9d it was updated from the working v4.19.60
> > > defconfig using make olddefconfig.
> > > 
> > > Does this ring any bells for anyone?
> > > 
> > > I haven't dug into the differences between the working an non-
> > > working
> > > versions yet. I'll start looking now.
> > I've bisected this to the following commit
> Thanks that's super helpful.
> 
> > 
> > commit ed1cd6deb013a11959d17a94e35ce159197632da
> > Author: Christophe Leroy 
> > Date:   Thu Jan 31 10:08:58 2019 +
> > 
> > powerpc: Activate CONFIG_THREAD_INFO_IN_TASK
> > 
> > This patch activates CONFIG_THREAD_INFO_IN_TASK which
> > moves the thread_info into task_struct.
> > 
> > I'll be the first to admit this is well beyond my area of knowledge
> > so
> > I'm unsure what about this patch is problematic but I can be fairly
> > sure that a build immediately before this patch works while a build
> > with this patch hangs.
> It makes a pretty fundamental change to the way the kernel stores
> some
> information about each task, moving it off the stack and into the
> task
> struct.
> 
> It definitely has the potential to break things, but I thought we had
> reasonable test coverage of the Book3E platforms, I have a p5020ds
> (e5500) that I boot as part of my CI.
> 
> Aha. If I take your config and try to boot it on my p5020ds I get the
> same behaviour, stuck at SMP bringup. So it seems it's something in
> your
> config vs corenet64_smp_defconfig that is triggering the bug.
> 
> Can you try bisecting what in the config triggers it?
> 
> To do that you checkout ed1cd6deb013a11959d17a94e35ce159197632da,
> then
> you build/boot with corenet64_smp_defconfig to confirm it works. Then
> you use tools/testing/ktest/config-bisect.pl to bisect the changes in
> the .config.
> 
> cheers
> 

The difference between a working and non working defconfig is
CONFIG_PREEMPT specifically CONFIG_PREEMPT=y makes my system hang at
boot.

Is that now intentionally prohibited on 64-bit powerpc?

> > 
> > > 
> > > Booting...
> > > MMU: Supported page sizes
> > >  4 KB as direct
> > >   2048 KB as direct & indirect
> > >   4096 KB as direct
> > >  16384 KB as direct
> > >  65536 KB as direct
> > > 262144 KB as direct
> > >    1048576 KB as direct
> > > MMU: Book3E HW tablewalk enabled
> > > Linux version 5.1.21-at1+ (@chrisp-dl) (gcc version 4.9.3
> > > (crosstool-
> > > NG 
> > > crosstool-ng-1.22.0)) #24 SMP PREEMPT Mon Aug 5 01:42:00 UTC 2019
> > > Found initrd at 0xc0002f045000:0xc0003000
> > > Using CoreNet Generic machine description
> > > Found legacy serial port 0 for /soc@ffe00/serial@11c500
> > >   mem=ffe11c500, taddr=ffe11c500, irq=0, clk=3, speed=0
> > > Found legacy serial port 1 for /soc@ffe00/serial@11c600
> > >   mem=ffe11c600, taddr=ffe11c600, irq=0, clk=3, speed=0
> > > Found legacy serial port 2 for /soc@ffe00/serial@11d500
> > >   mem=ffe11d500, taddr=ffe11d500, irq=0, clk=3, speed=0
> > > Found legacy serial port 3 for /soc@ffe00/serial@11d600
> > >   mem=ffe11d600, taddr=ffe11d600, irq=0, clk=3, speed=0
> > > printk: bootconsole [udbg0] enabled
> > > CPU maps initialized for 2 threads per core
> > >  (thread shift is 1)
> > > Allocated 1856 bytes for 8 pacas
> > > -
> > > phys_mem_size = 0x1
> > > dcache_bsize  = 0x40
> > > icache_bsize  = 0x40
> > > cpu_features  = 0x0003009003b6
> > >   possible= 0x0003009003b6
> > >   always  = 0x0003008003b4
> > > cpu_user_features 

Re: SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

2019-08-05 Thread Chris Packham
On Mon, 2019-08-05 at 14:06 +1200, Chris Packham wrote:
> Hi All,
> 
> I have a custom board that uses the Freescale/NXP T2080 SoC.
> 
> The board boots fine using v4.19.60 but when I use v5.1.21 it locks
> up
> waiting for the other CPUs to come online (earlyprintk output below).
> If I set maxcpus=0 then the system boots all the way through to
> userland. The same thing happens with 5.3-rc2.
> 
> The defconfig I'm using is 
> https://gist.github.com/cpackham/f24d0b426f3
> de0eaaba17b82c3528a9d it was updated from the working v4.19.60
> defconfig using make olddefconfig.
> 
> Does this ring any bells for anyone?
> 
> I haven't dug into the differences between the working an non-working
> versions yet. I'll start looking now.

I've bisected this to the following commit

commit ed1cd6deb013a11959d17a94e35ce159197632da
Author: Christophe Leroy 
Date:   Thu Jan 31 10:08:58 2019 +

powerpc: Activate CONFIG_THREAD_INFO_IN_TASK

This patch activates CONFIG_THREAD_INFO_IN_TASK which
moves the thread_info into task_struct.

I'll be the first to admit this is well beyond my area of knowledge so
I'm unsure what about this patch is problematic but I can be fairly
sure that a build immediately before this patch works while a build
with this patch hangs.

> 
> Booting...
> MMU: Supported page sizes
>  4 KB as direct
>   2048 KB as direct & indirect
>   4096 KB as direct
>  16384 KB as direct
>  65536 KB as direct
> 262144 KB as direct
>1048576 KB as direct
> MMU: Book3E HW tablewalk enabled
> Linux version 5.1.21-at1+ (@chrisp-dl) (gcc version 4.9.3 (crosstool-
> NG 
> crosstool-ng-1.22.0)) #24 SMP PREEMPT Mon Aug 5 01:42:00 UTC 2019
> Found initrd at 0xc0002f045000:0xc0003000
> Using CoreNet Generic machine description
> Found legacy serial port 0 for /soc@ffe00/serial@11c500
>   mem=ffe11c500, taddr=ffe11c500, irq=0, clk=3, speed=0
> Found legacy serial port 1 for /soc@ffe00/serial@11c600
>   mem=ffe11c600, taddr=ffe11c600, irq=0, clk=3, speed=0
> Found legacy serial port 2 for /soc@ffe00/serial@11d500
>   mem=ffe11d500, taddr=ffe11d500, irq=0, clk=3, speed=0
> Found legacy serial port 3 for /soc@ffe00/serial@11d600
>   mem=ffe11d600, taddr=ffe11d600, irq=0, clk=3, speed=0
> printk: bootconsole [udbg0] enabled
> CPU maps initialized for 2 threads per core
>  (thread shift is 1)
> Allocated 1856 bytes for 8 pacas
> -
> phys_mem_size = 0x1
> dcache_bsize  = 0x40
> icache_bsize  = 0x40
> cpu_features  = 0x0003009003b6
>   possible= 0x0003009003b6
>   always  = 0x0003008003b4
> cpu_user_features = 0xdc008000 0x0800
> mmu_features  = 0x000a0010
> firmware_features = 0x
> -
> CoreNet Generic board
> barrier-nospec: using isync; sync as speculation barrier
> barrier-nospec: patched 412 locations
> Top of RAM: 0x1, Total RAM: 0x1
> Memory hole size: 0MB
> Zone ranges:
>   DMA  [mem 0x-0x7fffefff]
>   Normal   [mem 0x7000-0x]
> Movable zone start for each node
> Early memory node ranges
>   node   0: [mem 0x-0x]
> Initmem setup node 0 [mem 0x-0x]
> On node 0 totalpages: 1048576
>   DMA zone: 7168 pages used for memmap
>   DMA zone: 0 pages reserved
>   DMA zone: 524287 pages, LIFO batch:63
>   Normal zone: 7169 pages used for memmap
>   Normal zone: 524289 pages, LIFO batch:63
> MMU: Allocated 2112 bytes of context maps for 255 contexts
> percpu: Embedded 22 pages/cpu s49304 r0 d40808 u131072
> pcpu-alloc: s49304 r0 d40808 u131072 alloc=1*1048576
> pcpu-alloc: [0] 0 1 2 3 4 5 6 7 
> Built 1 zonelists, mobility grouping on.  Total pages: 1034239
> Kernel command line: console=ttyS0,115200 root=/dev/ram0
> releasefile=linuxbox_ppc64_e6500mc-tb233.rel bootversion=6.2.7
> loglevel=8 mtdoops.mtddev=errlog
> mtdparts=fff80.flash:4088M(user),8M(errlog)
> earlyprintk=ttyS0,115200 real_init=
> /bin/sh securitylevel=1 reladdr=0x100,1522523
> printk: log_buf_len individual max cpu contribution: 4096 bytes
> printk: log_buf_len total cpu_extra contributions: 28672 bytes
> printk: log_buf_len min size: 16384 bytes
> printk: log_buf_len: 65536 bytes
> printk: early log buf free: 12412(75%)
> Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
> Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
> Memory: 3979284K/4194304K available (8704K kernel code, 1584K rwdata,
> 2496K rodata, 472K init

SMP lockup at boot on Freescale/NXP T2080 (powerpc 64)

2019-08-04 Thread Chris Packham
Hi All,

I have a custom board that uses the Freescale/NXP T2080 SoC.

The board boots fine using v4.19.60 but when I use v5.1.21 it locks up
waiting for the other CPUs to come online (earlyprintk output below).
If I set maxcpus=0 then the system boots all the way through to
userland. The same thing happens with 5.3-rc2.

The defconfig I'm using is https://gist.github.com/cpackham/f24d0b426f3
de0eaaba17b82c3528a9d it was updated from the working v4.19.60
defconfig using make olddefconfig.

Does this ring any bells for anyone?

I haven't dug into the differences between the working an non-working
versions yet. I'll start looking now.

Booting...
MMU: Supported page sizes
 4 KB as direct
  2048 KB as direct & indirect
  4096 KB as direct
 16384 KB as direct
 65536 KB as direct
262144 KB as direct
   1048576 KB as direct
MMU: Book3E HW tablewalk enabled
Linux version 5.1.21-at1+ (@chrisp-dl) (gcc version 4.9.3 (crosstool-NG 
crosstool-ng-1.22.0)) #24 SMP PREEMPT Mon Aug 5 01:42:00 UTC 2019
Found initrd at 0xc0002f045000:0xc0003000
Using CoreNet Generic machine description
Found legacy serial port 0 for /soc@ffe00/serial@11c500
  mem=ffe11c500, taddr=ffe11c500, irq=0, clk=3, speed=0
Found legacy serial port 1 for /soc@ffe00/serial@11c600
  mem=ffe11c600, taddr=ffe11c600, irq=0, clk=3, speed=0
Found legacy serial port 2 for /soc@ffe00/serial@11d500
  mem=ffe11d500, taddr=ffe11d500, irq=0, clk=3, speed=0
Found legacy serial port 3 for /soc@ffe00/serial@11d600
  mem=ffe11d600, taddr=ffe11d600, irq=0, clk=3, speed=0
printk: bootconsole [udbg0] enabled
CPU maps initialized for 2 threads per core
 (thread shift is 1)
Allocated 1856 bytes for 8 pacas
-
phys_mem_size = 0x1
dcache_bsize  = 0x40
icache_bsize  = 0x40
cpu_features  = 0x0003009003b6
  possible= 0x0003009003b6
  always  = 0x0003008003b4
cpu_user_features = 0xdc008000 0x0800
mmu_features  = 0x000a0010
firmware_features = 0x
-
CoreNet Generic board
barrier-nospec: using isync; sync as speculation barrier
barrier-nospec: patched 412 locations
Top of RAM: 0x1, Total RAM: 0x1
Memory hole size: 0MB
Zone ranges:
  DMA  [mem 0x-0x7fffefff]
  Normal   [mem 0x7000-0x]
Movable zone start for each node
Early memory node ranges
  node   0: [mem 0x-0x]
Initmem setup node 0 [mem 0x-0x]
On node 0 totalpages: 1048576
  DMA zone: 7168 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 524287 pages, LIFO batch:63
  Normal zone: 7169 pages used for memmap
  Normal zone: 524289 pages, LIFO batch:63
MMU: Allocated 2112 bytes of context maps for 255 contexts
percpu: Embedded 22 pages/cpu s49304 r0 d40808 u131072
pcpu-alloc: s49304 r0 d40808 u131072 alloc=1*1048576
pcpu-alloc: [0] 0 1 2 3 4 5 6 7 
Built 1 zonelists, mobility grouping on.  Total pages: 1034239
Kernel command line: console=ttyS0,115200 root=/dev/ram0
releasefile=linuxbox_ppc64_e6500mc-tb233.rel bootversion=6.2.7
loglevel=8 mtdoops.mtddev=errlog
mtdparts=fff80.flash:4088M(user),8M(errlog)
earlyprintk=ttyS0,115200 real_init=
/bin/sh securitylevel=1 reladdr=0x100,1522523
printk: log_buf_len individual max cpu contribution: 4096 bytes
printk: log_buf_len total cpu_extra contributions: 28672 bytes
printk: log_buf_len min size: 16384 bytes
printk: log_buf_len: 65536 bytes
printk: early log buf free: 12412(75%)
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Memory: 3979284K/4194304K available (8704K kernel code, 1584K rwdata,
2496K rodata, 472K init, 299K bss, 215020K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
rcu: Preemptible hierarchical RCU implementation.
rcu:RCU event tracing is enabled.
Tasks RCU enabled.
rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
mpic: Setting up MPIC " OpenPIC  " version 1.2 at ffe04, max 8 CPUs
mpic: ISU size: 512, shift: 9, mask: 1ff
mpic: Initializing for 512 sources
time_init: decrementer frequency = 37.50 MHz
time_init: processor frequency   = 1500.00 MHz
clocksource: timebase: mask: 0x max_cycles:
0x8a60dd6a9, max_idle_ns: 440795204056 ns
clocksource: timebase mult[1aab] shift[24] registered
clockevent: decrementer mult[99a] shift[32] cpu[0]
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 8192 (order: 4, 65536 bytes)
Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes)
e6500 family performance monitor hardware support registered
rcu: Hierarchical SRCU implementation.
smp: Bringing up secondary 

Re: [PATCH] powerpc: Remove inaccessible CMDLINE default

2019-08-04 Thread Chris Packham
On Fri, 2019-08-02 at 07:18 +0200, Christophe Leroy wrote:
> 
> Le 02/08/2019 à 07:02, Chris Packham a écrit :
> > 
> > Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef
> > mess")
> > CONFIG_CMDLINE has always had a value regardless of
> > CONNIG_CMDLINE_BOOL.
> s/CONNIG/CONFIG/
> 
> > 
> > 
> > For example:
> > 
> >   $ make ARCH=powerpc defconfig
> >   $ cat .config
> >   # CONFIG_CMDLINE_BOOL is not set
> >   CONFIG_CMDLINE=""
> > 
> > When enabling CONNIG_CMDLINE_BOOL this value is kept making the
> > 'default
> > "..." if CONNIG_CMDLINE_BOOL' ineffective.
> s/CONNIG/CONFIG/
> 

Will fix in v2.

> > 
> > 
> >   $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
> >   $ cat .config
> >   CONFIG_CMDLINE_BOOL=y
> >   CONFIG_CMDLINE=""
> > 
> > Additionally all the in-tree powerpc defconfigs that set
> > CONFIG_CMDLINE_BOOL=y also set CONFIG_CMDLINE to something else.
> > For
> > these reasons remove the inaccessible default.
> > 
> > Signed-off-by: Chris Packham 
> Reviewed-by: Christophe Leroy 
> 
> > 
> > ---
> > This should be independent of http://patchwork.ozlabs.org/patch/114
> > 0811/ but
> > I've generated this patch on a stream that has it applied locally.
> > 
> >   arch/powerpc/Kconfig | 1 -
> >   1 file changed, 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index d413fe1b4058..6fca6eba6aee 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -844,7 +844,6 @@ config CMDLINE_BOOL
> >   
> >   config CMDLINE
> >     string "Initial kernel command string" if CMDLINE_BOOL
> > -   default "console=ttyS0,9600 console=tty0 root=/dev/sda2"
> > if CMDLINE_BOOL
> >     default ""
> >     help
> >       On some platforms, there is currently no way for the
> > boot loader to
> > 
> I think we could also get rid of CMDLINE_BOOL totally and use CMDLINE
> != 
> "" instead.

The only reason I can see to keep CMDLINE_BOOL is that it hides the
text input for CMDLINE which seems to be a pattern in Kconfig. Happy to
remove it if that's the consensus. 

I'll wait for the dust to settle on my other patch before sending a v2
of this one.

> 
> Christophe

[PATCH] powerpc: Remove inaccessible CMDLINE default

2019-08-01 Thread Chris Packham
Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef mess")
CONFIG_CMDLINE has always had a value regardless of CONNIG_CMDLINE_BOOL.

For example:

 $ make ARCH=powerpc defconfig
 $ cat .config
 # CONFIG_CMDLINE_BOOL is not set
 CONFIG_CMDLINE=""

When enabling CONNIG_CMDLINE_BOOL this value is kept making the 'default
"..." if CONNIG_CMDLINE_BOOL' ineffective.

 $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
 $ cat .config
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE=""

Additionally all the in-tree powerpc defconfigs that set
CONFIG_CMDLINE_BOOL=y also set CONFIG_CMDLINE to something else. For
these reasons remove the inaccessible default.

Signed-off-by: Chris Packham 
---
This should be independent of http://patchwork.ozlabs.org/patch/1140811/ but
I've generated this patch on a stream that has it applied locally.

 arch/powerpc/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d413fe1b4058..6fca6eba6aee 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -844,7 +844,6 @@ config CMDLINE_BOOL
 
 config CMDLINE
string "Initial kernel command string" if CMDLINE_BOOL
-   default "console=ttyS0,9600 console=tty0 root=/dev/sda2" if CMDLINE_BOOL
default ""
help
  On some platforms, there is currently no way for the boot loader to
-- 
2.22.0



[PATCH v3] powerpc: Support CMDLINE_EXTEND

2019-08-01 Thread Chris Packham
Bring powerpc in line with other architectures that support extending or
overriding the bootloader provided command line.

The current behaviour is most like CMDLINE_FROM_BOOTLOADER where the
bootloader command line is preferred but the kernel config can provide a
fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND can
be used to append the CMDLINE from the kernel config to the one provided
by the bootloader.

Signed-off-by: Chris Packham 
---
Changes in v3:
- don't use BUG_ON in prom_strlcat
- rearrange things to eliminate prom_strlcpy

Changes in v2:
- incorporate ideas from Christope's patch 
https://patchwork.ozlabs.org/patch/1074126/
- support CMDLINE_FORCE

 arch/powerpc/Kconfig| 20 +-
 arch/powerpc/kernel/prom_init.c | 36 ++---
 2 files changed, 43 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 77f6ebf97113..d413fe1b4058 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -852,15 +852,33 @@ config CMDLINE
  some command-line options at build time by entering them here.  In
  most cases you will need to specify the root device here.
 
+choice
+   prompt "Kernel command line type" if CMDLINE != ""
+   default CMDLINE_FROM_BOOTLOADER
+
+config CMDLINE_FROM_BOOTLOADER
+   bool "Use bootloader kernel arguments if available"
+   help
+ Uses the command-line options passed by the boot loader. If
+ the boot loader doesn't provide any, the default kernel command
+ string provided in CMDLINE will be used.
+
+config CMDLINE_EXTEND
+   bool "Extend bootloader kernel arguments"
+   help
+ The command-line arguments provided by the boot loader will be
+ appended to the default kernel command string.
+
 config CMDLINE_FORCE
bool "Always use the default kernel command string"
-   depends on CMDLINE_BOOL
help
  Always use the default kernel command string, even if the boot
  loader passes other arguments to the kernel.
  This is useful if you cannot or don't want to change the
  command-line options your boot loader passes to the kernel.
 
+endchoice
+
 config EXTRA_TARGETS
string "Additional default image types"
help
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 514707ef6779..1c7010cc6ec9 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -298,16 +298,24 @@ static char __init *prom_strstr(const char *s1, const 
char *s2)
return NULL;
 }
 
-static size_t __init prom_strlcpy(char *dest, const char *src, size_t size)
-{
-   size_t ret = prom_strlen(src);
+static size_t __init prom_strlcat(char *dest, const char *src, size_t count)
+{
+   size_t dsize = prom_strlen(dest);
+   size_t len = prom_strlen(src);
+   size_t res = dsize + len;
+
+   /* This would be a bug */
+   if (dsize >= count)
+   return count;
+
+   dest += dsize;
+   count -= dsize;
+   if (len >= count)
+   len = count-1;
+   memcpy(dest, src, len);
+   dest[len] = 0;
+   return res;
 
-   if (size) {
-   size_t len = (ret >= size) ? size - 1 : ret;
-   memcpy(dest, src, len);
-   dest[len] = '\0';
-   }
-   return ret;
 }
 
 #ifdef CONFIG_PPC_PSERIES
@@ -759,10 +767,14 @@ static void __init early_cmdline_parse(void)
 
prom_cmd_line[0] = 0;
p = prom_cmd_line;
-   if ((long)prom.chosen > 0)
+
+   if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && (long)prom.chosen > 0)
l = prom_getprop(prom.chosen, "bootargs", p, 
COMMAND_LINE_SIZE-1);
-   if (IS_ENABLED(CONFIG_CMDLINE_BOOL) && (l <= 0 || p[0] == '\0')) /* dbl 
check */
-   prom_strlcpy(prom_cmd_line, CONFIG_CMDLINE, 
sizeof(prom_cmd_line));
+
+   if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) || l <= 0 || p[0] == '\0')
+   prom_strlcat(prom_cmd_line, " " CONFIG_CMDLINE,
+sizeof(prom_cmd_line));
+
prom_printf("command line: %s\n", prom_cmd_line);
 
 #ifdef CONFIG_PPC64
-- 
2.22.0



Re: [PATCH v2] powerpc: Support CMDLINE_EXTEND

2019-08-01 Thread Chris Packham
On Thu, 2019-08-01 at 08:14 +0200, Christophe Leroy wrote:
> 
> Le 01/08/2019 à 04:12, Chris Packham a écrit :
> > 
> > Bring powerpc in line with other architectures that support
> > extending or
> > overriding the bootloader provided command line.
> > 
> > The current behaviour is most like CMDLINE_FROM_BOOTLOADER where
> > the
> > bootloader command line is preferred but the kernel config can
> > provide a
> > fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND
> > can
> > be used to append the CMDLINE from the kernel config to the one
> > provided
> > by the bootloader.
> > 
> > Signed-off-by: Chris Packham 
> > ---
> > While I'm at it does anyone think it's worth getting rid of the
> > default CMDLINE
> > value if CMDLINE_BOOL and maybe CMDLINE_BOOL? Every defconfig in
> > the kernel
> > that sets CMDLINE_BOOL=y also sets CMDLINE to something other than
> > "console=ttyS0,9600 console=tty0 root=/dev/sda2". Removing
> > CMDLINE_BOOL and
> > unconditionally setting the default value of CMDLINE to "" would
> > clean up the
> > Kconfig even more.
> Note 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/co
> mmit/?id=cbe46bd4f5104552b612505b73d366f66efc2341 
> which is already a step forward.
> 
> I guess that default is for users selecting this option manually to
> get 
> a first sensitive CMDLINE. But is it really worth it ?
> 

I'm not even sure if it is working as intended right now. Even without
my changes if I use menuconfig and select CMDLINE_BOOL I end up with 
CONFIG_CMDLINE="" in the resulting .config.

> > 
> > 
> > Changes in v2:
> > - incorporate ideas from Christope's patch https://patchwork.ozlabs
> > .org/patch/1074126/
> > 
> >   arch/powerpc/Kconfig| 20 +++-
> >   arch/powerpc/kernel/prom_init.c | 26 +-
> >   2 files changed, 44 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 77f6ebf97113..d413fe1b4058 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -852,15 +852,33 @@ config CMDLINE
> >       some command-line options at build time by entering
> > them here.  In
> >       most cases you will need to specify the root device
> > here.
> >   
> > +choice
> > +   prompt "Kernel command line type" if CMDLINE != ""
> > +   default CMDLINE_FROM_BOOTLOADER
> > +
> > +config CMDLINE_FROM_BOOTLOADER
> > +   bool "Use bootloader kernel arguments if available"
> > +   help
> > +     Uses the command-line options passed by the boot loader.
> > If
> > +     the boot loader doesn't provide any, the default kernel
> > command
> > +     string provided in CMDLINE will be used.
> > +
> > +config CMDLINE_EXTEND
> > +   bool "Extend bootloader kernel arguments"
> > +   help
> > +     The command-line arguments provided by the boot loader
> > will be
> > +     appended to the default kernel command string.
> > +
> >   config CMDLINE_FORCE
> >     bool "Always use the default kernel command string"
> > -   depends on CMDLINE_BOOL
> >     help
> >       Always use the default kernel command string, even if
> > the boot
> >       loader passes other arguments to the kernel.
> >       This is useful if you cannot or don't want to change
> > the
> >       command-line options your boot loader passes to the
> > kernel.
> >   
> > +endchoice
> > +
> >   config EXTRA_TARGETS
> >     string "Additional default image types"
> >     help
> > diff --git a/arch/powerpc/kernel/prom_init.c
> > b/arch/powerpc/kernel/prom_init.c
> > index 514707ef6779..df29f141dbd2 100644
> > --- a/arch/powerpc/kernel/prom_init.c
> > +++ b/arch/powerpc/kernel/prom_init.c
> > @@ -310,6 +310,25 @@ static size_t __init prom_strlcpy(char *dest,
> > const char *src, size_t size)
> >     return ret;
> >   }
> >   
> > +static size_t __init prom_strlcat(char *dest, const char *src,
> > size_t count)
> > +{
> > +   size_t dsize = prom_strlen(dest);
> > +   size_t len = prom_strlen(src);
> > +   size_t res = dsize + len;
> > +
> > +   /* This would be a bug */
> > +   BUG_ON(dsize >= count);
> Has you pointed in another mail, BUG_ON() should be avoided here.
> I guess if the destination is already full, just return count:
> 
> if (ds

[PATCH v2] powerpc: Support CMDLINE_EXTEND

2019-07-31 Thread Chris Packham
Bring powerpc in line with other architectures that support extending or
overriding the bootloader provided command line.

The current behaviour is most like CMDLINE_FROM_BOOTLOADER where the
bootloader command line is preferred but the kernel config can provide a
fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND can
be used to append the CMDLINE from the kernel config to the one provided
by the bootloader.

Signed-off-by: Chris Packham 
---
While I'm at it does anyone think it's worth getting rid of the default CMDLINE
value if CMDLINE_BOOL and maybe CMDLINE_BOOL? Every defconfig in the kernel
that sets CMDLINE_BOOL=y also sets CMDLINE to something other than
"console=ttyS0,9600 console=tty0 root=/dev/sda2". Removing CMDLINE_BOOL and
unconditionally setting the default value of CMDLINE to "" would clean up the
Kconfig even more.

Changes in v2:
- incorporate ideas from Christope's patch 
https://patchwork.ozlabs.org/patch/1074126/

 arch/powerpc/Kconfig| 20 +++-
 arch/powerpc/kernel/prom_init.c | 26 +-
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 77f6ebf97113..d413fe1b4058 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -852,15 +852,33 @@ config CMDLINE
  some command-line options at build time by entering them here.  In
  most cases you will need to specify the root device here.
 
+choice
+   prompt "Kernel command line type" if CMDLINE != ""
+   default CMDLINE_FROM_BOOTLOADER
+
+config CMDLINE_FROM_BOOTLOADER
+   bool "Use bootloader kernel arguments if available"
+   help
+ Uses the command-line options passed by the boot loader. If
+ the boot loader doesn't provide any, the default kernel command
+ string provided in CMDLINE will be used.
+
+config CMDLINE_EXTEND
+   bool "Extend bootloader kernel arguments"
+   help
+ The command-line arguments provided by the boot loader will be
+ appended to the default kernel command string.
+
 config CMDLINE_FORCE
bool "Always use the default kernel command string"
-   depends on CMDLINE_BOOL
help
  Always use the default kernel command string, even if the boot
  loader passes other arguments to the kernel.
  This is useful if you cannot or don't want to change the
  command-line options your boot loader passes to the kernel.
 
+endchoice
+
 config EXTRA_TARGETS
string "Additional default image types"
help
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 514707ef6779..df29f141dbd2 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -310,6 +310,25 @@ static size_t __init prom_strlcpy(char *dest, const char 
*src, size_t size)
return ret;
 }
 
+static size_t __init prom_strlcat(char *dest, const char *src, size_t count)
+{
+   size_t dsize = prom_strlen(dest);
+   size_t len = prom_strlen(src);
+   size_t res = dsize + len;
+
+   /* This would be a bug */
+   BUG_ON(dsize >= count);
+
+   dest += dsize;
+   count -= dsize;
+   if (len >= count)
+   len = count-1;
+   memcpy(dest, src, len);
+   dest[len] = 0;
+   return res;
+
+}
+
 #ifdef CONFIG_PPC_PSERIES
 static int __init prom_strtobool(const char *s, bool *res)
 {
@@ -761,8 +780,13 @@ static void __init early_cmdline_parse(void)
p = prom_cmd_line;
if ((long)prom.chosen > 0)
l = prom_getprop(prom.chosen, "bootargs", p, 
COMMAND_LINE_SIZE-1);
-   if (IS_ENABLED(CONFIG_CMDLINE_BOOL) && (l <= 0 || p[0] == '\0')) /* dbl 
check */
+
+   if (IS_ENABLED(CONFIG_CMDLINE_FORCE) || l <= 0 || p[0] == '\0')
prom_strlcpy(prom_cmd_line, CONFIG_CMDLINE, 
sizeof(prom_cmd_line));
+   else if (IS_ENABLED(CONFIG_CMDLINE_EXTEND))
+   prom_strlcat(prom_cmd_line, " " CONFIG_CMDLINE,
+sizeof(prom_cmd_line));
+
prom_printf("command line: %s\n", prom_cmd_line);
 
 #ifdef CONFIG_PPC64
-- 
2.22.0



Re: [PATCH] powerpc: Support CMDLINE_EXTEND

2019-07-31 Thread Chris Packham
On Wed, 2019-07-31 at 09:23 +0200, Christophe Leroy wrote:
> 
> Le 30/07/2019 à 23:10, Chris Packham a écrit :
> > 
> > Hi Christophe,
> > 
> > On Tue, 2019-07-30 at 09:02 +0200, Christophe Leroy wrote:
> > > 
> > > 
> > > Le 24/07/2019 à 07:33, Chris Packham a écrit :
> > > > 
> > > > 
> > > > Device tree aware platforms can make use of CMDLINE_EXTEND to
> > > > extend the
> > > > kernel command line provided by the bootloader. This is
> > > > particularly
> > > > useful to set parameters for built-in modules that would
> > > > otherwise
> > > > be
> > > > done at module insertion. Add support for this in the powerpc
> > > > architecture.
> > > > 
> > > > Signed-off-by: Chris Packham  > > > >
> > > > ---
> > > >    arch/powerpc/Kconfig | 12 
> > > I think you also have to implement some stuff in
> > > early_cmdline_parse()
> > > in arch/powerpc/kernel/prom_init.c
> > I my case I didn't need to since the generic code
> > in drivers/of/fdt.c
> > did what I need. For early options or platforms that don't use a
> > device
> > tree then I can see why I'd need the update to update to prom_init.
> > 
> > > 
> > > 
> > > Maybe look at https://patchwork.ozlabs.org/patch/1074126/
> > > 
> > Do you mind if I take this and fold it into a v2 of my patch? Any
> > particular reason it didn't get picked up in April?
> Sure, take it, I don't mind.
> 
> Two reasons it was not picked up in April I believe:
> - It was part of a larger series 
> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=10051
> 8) 
> and was intended to challenge the series proposed by Daniel 
> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=98106
> ) 
> but nothing happened.
> - It was conflicting with the ongoing changes for implementing KASAN.
> 
> What you will have to do is to define prom_strlcat() in the same
> spirit 
> as https://patchwork.ozlabs.org/patch/1091621/ by copying it from 
> lib/string.c and 

Is it OK to use BUG_ON in prom_init? If I copy it verbatim then the
code from lib/string.c has a BUG_ON. I could probably change that to
if(x) return -1 if BUG_ON is not appropriate.

> I think you'll be able to drop prom_strlcpy() as that 
> function what only used there.

I think I need to keep prom_strlcpy to handle the CMDLINE_FORCE case.

> 
> Christophe
> 
> > 
> > 
> > > 
> > > Christophe
> > > 
> > > > 
> > > > 
> > > >    1 file changed, 12 insertions(+)
> > > > 
> > > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > > > index d8dcd8820369..cd9b3974aa36 100644
> > > > --- a/arch/powerpc/Kconfig
> > > > +++ b/arch/powerpc/Kconfig
> > > > @@ -851,6 +851,11 @@ config CMDLINE
> > > >       some command-line options at build time by
> > > > entering
> > > > them here.  In
> > > >       most cases you will need to specify the root
> > > > device
> > > > here.
> > > >    
> > > > +choice
> > > > +   prompt "Kernel command line type" if CMDLINE != ""
> > > > +   default CMDLINE_FORCE
> > > > +   depends on CMDLINE_BOOL
> > > > +
> > > >    config CMDLINE_FORCE
> > > >     bool "Always use the default kernel command string"
> > > >     depends on CMDLINE_BOOL
> > > > @@ -860,6 +865,13 @@ config CMDLINE_FORCE
> > > >       This is useful if you cannot or don't want to
> > > > change
> > > > the
> > > >       command-line options your boot loader passes to
> > > > the
> > > > kernel.
> > > >    
> > > > +config CMDLINE_EXTEND
> > > > +   bool "Extend bootloader kernel arguments"
> > > > +   help
> > > > +     The command-line arguments provided by the boot
> > > > loader
> > > > will be
> > > > +     appended to the default kernel command string.
> > > > +endchoice
> > > > +
> > > >    config EXTRA_TARGETS
> > > >     string "Additional default image types"
> > > >     help

Re: [PATCH] powerpc: Support CMDLINE_EXTEND

2019-07-30 Thread Chris Packham
Hi Christophe,

On Tue, 2019-07-30 at 09:02 +0200, Christophe Leroy wrote:
> 
> Le 24/07/2019 à 07:33, Chris Packham a écrit :
> > 
> > Device tree aware platforms can make use of CMDLINE_EXTEND to
> > extend the
> > kernel command line provided by the bootloader. This is
> > particularly
> > useful to set parameters for built-in modules that would otherwise
> > be
> > done at module insertion. Add support for this in the powerpc
> > architecture.
> > 
> > Signed-off-by: Chris Packham 
> > ---
> >   arch/powerpc/Kconfig | 12 
> I think you also have to implement some stuff in
> early_cmdline_parse() 
> in arch/powerpc/kernel/prom_init.c

I my case I didn't need to since the generic code in drivers/of/fdt.c
did what I need. For early options or platforms that don't use a device
tree then I can see why I'd need the update to update to prom_init.

> 
> Maybe look at https://patchwork.ozlabs.org/patch/1074126/
> 

Do you mind if I take this and fold it into a v2 of my patch? Any
particular reason it didn't get picked up in April?

> Christophe
> 
> > 
> >   1 file changed, 12 insertions(+)
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index d8dcd8820369..cd9b3974aa36 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -851,6 +851,11 @@ config CMDLINE
> >       some command-line options at build time by entering
> > them here.  In
> >       most cases you will need to specify the root device
> > here.
> >   
> > +choice
> > +   prompt "Kernel command line type" if CMDLINE != ""
> > +   default CMDLINE_FORCE
> > +   depends on CMDLINE_BOOL
> > +
> >   config CMDLINE_FORCE
> >     bool "Always use the default kernel command string"
> >     depends on CMDLINE_BOOL
> > @@ -860,6 +865,13 @@ config CMDLINE_FORCE
> >       This is useful if you cannot or don't want to change
> > the
> >       command-line options your boot loader passes to the
> > kernel.
> >   
> > +config CMDLINE_EXTEND
> > +   bool "Extend bootloader kernel arguments"
> > +   help
> > +     The command-line arguments provided by the boot loader
> > will be
> > +     appended to the default kernel command string.
> > +endchoice
> > +
> >   config EXTRA_TARGETS
> >     string "Additional default image types"
> >     help
> > 

[PATCH] powerpc: Support CMDLINE_EXTEND

2019-07-23 Thread Chris Packham
Device tree aware platforms can make use of CMDLINE_EXTEND to extend the
kernel command line provided by the bootloader. This is particularly
useful to set parameters for built-in modules that would otherwise be
done at module insertion. Add support for this in the powerpc
architecture.

Signed-off-by: Chris Packham 
---
 arch/powerpc/Kconfig | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index d8dcd8820369..cd9b3974aa36 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -851,6 +851,11 @@ config CMDLINE
  some command-line options at build time by entering them here.  In
  most cases you will need to specify the root device here.
 
+choice
+   prompt "Kernel command line type" if CMDLINE != ""
+   default CMDLINE_FORCE
+   depends on CMDLINE_BOOL
+
 config CMDLINE_FORCE
bool "Always use the default kernel command string"
depends on CMDLINE_BOOL
@@ -860,6 +865,13 @@ config CMDLINE_FORCE
  This is useful if you cannot or don't want to change the
  command-line options your boot loader passes to the kernel.
 
+config CMDLINE_EXTEND
+   bool "Extend bootloader kernel arguments"
+   help
+ The command-line arguments provided by the boot loader will be
+ appended to the default kernel command string.
+endchoice
+
 config EXTRA_TARGETS
string "Additional default image types"
help
-- 
2.22.0



pci memory resources not being assigned to bridge

2019-07-03 Thread Chris Packham
Hi,

I'm in the process of updating our products from a 4.4 based kernel to 
5.1 (and probably 5.2 since that release is imminent).

On one product which uses a Freescale/NXP P2041 CPU, IDT pcie bridge and 
Marvell switch chip[1].

Annoyingly the hardware has a reset line that holds the switch chip in 
reset but the bridge chip has a separate reset line that is not held 
high by default. So when the system starts up the initial scan sees the 
bridge and assigns resource to its end points. Later once the switch 
chip reset is released (either via a gpio-hog or manually) and the bus 
rescanned the switch devices are detected but the host side of the pci 
bridge isn't assigned any memory space so pci accesses trigger a master 
abort which on that SoC leads to a machine check exception.

[root@linuxbox ~]# uname -a
Linux linuxbox 5.2.0-rc7-at1+ #8 SMP Thu Jul 4 04:26:18 UTC 2019 ppc 
GNU/Linu[root@linuxbox ~]# lspci -t
-+-[2000:00]---00.0-[01-06]00.0-[02-06]--+-02.0-[03]--
  |   +-03.0-[04]--
  |   +-04.0-[05]--
  |   \-05.0-[06]--
  \-[:00]-
[root@linuxbox ~]# lspci -v -s 2000:01:00
2000:01:00.0 Class 0604: Device 111d:803e (rev 0e)
 Flags: bus master, fast devsel, latency 0
 Bus: primary=01, secondary=02, subordinate=06, sec-latency=0
 I/O behind bridge: None
 Memory behind bridge: None
 Prefetchable memory behind bridge: None
 Capabilities: [40] Express Upstream Port, MSI 00
 Capabilities: [c0] Power Management version 3
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [200] Virtual Channel
 Kernel driver in use: pcieport

[root@linuxbox ~]# echo 472 >/sys/class/gpio/export
[root@linuxbox ~]# echo out >/sys/class/gpio/gpio472/direction
[root@linuxbox ~]# echo 0 >/sys/class/gpio/gpio472/value
[root@linuxbox ~]# echo 1 >/sys/bus/pci/rescan
pci 2000:03:00.0: [11ab:e097] type 00 class 0x058000
pci 2000:03:00.0: reg 0x10: [mem 0xd000-0xd00f 64bit pref]
pci 2000:03:00.0: reg 0x18: [mem 0x-0x03ff 64bit]
pci 2000:03:00.0: Adding to iommu group 15
pci_bus 2000:03: busn_res: [bus 03] end is updated to 03
pci 2000:04:00.0: [11ab:e097] type 00 class 0x058000
pci 2000:04:00.0: reg 0x10: [mem 0xd000-0xd00f 64bit pref]
pci 2000:04:00.0: reg 0x18: [mem 0x-0x03ff 64bit]
pci 2000:04:00.0: Adding to iommu group 15
pci_bus 2000:04: busn_res: [bus 04] end is updated to 04
pci 2000:05:00.0: [11ab:e097] type 00 class 0x058000
pci 2000:05:00.0: reg 0x10: [mem 0xd000-0xd00f 64bit pref]
pci 2000:05:00.0: reg 0x18: [mem 0x-0x03ff 64bit]
pci 2000:05:00.0: Adding to iommu group 15
pci_bus 2000:05: busn_res: [bus 05] end is updated to 05
pci 2000:06:00.0: [11ab:e097] type 00 class 0x058000
pci 2000:06:00.0: reg 0x10: [mem 0xd000-0xd00f 64bit pref]
pci 2000:06:00.0: reg 0x18: [mem 0x-0x03ff 64bit]
pci 2000:06:00.0: Adding to iommu group 15
pci_bus 2000:06: busn_res: [bus 06] end is updated to 06
pci_bus 2000:02: busn_res: [bus 02-06] end is updated to 06
pci_bus 2000:01: busn_res: [bus 01-06] end is updated to 06
pcieport 2000:00:00.0: BAR 9: no space for [mem size 0x0040 64bit pref]
pcieport 2000:00:00.0: BAR 9: failed to assign [mem size 0x0040 
64bit pref]
pcieport 2000:01:00.0: BAR 8: assigned [mem 0xc4000-0xc4fff]
pcieport 2000:01:00.0: BAR 9: assigned [mem 0xc5000-0xc503f 
64bit pref]
pci 2000:02:02.0: BAR 8: assigned [mem 0xc4000-0xc43ff]
pci 2000:02:03.0: BAR 8: assigned [mem 0xc4400-0xc47ff]
pci 2000:02:04.0: BAR 8: assigned [mem 0xc4800-0xc4bff]
pci 2000:02:05.0: BAR 8: assigned [mem 0xc4c00-0xc4fff]
pci 2000:02:02.0: BAR 9: assigned [mem 0xc5000-0xc500f 64bit pref]
pci 2000:02:03.0: BAR 9: assigned [mem 0xc5010-0xc501f 64bit pref]
pci 2000:02:04.0: BAR 9: assigned [mem 0xc5020-0xc502f 64bit pref]
pci 2000:02:05.0: BAR 9: assigned [mem 0xc5030-0xc503f 64bit pref]
pci 2000:03:00.0: BAR 2: assigned [mem 0xc4000-0xc43ff 64bit]
pci 2000:03:00.0: BAR 0: assigned [mem 0xc5000-0xc500f 64bit pref]
pci 2000:02:02.0: PCI bridge to [bus 03]
pci 2000:02:02.0:   bridge window [mem 0xc4000-0xc43ff]
pci 2000:02:02.0:   bridge window [mem 0xc5000-0xc500f 64bit pref]
pci 2000:04:00.0: BAR 2: assigned [mem 0xc4400-0xc47ff 64bit]
pci 2000:04:00.0: BAR 0: assigned [mem 0xc5010-0xc501f 64bit pref]
pci 2000:02:03.0: PCI bridge to [bus 04]
pci 2000:02:03.0:   bridge window [mem 0xc4400-0xc47ff]
pci 2000:02:03.0:   bridge window [mem 0xc5010-0xc501f 64bit pref]
pci 2000:05:00.0: BAR 2: assigned [mem 0xc4800-0xc4bff 64bit]
pci 2000:05:00.0: BAR 0: assigned [mem 0xc5020-0xc502f 64bit pref]
pci 2000:02:04.0: PCI bridge to [bus 05]
pci 2000:02:04.0:   bridge window [mem 0xc4800-0xc4bff]

Re: [PATCH] EDAC, mv64x60: Remove some code duplication

2018-01-14 Thread Chris Packham
Hi Christophe,

On 14/01/18 06:17, Christophe JAILLET wrote:
> Le 13/01/2018 à 15:22, Borislav Petkov a écrit :
>> + Chris Packham who's been fixing some stuff in here too.
>>
>> On Sat, Jan 13, 2018 at 08:28:21AM +0100, Christophe JAILLET wrote:
>>> Reorder the error handling code in order to release the resources in
>>> reverse order than allocation.
>>>
>>> Introduce a new 'release_group' label in the error handling path and use
>>> it to void some code duplication.
>>>
>>> Signed-off-by: Christophe JAILLET <christophe.jail...@wanadoo.fr>
>>> ---
>>>drivers/edac/mv64x60_edac.c | 7 ---
>>>1 file changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
>>> index 3c68bb525d5d..aa5bc1d8f424 100644
>>> --- a/drivers/edac/mv64x60_edac.c
>>> +++ b/drivers/edac/mv64x60_edac.c
>>> @@ -450,8 +450,8 @@ static int mv64x60_cpu_err_probe(struct platform_device 
>>> *pdev)
>>>   "cpu", 1, NULL, 0, 0, NULL, 0,
>>>   edac_dev_idx);
>>> if (!edac_dev) {
>>> -   devres_release_group(>dev, mv64x60_cpu_err_probe);
>>> -   return -ENOMEM;
>>> +   res = -ENOMEM;
>>> +   goto release_group;
>>> }
>>>
>>> pdata = edac_dev->pvt_info;
>>> @@ -561,8 +561,9 @@ static int mv64x60_cpu_err_probe(struct platform_device 
>>> *pdev)
>>>err2:
>>> edac_device_del_device(>dev);
>>>err:
>>> -   devres_release_group(>dev, mv64x60_cpu_err_probe);
>>> edac_device_free_ctl_info(edac_dev);
>>> +release_group:
>>> +   devres_release_group(>dev, mv64x60_cpu_err_probe);
>>> return res;
>>>}
>>>
>>> -- 
>> Thanks, looks good. But looking at this driver, mv64x60_mc_err_probe()
>> and mv64x60_sram_err_probe() have the same problem too. Can you address them
>> with your patch too pls?
> Will do. mv64x60_pci_err_probe() also needs some tweaks.
> 
>> Also, if you feel like fixing more stuff in this driver, it doesn't use
>> the edac_printk() infrastructure but naked printk() calls. It could be
>> converted to it.
> I will only propose to remove a useless message and improve another one,
> but won't convert the whole driver, sorry.
> 

I take this you mean you have a system with a mv64x60 SoC? You might 
want to make yourself known to the linuxppc-dev list. A while back the 
prospects of dropping CONFIG_MV64X60 was raised[1]. I don't see anyone 
actually following through on this yet but I'm not really following 
linuxppc that closely.

[1] - https://marc.info/?l=linux-edac=149518763115206=2


Re: [PATCH v3 1/3] EDAC: mv64x60: check driver registration success

2017-06-06 Thread Chris Packham
Hi Borislav,

On 30/05/17 09:21, Chris Packham wrote:
> Check the return status of platform_driver_register() in
> mv64x60_edac_init(). Only output messages and initialise the
> edac_op_state if the registration is successful.
> 
> Signed-off-by: Chris Packham <chris.pack...@alliedtelesis.co.nz>
> ---
> Changes in v3:
> - catch the retval of platform_register_drivers and return early on failure
>(thanks Borislav).
> 
>   drivers/edac/mv64x60_edac.c | 8 ++--
>   1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
> index 14b7e7b71eaa..172081551a70 100644
> --- a/drivers/edac/mv64x60_edac.c
> +++ b/drivers/edac/mv64x60_edac.c
> @@ -853,7 +853,11 @@ static struct platform_driver * const drivers[] = {
>   
>   static int __init mv64x60_edac_init(void)
>   {
> - int ret = 0;
> + int ret;
> +
> + ret = platform_register_drivers(drivers, ARRAY_SIZE(drivers));
> + if (ret)
> + return ret;

I actually think this was wrong. The edac_op_state is set as a module 
param and affects the behaviour of the driver probe function which on 
success could be called before we sanity check it below.

I'm back to thinking that my original v1 of just removing the unused ret 
was appropriate. If we still consider the driver too noisy we could just 
remove the printks.

What do you think?

>   
>   printk(KERN_INFO "Marvell MV64x60 EDAC driver " MV64x60_REVISION "\n");
>   printk(KERN_INFO "\t(C) 2006-2007 MontaVista Software\n");
> @@ -867,7 +871,7 @@ static int __init mv64x60_edac_init(void)
>   break;
>   }
>   
> - return platform_register_drivers(drivers, ARRAY_SIZE(drivers));
> + return 0;
>   }
>   module_init(mv64x60_edac_init);
>   
> 



EDAC: simplify total memory calculation

2017-06-06 Thread Chris Packham
This take the approach used by cell_edac.c to obtain the total memory from
the devicetree and applies it to mv64x60_edac.c, altera_edac.c and
cpc925_edac.c which were all manually parsing the reg property. In the case
of mv64x60 this actually fixes cases where #address/size-cells != 1. For
altera and cpc925 this is just a cleanup.

Chris Packham (3):
  EDAC: mv64x60: calculate memory size correctly
  EDAC: altera: simplify calculation of total memory
  EDAC: cpc925: simplify calculation of total memory

 drivers/edac/altera_edac.c  | 24 
 drivers/edac/cpc925_edac.c  | 32 ++--
 drivers/edac/mv64x60_edac.c | 17 +++--
 3 files changed, 29 insertions(+), 44 deletions(-)

-- 
2.13.0



[PATCH v2 3/3] EDAC: cpc925: simplify calculation of total memory

2017-06-06 Thread Chris Packham
Use of_address_to_resource() and resource_size() instead of manually
parsing the "reg" property from the "memory" node(s).

Signed-off-by: Chris Packham <chris.pack...@alliedtelesis.co.nz>
---
Changes in v2:
- New

 drivers/edac/cpc925_edac.c | 32 ++--
 1 file changed, 10 insertions(+), 22 deletions(-)

diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 837b62c4993d..ea347cd7eb92 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -296,30 +297,17 @@ struct cpc925_dev_info {
 static void get_total_mem(struct cpc925_mc_pdata *pdata)
 {
struct device_node *np = NULL;
-   const unsigned int *reg, *reg_end;
-   int len, sw, aw;
-   unsigned long start, size;
+   struct resource res;
+   int ret;
 
-   np = of_find_node_by_type(NULL, "memory");
-   if (!np)
-   return;
+   for_each_node_by_type(np, "memory") {
+   ret = of_address_to_resource(np, 0, );
+   if (ret)
+   continue;
+
+   pdata->total_mem += resource_size();
+   }
 
-   aw = of_n_addr_cells(np);
-   sw = of_n_size_cells(np);
-   reg = (const unsigned int *)of_get_property(np, "reg", );
-   reg_end = reg + len/4;
-
-   pdata->total_mem = 0;
-   do {
-   start = of_read_number(reg, aw);
-   reg += aw;
-   size = of_read_number(reg, sw);
-   reg += sw;
-   edac_dbg(1, "start 0x%lx, size 0x%lx\n", start, size);
-   pdata->total_mem += size;
-   } while (reg < reg_end);
-
-   of_node_put(np);
edac_dbg(0, "total_mem 0x%lx\n", pdata->total_mem);
 }
 
-- 
2.13.0



[PATCH v2 2/3] EDAC: altera: simplify calculation of total memory

2017-06-06 Thread Chris Packham
Use of_address_to_resource() and resource_size() instead of manually
parsing the "reg" property from the "memory" node(s).

Signed-off-by: Chris Packham <chris.pack...@alliedtelesis.co.nz>
---
Changes in v2:
- New

 drivers/edac/altera_edac.c | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/edac/altera_edac.c b/drivers/edac/altera_edac.c
index 7717b094fabb..f8b623352627 100644
--- a/drivers/edac/altera_edac.c
+++ b/drivers/edac/altera_edac.c
@@ -214,24 +214,16 @@ static void altr_sdr_mc_create_debugfs_nodes(struct 
mem_ctl_info *mci)
 static unsigned long get_total_mem(void)
 {
struct device_node *np = NULL;
-   const unsigned int *reg, *reg_end;
-   int len, sw, aw;
-   unsigned long start, size, total_mem = 0;
+   struct resource res;
+   int ret;
+   unsigned long total_mem = 0;
 
for_each_node_by_type(np, "memory") {
-   aw = of_n_addr_cells(np);
-   sw = of_n_size_cells(np);
-   reg = (const unsigned int *)of_get_property(np, "reg", );
-   reg_end = reg + (len / sizeof(u32));
-
-   total_mem = 0;
-   do {
-   start = of_read_number(reg, aw);
-   reg += aw;
-   size = of_read_number(reg, sw);
-   reg += sw;
-   total_mem += size;
-   } while (reg < reg_end);
+   ret = of_address_to_resource(np, 0, );
+   if (ret)
+   continue;
+
+   total_mem += resource_size();
}
edac_dbg(0, "total_mem 0x%lx\n", total_mem);
return total_mem;
-- 
2.13.0



  1   2   >