Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-14 Thread Rob Herring
On Mon, Dec 14, 2020 at 4:43 AM Daniel Thompson
 wrote:
>
> On Fri, Dec 11, 2020 at 05:05:58PM +, Daniel Thompson wrote:
> > On Fri, Dec 11, 2020 at 08:37:40AM -0600, Rob Herring wrote:
> > > On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
> > > > BTW I noticed many other pcie-designware drivers take advantage
> > > > of a function called dw_pcie_wait_for_link() in their init paths...
> > > > but my naive attempts to add it to the layerscape driver results
> > > > in non-booting systems so I haven't embarrassed myself by including
> > > > that in the patch!
> > >
> > > You need to look at what's pending for v5.11, because I reworked this
> > > to be more unified. The ordering of init is also possibly changed. The
> > > sequence is now like this:
> > >
> > > dw_pcie_setup_rc(pp);
> > > dw_pcie_msi_init(pp);
> > >
> > > if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
> > > ret = pci->ops->start_link(pci);
> > > if (ret)
> > > goto err_free_msi;
> > > }
> > >
> > > /* Ignore errors, the link may come up later */
> > > dw_pcie_wait_for_link(pci);
> >
> > Thanks. That looks likely to fix it since IIUC dw_pcie_wait_for_link()
> > will end up waiting somewhat like the double check I added to
> > ls_pcie_link_up().
> >
> > I'll take a look at let you know.
>
> Yes. These changes have fixed the enumeration problems for me.
>
> I tested pci/next and I cherry picked your patch series onto v5.10 and
> both are working well.
>
> Given this fixes a bug for me, do you think there is any scope for me
> to whittle down your series into patches for the stable kernels or am
> I likely to find too many extra bits being pulled in?

I think I'd just go the adding a delay route. It's a fairly big series
and depends on my other clean-up done in 5.10. And there's at least
some possibility it regresses some platform given the limited testing
linux-next gets.

Rob


Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-14 Thread Daniel Thompson
On Fri, Dec 11, 2020 at 05:05:58PM +, Daniel Thompson wrote:
> On Fri, Dec 11, 2020 at 08:37:40AM -0600, Rob Herring wrote:
> > On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
> > > BTW I noticed many other pcie-designware drivers take advantage
> > > of a function called dw_pcie_wait_for_link() in their init paths...
> > > but my naive attempts to add it to the layerscape driver results
> > > in non-booting systems so I haven't embarrassed myself by including
> > > that in the patch!
> > 
> > You need to look at what's pending for v5.11, because I reworked this
> > to be more unified. The ordering of init is also possibly changed. The
> > sequence is now like this:
> > 
> > dw_pcie_setup_rc(pp);
> > dw_pcie_msi_init(pp);
> > 
> > if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
> > ret = pci->ops->start_link(pci);
> > if (ret)
> > goto err_free_msi;
> > }
> > 
> > /* Ignore errors, the link may come up later */
> > dw_pcie_wait_for_link(pci);
> 
> Thanks. That looks likely to fix it since IIUC dw_pcie_wait_for_link()
> will end up waiting somewhat like the double check I added to
> ls_pcie_link_up().
> 
> I'll take a look at let you know.

Yes. These changes have fixed the enumeration problems for me.

I tested pci/next and I cherry picked your patch series onto v5.10 and
both are working well.

Given this fixes a bug for me, do you think there is any scope for me
to whittle down your series into patches for the stable kernels or am
I likely to find too many extra bits being pulled in?


Daniel.


Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-11 Thread Daniel Thompson
On Fri, Dec 11, 2020 at 08:37:40AM -0600, Rob Herring wrote:
> On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
>  wrote:
> >
> > I have been chasing down a problem enumerating an NVMe drive on a
> > Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate
> > successfully if the we are emitting lots of console messages via a UART.
> > If the system is booted with `quiet` set then enumeration fails.
> 
> We really need to get away from work-arounds for device X on host Y. I
> suspect they are not limited to that combination only...

No objection on that. This patch was essentially sharing the result of
an investigation where I got stuck at the "now fix it properly" stage!


> How exactly does it fail to enumerate? There's a (racy) linkup check
> on config accesses, is it reporting link down and skipping config
> accesses?

In dmesg terms it looked like this:

--- nvme_borked_gpu_working.linted.dmesg2020-11-18 15:28:35.544118213 
+
+++ nvme_working_gpu_working.linted.dmesg   2020-11-18 15:28:56.180076946 
+
@@ -241,11 +241,19 @@
 pci :00:00.0: reg 0x38: [mem 0x904800-0x9048ff pref]
 pci :00:00.0: supports D1 D2
 pci :00:00.0: PME# supported from D0 D1 D2 D3hot
+pci :01:00.0: [15b7:5009] type 00 class 0x010802
+pci :01:00.0: reg 0x10: [mem 0x904900-0x9049003fff 64bit]
+pci :01:00.0: reg 0x20: [mem 0x9049004000-0x90490040ff 64bit]
 pci :00:00.0: BAR 1: assigned [mem 0x904000-0x9043ff]
 pci :00:00.0: BAR 0: assigned [mem 0x904400-0x9044ff]
 pci :00:00.0: BAR 6: assigned [mem 0x904500-0x9045ff pref]
+pci :00:00.0: BAR 14: assigned [mem 0x904600-0x90460f]
+pci :01:00.0: BAR 0: assigned [mem 0x904600-0x9046003fff 64bit]
+pci :01:00.0: BAR 4: assigned [mem 0x9046004000-0x90460040ff 64bit]
 pci :00:00.0: PCI bridge to [bus 01-ff]
+pci :00:00.0:   bridge window [mem 0x904600-0x90460f]
 pci :00:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  
256
+pci :01:00.0: Max Payload Size set to  256/ 512 (was  128), Max Read Rq  
256
 layerscape-pcie 380.pcie: host bridge /soc/pcie@380 ranges:
 layerscape-pcie 380.pcie:  MEM 0xa04000..0xa07fff -> 
0x004000
 layerscape-pcie 380.pcie: PCI host bridge to bus 0001:00

... and be aware that the last three lines here are another PCIe
controller coming up just fine and it is about to detect the GPU
(which like the NVMe is also gen3) just fine whether or not we
add a short delay.


> > I guessed this would be due to the timing impact of printk-to-UART and
> > tried to find out where a delay could be added to provoke a successful
> > enumeration.
> >
> > This patch contains the results. The delay length (1ms) was selected by
> > binary searching downwards until the delay was not effective for these
> > devices (Honeycomb LX2K and a Western Digital WD Blue SN550).
> >
> > I have also included the workaround twice (conditionally compiled). The
> > first change is the *latest* possible code path that we can deploy a
> > delay whilst the second is the earliest place I could find.
> >
> > The summary is that the critical window were we are currently relying on
> > a call to the console UART code can "mend" the driver runs from calling
> > dw_pcie_setup_rc() in host init to just before we read the state in the
> > link up callback.
> >
> > Signed-off-by: Daniel Thompson 
> > ---
> >
> > Notes:
> > This patch is RFC (and HACK) because I don't have much clue *why* this
> > patch works... merely that this is the smallest possible change I need
> > to replicate the current accidental printk() workaround.  Perhaps one
> > could argue that RFC here stands for request-for-clue.  All my
> > observations and changes here are empirical and I don't know how best to
> > turn them into something that is not a hack!
> >
> > BTW I noticed many other pcie-designware drivers take advantage
> > of a function called dw_pcie_wait_for_link() in their init paths...
> > but my naive attempts to add it to the layerscape driver results
> > in non-booting systems so I haven't embarrassed myself by including
> > that in the patch!
> 
> You need to look at what's pending for v5.11, because I reworked this
> to be more unified. The ordering of init is also possibly changed. The
> sequence is now like this:
> 
> dw_pcie_setup_rc(pp);
> dw_pcie_msi_init(pp);
> 
> if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
> ret = pci->ops->start_link(pci);
> if (ret)
> goto err_free_msi;
> }
> 
> /* Ignore errors, the link may come up later */
> dw_pcie_wait_for_link(pci);

Thanks. That looks likely to fix it since IIUC dw_pcie_wait_for_link()
will end up waiting somewhat like the double check I added to
ls_pcie_link_up().

I'll take a look at let you know.


Daniel.


Re: [RFC HACK PATCH] PCI: dwc: layerscape: Hack around enumeration problems with Honeycomb LX2K

2020-12-11 Thread Rob Herring
On Fri, Dec 11, 2020 at 6:15 AM Daniel Thompson
 wrote:
>
> I have been chasing down a problem enumerating an NVMe drive on a
> Honeycomb LX2K (NXP LX2160A). Specifically the drive can only enumerate
> successfully if the we are emitting lots of console messages via a UART.
> If the system is booted with `quiet` set then enumeration fails.

We really need to get away from work-arounds for device X on host Y. I
suspect they are not limited to that combination only...

How exactly does it fail to enumerate? There's a (racy) linkup check
on config accesses, is it reporting link down and skipping config
accesses?

> I guessed this would be due to the timing impact of printk-to-UART and
> tried to find out where a delay could be added to provoke a successful
> enumeration.
>
> This patch contains the results. The delay length (1ms) was selected by
> binary searching downwards until the delay was not effective for these
> devices (Honeycomb LX2K and a Western Digital WD Blue SN550).
>
> I have also included the workaround twice (conditionally compiled). The
> first change is the *latest* possible code path that we can deploy a
> delay whilst the second is the earliest place I could find.
>
> The summary is that the critical window were we are currently relying on
> a call to the console UART code can "mend" the driver runs from calling
> dw_pcie_setup_rc() in host init to just before we read the state in the
> link up callback.
>
> Signed-off-by: Daniel Thompson 
> ---
>
> Notes:
> This patch is RFC (and HACK) because I don't have much clue *why* this
> patch works... merely that this is the smallest possible change I need
> to replicate the current accidental printk() workaround.  Perhaps one
> could argue that RFC here stands for request-for-clue.  All my
> observations and changes here are empirical and I don't know how best to
> turn them into something that is not a hack!
>
> BTW I noticed many other pcie-designware drivers take advantage
> of a function called dw_pcie_wait_for_link() in their init paths...
> but my naive attempts to add it to the layerscape driver results
> in non-booting systems so I haven't embarrassed myself by including
> that in the patch!

You need to look at what's pending for v5.11, because I reworked this
to be more unified. The ordering of init is also possibly changed. The
sequence is now like this:

dw_pcie_setup_rc(pp);
dw_pcie_msi_init(pp);

if (!dw_pcie_link_up(pci) && pci->ops->start_link) {
ret = pci->ops->start_link(pci);
if (ret)
goto err_free_msi;
}

/* Ignore errors, the link may come up later */
dw_pcie_wait_for_link(pci);


>  drivers/pci/controller/dwc/pci-layerscape.c | 35 +
>  1 file changed, 35 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pci-layerscape.c 
> b/drivers/pci/controller/dwc/pci-layerscape.c
> index f24f79a70d9a..c354904b90ef 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> @@ -22,6 +22,8 @@
>
>  #include "pcie-designware.h"
>
> +#define WORKAROUND_LATEST_POSSIBLE
> +
>  /* PEX1/2 Misc Ports Status Register */
>  #define SCFG_PEXMSCPORTSR(pex_idx) (0x94 + (pex_idx) * 4)
>  #define LTSSM_STATE_SHIFT  20
> @@ -113,10 +115,31 @@ static int ls_pcie_link_up(struct dw_pcie *pci)
> struct ls_pcie *pcie = to_ls_pcie(pci);
> u32 state;
>
> +   /*
> +* Strictly speaking *this* (before the ioread32) is the latest
> +* point a simple delay can be effective. If we move the delay
> +* after the ioread32 then the NVMe does not enumerate.
> +*
> +* However this function appears to be frequently called so an
> +* unconditional delay here causes noticeable delay at boot
> +* time. Hence we implement the workaround by retrying the read
> +* after a short delay if we think we might need to return false.
> +*/
> +
> state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
>  pcie->drvdata->ltssm_shift) &
>  LTSSM_STATE_MASK;
>
> +#ifdef WORKAROUND_LATEST_POSSIBLE
> +   if (state < LTSSM_PCIE_L0) {
> +   /* see comment above */
> +   mdelay(1);
> +   state = (ioread32(pcie->lut + pcie->drvdata->lut_dbg) >>
> +pcie->drvdata->ltssm_shift) &
> +LTSSM_STATE_MASK;

Side note, I really wonder about the variety of ways in DWC drivers we
have to check link state. The default is a debug register... Are the
standard PCIe registers for this really broken in these cases?

> +   }
> +#endif
> +
> if (state < LTSSM_PCIE_L0)
> return 0;
>
> @@ -152,6 +175,18 @@ static int ls_pcie_host_init(struct pcie_port *pp)
>
> dw_pcie_setup_rc(pp);

Might be related to the PORT_LOGIC_SPEED_CHANGE we