Re: [REGRESSION, bisect] pci: artpec-6: imprecise external abort

2016-10-30 Thread Thorsten Leemhuis
FYI: I added this report to the list of regressions for Linux 4.9. I'll
watch this thread for further updates on this issue to document progress
in my weekly reports. Please let me know via regressi...@leemhuis.info
in case the discussion moves to a different place (bugzilla or another
mail thread for example). tia!

Current status (afaics): Patch available
(https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1249922.html )

Ciao, Thorsten

On 14.10.2016 18:11, Joao Pinto wrote:
> On 10/14/2016 4:24 PM, Niklas Cassel wrote:
>> On 10/14/2016 03:02 PM, Joao Pinto wrote:
>>> Hi Niklas,
>>>
>>>
>>> On 10/14/2016 1:41 PM, Niklas Cassel wrote:
>>>> Hello

> 
> (snip)
> 
>> }
>> }
>>  
>> -   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
>> -
>> if (pp->ops->host_init)
>> pp->ops->host_init(pp);
>>  
>> @@ -809,6 +807,11 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
>>  {
>> u32 val;
>>  
>> +   /* get iATU unroll support */
>> +   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
>> +   dev_dbg(pp->dev, "iATU unroll: %s\n",
>> +pp->iatu_unroll_enabled ? "enabled" : "disabled");
>> +
>> /* set the number of lanes */
>> val = dw_pcie_readl_rc(pp, PCIE_PORT_LINK_CONTROL);
>> val &= ~PORT_LINK_MODE_MASK;
>>
> 
> Seems reasonable to me. Please make this an official patch and I will get it
> tested for you.
> 
> Thanks.
> 
>>
>>
>> With my patch I get:
>>
>> [0.976044] OF: PCI: host bridge /pcie@f805 ranges:
>> [0.981307] OF: PCI:IO 0xc0002000..0xc0011fff -> 0x
>> [0.987240] OF: PCI:   MEM 0xc0012000..0xdfff -> 0xc0012000
>> [1.010590] artpec6-pcie f805.pcie: iATU unroll: disabled
>> [1.116381] artpec6-pcie f805.pcie: link up
>> [1.121044] artpec6-pcie f805.pcie: PCI host bridge to bus :00
>>
>> and no SIGBUS/imprecise external abort.
>>
>>
>>
>> The only users of dw_pcie_prog_outbound_atu are
>> dw_pcie_rd_conf, dw_pcie_wr_conf and dw_pcie_setup_rc.
>>
>> As long as dw_pcie_setup_rc calls dw_pcie_iatu_unroll_enabled
>> before calling dw_pcie_prog_outbound_atu,
>> we should be fine (as done in my patch).
>>
>> dw_pcie_rd_conf and dw_pcie_wr_conf are only used by
>> struct pci_ops dw_pcie_ops, which is only used as an argument
>> for pci_scan_root_bus_msi and pci_scan_root_bus
>> (both are called after pp->ops->host_init, i.e.,
>> after dw_pcie_setup_rc). (My patch should be fine for
>> this code path too.)
>>
>>
>> The only other solution would be to break out some code
>> from artpec6_pcie_establish_link and move that to
>> artpec6_pcie_probe.
>> But in that case I would highly recommend that all other
>> dwc-based drivers verify that they are still working after
>> a0601a470537 ("PCI: designware: Add iATU Unroll feature"),
>> since they might also first enable their PCI Express interface
>> module in pp->ops->host_init().
>>
>>
>>>> From the ARTPEC-6 SoC manual:
>>>> By default, the PCI Express interface shall be held in reset and
>>>> clock-gated.
>>>> Software is required to enable the relevant modules
>>>> (turns on clocks and de-asserts reset) before these modules can be used.
>>>>
>>>> Turning on the clocks and de-asserting reset is done in
>>>> pp->ops->host_init().
>>>> We get an external abort when calling dw_pcie_iatu_unroll_enabled,
>>>> since that function does a read from the IP before we are allowed to do
>>>> AXI transfers (at least in the ARTPEC-6 case, might be the same for some
>>>> other SoCs).
>>>>
>>>> It appears that dw_pcie_iatu_unroll_enabled was actually called _before_
>>>> host_init() in v4 of Joao's patch, but was changed to after host_init() in
>>>> v5,
>>>> unfortunately the patch doesn't state a reason for the move.
>>>>

>>
> 
> 
> http://news.gmane.org/find-root.php?message_id=99a01d19-2b5e-19e4-7e73-286acf1684c4%40synopsys.com
>  
> http://mid.gmane.org/99a01d19-2b5e-19e4-7e73-286acf1684c4%40synopsys.com
> 


Re: [REGRESSION, bisect] pci: artpec-6: imprecise external abort

2016-10-14 Thread Joao Pinto
On 10/14/2016 4:24 PM, Niklas Cassel wrote:
> On 10/14/2016 03:02 PM, Joao Pinto wrote:
>> Hi Niklas,
>>
>>
>> On 10/14/2016 1:41 PM, Niklas Cassel wrote:
>>> Hello
>>>

(snip)

> }
> }
>  
> -   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
> -
> if (pp->ops->host_init)
> pp->ops->host_init(pp);
>  
> @@ -809,6 +807,11 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
>  {
> u32 val;
>  
> +   /* get iATU unroll support */
> +   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
> +   dev_dbg(pp->dev, "iATU unroll: %s\n",
> +pp->iatu_unroll_enabled ? "enabled" : "disabled");
> +
> /* set the number of lanes */
> val = dw_pcie_readl_rc(pp, PCIE_PORT_LINK_CONTROL);
> val &= ~PORT_LINK_MODE_MASK;
> 

Seems reasonable to me. Please make this an official patch and I will get it
tested for you.

Thanks.

> 
> 
> With my patch I get:
> 
> [0.976044] OF: PCI: host bridge /pcie@f805 ranges:
> [0.981307] OF: PCI:IO 0xc0002000..0xc0011fff -> 0x
> [0.987240] OF: PCI:   MEM 0xc0012000..0xdfff -> 0xc0012000
> [1.010590] artpec6-pcie f805.pcie: iATU unroll: disabled
> [1.116381] artpec6-pcie f805.pcie: link up
> [1.121044] artpec6-pcie f805.pcie: PCI host bridge to bus :00
> 
> and no SIGBUS/imprecise external abort.
> 
> 
> 
> The only users of dw_pcie_prog_outbound_atu are
> dw_pcie_rd_conf, dw_pcie_wr_conf and dw_pcie_setup_rc.
> 
> As long as dw_pcie_setup_rc calls dw_pcie_iatu_unroll_enabled
> before calling dw_pcie_prog_outbound_atu,
> we should be fine (as done in my patch).
> 
> dw_pcie_rd_conf and dw_pcie_wr_conf are only used by
> struct pci_ops dw_pcie_ops, which is only used as an argument
> for pci_scan_root_bus_msi and pci_scan_root_bus
> (both are called after pp->ops->host_init, i.e.,
> after dw_pcie_setup_rc). (My patch should be fine for
> this code path too.)
> 
> 
> The only other solution would be to break out some code
> from artpec6_pcie_establish_link and move that to
> artpec6_pcie_probe.
> But in that case I would highly recommend that all other
> dwc-based drivers verify that they are still working after
> a0601a470537 ("PCI: designware: Add iATU Unroll feature"),
> since they might also first enable their PCI Express interface
> module in pp->ops->host_init().
> 
> 
>>> From the ARTPEC-6 SoC manual:
>>> By default, the PCI Express interface shall be held in reset and 
>>> clock-gated.
>>> Software is required to enable the relevant modules
>>> (turns on clocks and de-asserts reset) before these modules can be used.
>>>
>>> Turning on the clocks and de-asserting reset is done in 
>>> pp->ops->host_init().
>>> We get an external abort when calling dw_pcie_iatu_unroll_enabled,
>>> since that function does a read from the IP before we are allowed to do
>>> AXI transfers (at least in the ARTPEC-6 case, might be the same for some
>>> other SoCs).
>>>
>>> It appears that dw_pcie_iatu_unroll_enabled was actually called _before_
>>> host_init() in v4 of Joao's patch, but was changed to after host_init() in 
>>> v5,
>>> unfortunately the patch doesn't state a reason for the move.
>>>
> 



Re: [REGRESSION, bisect] pci: artpec-6: imprecise external abort

2016-10-14 Thread Niklas Cassel
On 10/14/2016 03:02 PM, Joao Pinto wrote:
> Hi Niklas,
>
>
> On 10/14/2016 1:41 PM, Niklas Cassel wrote:
>> Hello
>>
>> Because of recent changes to drivers/pci/host/pcie-artpec6.c,
>> I was going to try out Bjorn's tag pci-v4.9-changes-2,
>> however I was greeted by an imprecise external abort:
>>
>>
>> [0.613082] Trying to unpack rootfs image as initramfs...
>> [0.886577] Freeing initrd memory: 4724K (c290 - c2d9d000)
> (snip)
>
>
>> [1.282723] [] (driver_register) from [] (do_one_initcall+0x44/0x174)
>> [1.290919] [] (do_one_initcall) from [] (kernel_init_freeable+0x158/0x1e8)
>> [1.299636] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x10c)
>> [1.307828] [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
>> [1.315404] Code: eafffef9 e5943008 e5930900 f57ff04f (ea69)
>> [1.321503] ---[ end trace b458093682b1fb9a ]---
>>
>>
>> a git bisect later and the cause appears to be a0601a470537
>> ("PCI: designware: Add iATU Unroll feature")
>>
>> the following patch gives me a working system again:
>>
>> diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
>> index 035f50c03281..74510508fafc 100644
>> --- a/drivers/pci/host/pcie-designware.c
>> +++ b/drivers/pci/host/pcie-designware.c
>> @@ -637,11 +637,11 @@ int dw_pcie_host_init(struct pcie_port *pp)
>> }
>> }
>>  
>> -   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
>> -
>> if (pp->ops->host_init)
>> pp->ops->host_init(pp);
>>  
>> +   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
>> +
>> pp->root_bus_nr = pp->busn->start;
>> if (IS_ENABLED(CONFIG_PCI_MSI)) {
>> bus = pci_scan_root_bus_msi(pp->dev, pp->root_bus_nr,
> Before invoking the host initialization routine, the pcie driver must check if
> it is going to work in Unroll Mode or not. Your workaround unfortunately is not
> valid, because you are forcing the host init to be always in legacy mode since
> pp->iatu_unroll_enabled will be 0 (Legacy).
>
> If you check, the driver will consider the iATU mode to be Unroll if the
> PortView register has the value 0x, which in 4.80 core means that the Core has
> Unroll activated. From what you are referring to, I think that in your setup, the
> driver is assuming that your Core is in Unroll Mode for some reason. Could you
> please check the value of the PortView Register?

I cannot read the PortView register (call dw_pcie_iatu_unroll_enabled),
before calling pp->ops->host_init() (artpec6_pcie_host_init),
since that results in an imprecise external abort.
The value from my print is never displayed before crashing.

The reason why we get an imprecise external abort is because the
PCI Express interface module is by default disabled in the ARTPEC-6 SoC
system controller.
Doing an AXI transfer before the module is enabled will result in a
SIGBUS/imprecise external abort.

The PCI Express interface module gets enabled in artpec6_pcie_establish_link
(which is called from pp->ops->host_init() (artpec6_pcie_host_init)).



I can now see why we cannot move
pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
to after pp->ops->host_init().
pp->ops->host_init() calls dw_pcie_setup_rc, which calls
dw_pcie_prog_outbound_atu, which uses pp->iatu_unroll_enabled.


How about this:

diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
index 035f50c03281..09eca2c5601d 100644
--- a/drivers/pci/host/pcie-designware.c
+++ b/drivers/pci/host/pcie-designware.c
@@ -637,8 +637,6 @@ int dw_pcie_host_init(struct pcie_port *pp)
 		}
 	}
 
-	pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
-
 	if (pp->ops->host_init)
 		pp->ops->host_init(pp);
 
@@ -809,6 +807,11 @@ void dw_pcie_setup_rc(struct pcie_port *pp)
 {
 	u32 val;
 
+	/* get iATU unroll support */
+	pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
+	dev_dbg(pp->dev, "iATU unroll: %s\n",
+		pp->iatu_unroll_enabled ? "enabled" : "disabled");
+
 	/* set the number of lanes */
 	val = dw_pcie_readl_rc(pp, PCIE_PORT_LINK_CONTROL);
 	val &= ~PORT_LINK_MODE_MASK;



With my patch I get:

[0.976044] OF: PCI: host bridge /pcie@f805 ranges:
[0.981307] OF: PCI:IO 0xc0002000..0xc0011fff -> 0x
[0.987240] OF: PCI:   MEM 0xc0012000..0xdfff -> 0xc0012000
[1.010590] artpec6-pcie f805.pcie: iATU unroll: disabled
[1.116381] artpec6-pcie f805.pcie: link up
[1.121044] artpec6-pcie f805.pcie: PCI host bridge to bus :00

and no SIGBUS/imprecise external abort.



The only users of dw_pcie_prog_outbound_atu are
dw_pcie_rd_conf, dw_pcie_wr_conf and dw_pcie_setup_rc.

As long as dw_pcie_setup_rc calls dw_pcie_iatu_unroll_enabled
before calling dw_pcie_prog_outbound_atu,
we should be fine (as done in my patch).

dw_pcie_rd_conf and dw

Re: [REGRESSION, bisect] pci: artpec-6: imprecise external abort

2016-10-14 Thread Joao Pinto
Hi Niklas,


On 10/14/2016 1:41 PM, Niklas Cassel wrote:
> Hello
> 
> Because of recent changes to drivers/pci/host/pcie-artpec6.c,
> I was going to try out Bjorn's tag pci-v4.9-changes-2,
> however I was greeted by an imprecise external abort:
> 
> 
> [0.613082] Trying to unpack rootfs image as initramfs...
> [0.886577] Freeing initrd memory: 4724K (c290 - c2d9d000)

(snip)


> [1.282723] [] (driver_register) from [] (do_one_initcall+0x44/0x174)
> [1.290919] [] (do_one_initcall) from [] (kernel_init_freeable+0x158/0x1e8)
> [1.299636] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x10c)
> [1.307828] [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
> [1.315404] Code: eafffef9 e5943008 e5930900 f57ff04f (ea69)
> [1.321503] ---[ end trace b458093682b1fb9a ]---
> 
> 
> a git bisect later and the cause appears to be a0601a470537
> ("PCI: designware: Add iATU Unroll feature")
> 
> the following patch gives me a working system again:
> 
> diff --git a/drivers/pci/host/pcie-designware.c b/drivers/pci/host/pcie-designware.c
> index 035f50c03281..74510508fafc 100644
> --- a/drivers/pci/host/pcie-designware.c
> +++ b/drivers/pci/host/pcie-designware.c
> @@ -637,11 +637,11 @@ int dw_pcie_host_init(struct pcie_port *pp)
> }
> }
>  
> -   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
> -
> if (pp->ops->host_init)
> pp->ops->host_init(pp);
>  
> +   pp->iatu_unroll_enabled = dw_pcie_iatu_unroll_enabled(pp);
> +
> pp->root_bus_nr = pp->busn->start;
> if (IS_ENABLED(CONFIG_PCI_MSI)) {
> bus = pci_scan_root_bus_msi(pp->dev, pp->root_bus_nr,

Before invoking the host initialization routine, the pcie driver must check if
it is going to work in Unroll Mode or not. Your workaround unfortunately is not
valid, because you are forcing the host init to be always in legacy mode since
pp->iatu_unroll_enabled will be 0 (Legacy).

If you check, the driver will consider the iATU mode to be Unroll if the PortView
register has the value 0x, which in 4.80 core means that the Core has
Unroll activated. From what you are referring to, I think that in your setup, the
driver is assuming that your Core is in Unroll Mode for some reason. Could you
please check the value of the PortView Register?

Thanks.

Joao

> 
> 
> From the ARTPEC-6 SoC manual:
> By default, the PCI Express interface shall be held in reset and clock-gated.
> Software is required to enable the relevant modules
> (turns on clocks and de-asserts reset) before these modules can be used.
> 
> Turning on the clocks and de-asserting reset is done in pp->ops->host_init().
> We get an external abort when calling dw_pcie_iatu_unroll_enabled,
> since that function does a read from the IP before we are allowed to do
> AXI transfers (at least in the ARTPEC-6 case, might be the same for some
> other SoCs).
> 
> It appears that dw_pcie_iatu_unroll_enabled was actually called _before_
> host_init() in v4 of Joao's patch, but was changed to after host_init() in v5,
> unfortunately the patch doesn't state a reason for the move.
>