Hi, Ben

Felix Radensky wrote:
Hi, Ben

Adding Feng Kan from AMCC to CC.

Benjamin Herrenschmidt wrote:
On Mon, 2009-12-28 at 12:51 +0200, Felix Radensky wrote:
Hi,

I'm running linux-2.6.33-rc2 on Canyonlands board. When PLX 6254 transparent PCI-PCI bridge is plugged into PCI slot the kernel simply resets the board without printing anything
to console. Without PLX bridge kernel boots fine.

Sorry for the late reply...

No need to apologize, I appreciate your help very much.

I've tracked down the problem to the following code in pci_scan_bridge() in drivers/pci/probe.c:

        if (pcibios_assign_all_busses() || broken)
                /* Temporarily disable forwarding of the
                   configuration cycles on all bridges in
                   this bus segment to avoid possible
                   conflicts in the second pass between two
                   bridges programmed with overlapping
                   bus ranges. */
                pci_write_config_dword(dev, PCI_PRIMARY_BUS,
                                       buses & ~0xffffff);

If test for broken is removed, kernel boots fine, detects the bridge, but
does not detect the device behind the bridge. The same device plugged
directly into PCI slot is detected correctly.
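
For reference, the value written by that statement can be worked out in a small stand-alone sketch. It assumes the standard PCI-to-PCI bridge layout of the dword at PCI_PRIMARY_BUS: primary bus number in bits 0-7, secondary in bits 8-15, subordinate in bits 16-23, and the secondary latency timer in bits 24-31. The names BUS_NUMBERS_MASK and clear_bus_numbers() are made up for illustration and do not exist in the kernel.

```c
#include <stdint.h>

/* Covers the primary, secondary and subordinate bus number bytes. */
#define BUS_NUMBERS_MASK 0x00ffffffu

/*
 * Hypothetical helper: returns the value that the quoted
 * pci_write_config_dword() call would write. The three bus number
 * bytes are zeroed and the top byte (latency timer) is preserved.
 */
uint32_t clear_bus_numbers(uint32_t buses)
{
    return buses & ~BUS_NUMBERS_MASK;
}
```

With an invented register value of 0x40ff0100 (0x40 being an arbitrary latency timer), the helper returns 0x40000000, so the bridge ends up with all three bus numbers set to 0 and stops forwarding configuration cycles.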

So we would have a similar mismatch between the initial setup and the
kernel...  However, I don't quite see yet why the kernel trying to fix
it up breaks things, that will need a bit more debugging here...

Can you give it a quick try with adding something like :

 ppc_pci_add_flags(PPC_PCI_REASSIGN_ALL_BUS);

Near the end of ppc4xx_pci.c ? It looks like another case of reset
not actually resetting bridges (are we not properly doing a fundamental
reset ? Stefan what's your take there ?)

The above will cause busses to be re-assigned which is risky because it
will allow the kernel to assign numbers beyond the limits of what
ppc4xx_pci.c supports (see my comments in the thread you quoted).

The good thing is that we now have a working fixmap infrastructure, so
we could/should just move ppc4xx_pci.c to use that, and just always
re-assign busses.

To remind you, tests for broken were added by commit a1c19894b786f10c76ac40e93c6b5d70c9b946d2, and were intended to solve device detection problem behind PCI-E switches, as discussed in this thread:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2008-October/063939.html

PCI: Probing PCI hardware
pci_bus 0000:00: scanning bus
pci 0000:00:06.0: found [3388:0020] class 000604 header type 01
pci 0000:00:06.0: supports D1 D2
pci 0000:00:06.0: PME# supported from D0 D1 D2 D3hot
pci 0000:00:06.0: PME# disabled
pci_bus 0000:00: fixups for bus
pci 0000:00:06.0: scanning behind bridge, config 000000, pass 0
pci 0000:00:06.0: bus configuration invalid, reconfiguring

Ok so we hit a P2P bridge whose primary, secondary and subordinate bus
numbers are all 0, which is clearly unconfigured. I think this is the
root complex bridge

pci 0000:00:06.0: scanning behind bridge, config 000000, pass 1

Now this is when the bus should be reconfigured (pass 1). Sadly the code
doesn't print much debug.

Also from that point, it should renumber things and work...

pci_bus 0000:01: scanning bus

Which it does to some extent. It assigned bus number 1 to it afaik so we
now start looking below the RC bridge:

pci 0000:01:06.0: found [3388:0020] class 000604 header type 01

Hrm... class PCI bridge, vendor 3388 device 0020, is that your PLX ?
It's not the right vendor ID but maybe that's configurable by our OEM or
something...

The bridge and its evaluation board were manufactured by HiNT, later purchased by PLX.
3388:0020 is HiNT HB6 Universal PCI-PCI bridge in transparent mode.

pci 0000:01:06.0: supports D1 D2
pci 0000:01:06.0: PME# supported from D0 D1 D2 D3hot
pci 0000:01:06.0: PME# disabled
pci_bus 0000:01: fixups for bus
pci 0000:00:06.0: PCI bridge to [bus 01-ff]
pci 0000:00:06.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:06.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:06.0:   bridge window [mem 0x00000000-0x000fffff 64bit pref]
pci 0000:01:06.0: scanning behind bridge, config ff0100, pass 0

All right, that's where it gets interesting. It tries to scan behind the
bridge. It gets something it doesn't like. IE, it gets a secondary bus
number of 1 (what the heck ? I wonder what your firmware does) which
Linux is not happy about and decides to renumber it.
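
To make the "config ff0100" value easier to read, here is a minimal decoding sketch. It assumes the "config" field in the log is the low 24 bits of the PCI_PRIMARY_BUS dword, laid out as primary, secondary and subordinate bus number from the lowest byte up. The struct and function names are hypothetical and not part of the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the three bus numbers of a P2P bridge. */
struct bridge_buses {
    uint8_t primary;      /* bus the bridge sits on */
    uint8_t secondary;    /* bus directly behind the bridge */
    uint8_t subordinate;  /* highest bus number behind the bridge */
};

/* Split the low 24 bits of the bus number register into its bytes. */
struct bridge_buses decode_bus_config(uint32_t buses)
{
    struct bridge_buses b;

    b.primary     = buses & 0xff;
    b.secondary   = (buses >> 8) & 0xff;
    b.subordinate = (buses >> 16) & 0xff;
    return b;
}
```

Decoding 0xff0100 this way gives primary 0x00, secondary 0x01 and subordinate 0xff. The device is being scanned on bus 01, so a primary of 0 and a secondary of 1 do not match where the bridge actually sits.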

U-boot has problems with this bridge as well, so I had to completely disable PCI
support in u-boot to get linux running.

pci 0000:01:06.0: bus configuration invalid, reconfiguring

Now, that's where Linux should have written 000000 to the register,
which is what you commented out.

pci 0000:01:06.0: scanning behind bridge, config ff0100, pass 1
pci_bus 0000:01: bus scan returning with max=01
pci_bus 0000:00: bus scan returning with max=01

Because of that commenting out, it doesn't see the config as 000000 and
thus doesn't re-assign a bus number in pass 1, so from there you can't
see what's behind the bus.

So we have two things here:

 - It seems like the writing of 000000 to the register in pass 0 is
causing your crash. Can you verify that ? IE. Can you verify that it's
indeed crashing on this specific statement:

pci_write_config_dword(dev, PCI_PRIMARY_BUS,
                               buses & ~0xffffff);

When writing to the bridge, and that this seems to be causing a hard
reboot of the system ?

Yes, this particular statement causes a hard reboot. With the original broken tests
restored and the write to the bridge commented out, the system boots. If the write to
the bridge happens, I get a hard reset.

It might be useful to ask AMCC how that is possible in HW, ie what kind
of signal can be causing that. IE, even if the bridge is causing a PCIe
error, that should not cause a reboot ... right ?

Feng, can you please comment on this ?

 - You can test a quick hack workaround which consists of changing:

    /* Check if setup is sensible at all */
-    if (!pass &&
+    if (1 &&
         ((buses & 0xff) != bus->number || ((buses >> 8) & 0xff) <= bus->number)) {
        dev_dbg(&dev->dev, "bus configuration invalid, reconfiguring\n");
        broken = 1;
    }

In -addition- to your commenting out of the broken test. This will cause the second pass to go through the re-assign code path despite the fact that you
have not written 000000 to the bus numbers.
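
The effect of that hack can be sketched outside the kernel as follows. This is only an illustration of the condition quoted above, not the kernel code itself: the function name and the check_every_pass flag are invented, with check_every_pass = 0 standing for the stock `!pass &&` test and check_every_pass = 1 for the `1 &&` variant.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the sanity check in
 * pci_scan_bridge(). Returns 1 when the bridge's bus numbers
 * would be flagged as broken, 0 otherwise.
 */
int bus_setup_looks_broken(int pass, int check_every_pass,
                           uint32_t buses, uint8_t bus_number)
{
    uint8_t primary   = buses & 0xff;
    uint8_t secondary = (buses >> 8) & 0xff;

    /* The stock check only runs in pass 0. */
    if (!check_every_pass && pass != 0)
        return 0;

    /* Primary must match our bus, secondary must lie beyond it. */
    return primary != bus_number || secondary <= bus_number;
}
```

For the bridge seen on bus 01 with config ff0100, the stock check flags it in pass 0 but not in pass 1, while the modified check flags it in both passes. That is why the hack lets pass 1 take the re-assign path even though the register was never cleared.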

With this change and commented out broken test I still get hard reset.

I didn't try adding ppc_pci_add_flags(PPC_PCI_REASSIGN_ALL_BUS)
If you still want me to try this, please let me know. Should I leave broken
tests enabled in that case ?

Thanks a lot for your help.

Felix.

I now have a custom board with 460EX and the same PLX bridge, running 2.6.23-rc3.
Things look better here, as u-boot is now able to properly detect the PLX and the
device behind it, but the kernel still has problems. First, I'm still getting a hard reset on

pci_write_config_dword(dev, PCI_PRIMARY_BUS,
                              buses & ~0xffffff);

If this line is removed, PLX is detected twice, see below. I also get hard reset
if pass test is modified as you requested and broken test removed.

Any ideas how to fix this ? I was suspecting PLX evaluation board, but
PLX on our custom board seems to be OK, so it looks like kernel needs fixing.

PCI: Probing PCI hardware
pci_bus 0000:00: scanning bus
pci 0000:00:02.0: found [3388:0020] class 000604 header type 01
pci 0000:00:02.0: calling pcibios_fixup_resources+0x0/0xf4
pci 0000:00:02.0: calling fixup_ppc4xx_pci_bridge+0x0/0x154
pci 0000:00:02.0: calling quirk_resource_alignment+0x0/0x200
pci 0000:00:02.0: supports D1 D2
pci 0000:00:02.0: PME# supported from D0 D1 D2 D3hot
pci 0000:00:02.0: PME# disabled
pci_bus 0000:00: fixups for bus
pci 0000:00:02.0: scanning behind bridge, config 010100, pass 0
pci_bus 0000:01: scanning bus
pci 0000:01:02.0: found [3388:0020] class 000604 header type 01
pci 0000:01:02.0: calling pcibios_fixup_resources+0x0/0xf4
pci 0000:01:02.0: calling fixup_ppc4xx_pci_bridge+0x0/0x154
pci 0000:01:02.0: calling quirk_resource_alignment+0x0/0x200
pci 0000:01:02.0: supports D1 D2
pci 0000:01:02.0: PME# supported from D0 D1 D2 D3hot
pci 0000:01:02.0: PME# disabled
pci_bus 0000:01: fixups for bus
pci 0000:00:02.0: PCI bridge to [bus 01-01]
pci 0000:01:02.0: scanning behind bridge, config 010100, pass 0
pci 0000:01:02.0: bus configuration invalid, reconfiguring
pci 0000:01:02.0: scanning behind bridge, config 010100, pass 1
pci_bus 0000:01: bus scan returning with max=01
pci 0000:00:02.0: scanning behind bridge, config 010100, pass 1
pci_bus 0000:00: bus scan returning with max=01
pci 0000:00:02.0: PCI bridge to [bus 01-01]
pci 0000:00:02.0:   bridge window [io  disabled]
pci 0000:00:02.0:   bridge window [mem disabled]
pci 0000:00:02.0:   bridge window [mem pref disabled]
pci_bus 0000:00: resource 0 [io  0x0000-0xffff]
pci_bus 0000:00: resource 1 [mem 0xd80000000-0xdffffffff]

Thanks.

Felix.

