> Date: Mon, 10 Sep 2012 13:22:35 +0200
> From: Christian Ehrhardt <[email protected]>
>
> Hi,
>
> first of all: Thanks for your reply!
>
> On Sat, Sep 08, 2012 at 04:04:41PM +0200, Mark Kettenis wrote:
> > > most modern x86 hardware includes more than one PCI root segement.
> > > E.g. a hardware that I have here has three PCI root segemnts with bus
> > > numbers 0, 0x7f, 0x80 and 0xff respectively. (0x7f and 0xff host the
> > > uncore devices of each processor).
> >
> > Well, machines with multiple host bridges have been around for a
> > while. Some of them even pre-date ACPI.
> >
> > >
> > > These segments are already detected by APCI but this information is not
> > > used when attaching PCI busses to the mainbus. Below is a patch that
> > > should solve the PCI bus detection problem in a robust way.
> >
> > I hate ACPI! It's full of lies; I don't trust it.
>
> Agreed.
>
> > It is much better
> > to detect these additional PCI busses by looking at actual hardware
> > registers.
>
> This approach (unfortunately) means that device detection breaks
> with each new chipset and I hate that :-)
Yeah, it does seem that Intel comes up with a different "solution" in
every CPU/chipset generation :(.
> > So before we go down this path, can you investigate that?
> > It seems the bus number for the additional "Uncore" devices of the
> > Xeon E5 1600/2600 CPUs is available in config register 0x108 of device
> > 0:5:0.
>
> Yes, I already figured that. It works and I can cook a patch if you want.
> The patch is not completely straight forward, tough, because the device is
> not just a bridge but a multi-purpose devices.
>
> > But I haven't figured how to find the bus number for the
> > second socket. Looking at the AML for this machine and determining
> > how it detects these additional busses might provide some clues. Feel
> > free to send me the acpidump output for this box.
>
> I will send you ACPI data in private. However, the second PCI bus is
> simply "hardwired" in DSDT, i.e. there does not seem to be code in ACPI that
> would try to detect its presence.
>
> I tend to believe that there is no bridge device that exposes the
> second PCI bus.
It seems so. I haven't really found any hints in the CPU
documentation either. It's probaby in some super-sekrit BIOS
developers guide that's only available under NDA :(.
Seems we have no choice other than relying on acpi.
> > > There's a small problem though: Some of the secondary PCI segments
> > > can show up as busses behind some kind of host bridge device dowstream
> > > of pci bus 0, too. These PCI busses would be attached twice, now. Thus we
> > > keep track of attached root segemnents in the ACPI code.
> >
> > Yup. Since we have code to detect additional host bridges some of
> > them may already have been attached. And it would be a good way to
> > defend against ACPI lying to us. However, please keep the ACPI hooks
> > out of the MI PCI code. It'd be better if you used pci_attach_hook(),
> > which lives in arch/pci/pci_machdep.c.
>
> Ok, will do that.
>
> However, there might be a different solution:
> We could track attached busses (not just host bridge busses but all busses)
> in the pci code (or via MD code and hooks but I think that would be a
> generic thing) and do nothing in pciattach if a bus is discovered for
> the second time.
>
> With this MD code could use whatever methods it wants to detect otherwise
> undiscovered host bridges. Possible methods might be magic numbers, hints
> from ACPI or even a linear scan of all possible bus numbers.
>
> Does this sound like a possible solution?
It actually got me thinking about proper bus number resource
accounting again. Something that's needed for properly hot-plugging
pci busses. Below is a diff that does this.
With that code, your acpi code can just try to grab the bus number
using extent_alloc_region() and only attach a bus if that succeeds.
Index: dev/pci/pci.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/pci.c,v
retrieving revision 1.95
diff -u -p -r1.95 pci.c
--- dev/pci/pci.c 7 Sep 2012 19:26:48 -0000 1.95
+++ dev/pci/pci.c 13 Sep 2012 20:47:25 -0000
@@ -175,6 +175,7 @@ pciattach(struct device *parent, struct
sc->sc_ioex = pba->pba_ioex;
sc->sc_memex = pba->pba_memex;
sc->sc_pmemex = pba->pba_pmemex;
+ sc->sc_busex = pba->pba_busex;
sc->sc_domain = pba->pba_domain;
sc->sc_bus = pba->pba_bus;
sc->sc_bridgetag = pba->pba_bridgetag;
@@ -182,7 +183,15 @@ pciattach(struct device *parent, struct
sc->sc_maxndevs = pci_bus_maxdevs(pba->pba_pc, pba->pba_bus);
sc->sc_intrswiz = pba->pba_intrswiz;
sc->sc_intrtag = pba->pba_intrtag;
+
+ /* Reserve our own bus number. */
+ if (sc->sc_busex)
+ extent_alloc_region(sc->sc_busex, sc->sc_bus, 1, EX_NOWAIT);
+
pci_enumerate_bus(sc, pci_reserve_resources, NULL);
+ if (sc->sc_busex)
+ extent_print(sc->sc_busex);
+
pci_enumerate_bus(sc, pci_count_vga, NULL);
if (pci_enumerate_bus(sc, pci_primary_vga, NULL))
pci_vga_pci = sc;
@@ -401,6 +410,7 @@ pci_probe_device(struct pci_softc *sc, p
pa.pa_ioex = sc->sc_ioex;
pa.pa_memex = sc->sc_memex;
pa.pa_pmemex = sc->sc_pmemex;
+ pa.pa_busex = sc->sc_busex;
pa.pa_domain = sc->sc_domain;
pa.pa_bus = bus;
pa.pa_device = device;
@@ -749,12 +759,16 @@ pci_reserve_resources(struct pci_attach_
{
pci_chipset_tag_t pc = pa->pa_pc;
pcitag_t tag = pa->pa_tag;
- pcireg_t bhlc, blr, type;
+ pcireg_t bhlc, blr, type, bir;
bus_addr_t base, limit;
bus_size_t size;
int reg, reg_start, reg_end;
+ int bus, dev, func;
+ int sec, sub;
int flags;
+ pci_decompose_tag(pc, tag, &bus, &dev, &func);
+
bhlc = pci_conf_read(pc, tag, PCI_BHLC_REG);
switch (PCI_HDRTYPE_TYPE(bhlc)) {
case 0:
@@ -795,8 +809,8 @@ pci_reserve_resources(struct pci_attach_
#endif
if (pa->pa_memex && extent_alloc_region(pa->pa_memex,
base, size, EX_NOWAIT)) {
- printf("mem address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: mem address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
pci_conf_write(pc, tag, reg, 0);
if (type & PCI_MAPREG_MEM_TYPE_64BIT)
pci_conf_write(pc, tag, reg + 4, 0);
@@ -805,8 +819,8 @@ pci_reserve_resources(struct pci_attach_
case PCI_MAPREG_TYPE_IO:
if (pa->pa_ioex && extent_alloc_region(pa->pa_ioex,
base, size, EX_NOWAIT)) {
- printf("io address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: io address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
pci_conf_write(pc, tag, reg, 0);
}
break;
@@ -832,8 +846,8 @@ pci_reserve_resources(struct pci_attach_
size = 0;
if (pa->pa_ioex && base > 0 && size > 0) {
if (extent_alloc_region(pa->pa_ioex, base, size, EX_NOWAIT)) {
- printf("bridge io address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: bridge io address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
blr &= 0xffff0000;
blr |= 0x000000f0;
pci_conf_write(pc, tag, PPB_REG_IOSTATUS, blr);
@@ -850,8 +864,8 @@ pci_reserve_resources(struct pci_attach_
size = 0;
if (pa->pa_memex && base > 0 && size > 0) {
if (extent_alloc_region(pa->pa_memex, base, size, EX_NOWAIT)) {
- printf("bridge mem address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: bridge mem address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
pci_conf_write(pc, tag, PPB_REG_MEM, 0x0000fff0);
}
}
@@ -866,15 +880,27 @@ pci_reserve_resources(struct pci_attach_
size = 0;
if (pa->pa_pmemex && base > 0 && size > 0) {
if (extent_alloc_region(pa->pa_pmemex, base, size, EX_NOWAIT)) {
- printf("bridge mem address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: bridge mem address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
pci_conf_write(pc, tag, PPB_REG_PREFMEM, 0x0000fff0);
}
} else if (pa->pa_memex && base > 0 && size > 0) {
if (extent_alloc_region(pa->pa_memex, base, size, EX_NOWAIT)) {
- printf("bridge mem address conflict 0x%x/0x%x\n",
- base, size);
+ printf("%d:%d:%d: bridge mem address conflict
0x%x/0x%x\n",
+ bus, dev, func, base, size);
pci_conf_write(pc, tag, PPB_REG_PREFMEM, 0x0000fff0);
+ }
+ }
+
+ /* Figure out the bus range handled by the bridge. */
+ bir = pci_conf_read(pc, tag, PPB_REG_BUSINFO);
+ sec = PPB_BUSINFO_SECONDARY(bir);
+ sub = PPB_BUSINFO_SUBORDINATE(bir);
+ if (pa->pa_busex && sub >= sec) {
+ if (extent_alloc_region(pa->pa_busex, sec, sub - sec + 1,
+ EX_NOWAIT)) {
+ printf("%d:%d:%d: bridge bus conflict %d-%d\n",
+ bus, dev, func, sec, sub);
}
}
Index: dev/pci/pcivar.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/pcivar.h,v
retrieving revision 1.66
diff -u -p -r1.66 pcivar.h
--- dev/pci/pcivar.h 30 May 2011 19:09:46 -0000 1.66
+++ dev/pci/pcivar.h 13 Sep 2012 20:47:25 -0000
@@ -95,6 +95,7 @@ struct pcibus_attach_args {
struct extent *pba_ioex;
struct extent *pba_memex;
struct extent *pba_pmemex;
+ struct extent *pba_busex;
int pba_domain; /* PCI domain */
int pba_bus; /* PCI bus number */
@@ -127,6 +128,7 @@ struct pci_attach_args {
struct extent *pa_ioex;
struct extent *pa_memex;
struct extent *pa_pmemex;
+ struct extent *pa_busex;
u_int pa_domain;
u_int pa_bus;
@@ -187,6 +189,7 @@ struct pci_softc {
struct extent *sc_ioex;
struct extent *sc_memex;
struct extent *sc_pmemex;
+ struct extent *sc_busex;
LIST_HEAD(, pci_dev) sc_devs;
int sc_domain, sc_bus, sc_maxndevs;
pcitag_t *sc_bridgetag;
Index: arch/i386/i386/mainbus.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/mainbus.c,v
retrieving revision 1.48
diff -u -p -r1.48 mainbus.c
--- arch/i386/i386/mainbus.c 3 Nov 2010 10:15:23 -0000 1.48
+++ arch/i386/i386/mainbus.c 13 Sep 2012 20:47:25 -0000
@@ -241,6 +241,7 @@ mainbus_attach(struct device *parent, st
mba.mba_pba.pba_dmat = &pci_bus_dma_tag;
mba.mba_pba.pba_ioex = pciio_ex;
mba.mba_pba.pba_memex = pcimem_ex;
+ mba.mba_pba.pba_busex = pcibus_ex;
mba.mba_pba.pba_domain = pci_ndomains++;
mba.mba_pba.pba_bus = 0;
config_found(self, &mba.mba_pba, mainbus_print);
Index: arch/i386/pci/pci_machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/pci/pci_machdep.c,v
retrieving revision 1.69
diff -u -p -r1.69 pci_machdep.c
--- arch/i386/pci/pci_machdep.c 7 Sep 2012 19:23:53 -0000 1.69
+++ arch/i386/pci/pci_machdep.c 13 Sep 2012 20:47:25 -0000
@@ -870,6 +870,7 @@ pci_intr_disestablish(pci_chipset_tag_t
struct extent *pciio_ex;
struct extent *pcimem_ex;
+struct extent *pcibus_ex;
void
pci_init_extents(void)
@@ -920,6 +921,11 @@ pci_init_extents(void)
/* Take out the video buffer area and BIOS areas. */
extent_alloc_region(pcimem_ex, IOM_BEGIN, IOM_SIZE,
EX_CONFLICTOK | EX_NOWAIT);
+ }
+
+ if (pcibus_ex == NULL) {
+ pcibus_ex = extent_create("pcibus", 0, 0xff, M_DEVBUF,
+ NULL, 0, EX_NOWAIT);
}
}
Index: arch/i386/pci/pci_machdep.h
===================================================================
RCS file: /cvs/src/sys/arch/i386/pci/pci_machdep.h,v
retrieving revision 1.24
diff -u -p -r1.24 pci_machdep.h
--- arch/i386/pci/pci_machdep.h 7 Sep 2012 19:23:53 -0000 1.24
+++ arch/i386/pci/pci_machdep.h 13 Sep 2012 20:47:25 -0000
@@ -82,6 +82,7 @@ int pci_mode_detect(void);
extern struct extent *pciio_ex;
extern struct extent *pcimem_ex;
+extern struct extent *pcibus_ex;
void pci_init_extents(void);
/*