> Date: Mon, 10 Sep 2012 13:22:35 +0200
> From: Christian Ehrhardt <[email protected]>
> 
> Hi,
> 
> first of all: Thanks for your reply!
> 
> On Sat, Sep 08, 2012 at 04:04:41PM +0200, Mark Kettenis wrote:
> > > most modern x86 hardware includes more than one PCI root segement.
> > > E.g. a hardware that I have here has three PCI root segemnts with bus
> > > numbers 0, 0x7f, 0x80 and 0xff respectively. (0x7f and 0xff host the
> > > uncore devices of each processor).
> > 
> > Well, machines with multiple host bridges have been around for a
> > while.  Some of them even pre-date ACPI.
> > 
> > > 
> > > These segments are already detected by APCI but this information is not
> > > used when attaching PCI busses to the mainbus. Below is a patch that
> > > should solve the PCI bus detection problem in a robust way.
> > 
> > I hate ACPI!  It's full of lies; I don't trust it.
> 
> Agreed.
> 
> > It is much better
> > to detect these additional PCI busses by looking at actual hardware
> > registers.
> 
> This approach (unfortunately) means that device detection breaks
> with each new chipset and I hate that :-)

Yeah, it does seem that Intel comes up with a different "solution" in
every CPU/chipset generation :(.

> > So before we go down this path, can you investigate that?
> > It seems the bus number for the additional "Uncore" devices of the
> > Xeon E5 1600/2600 CPUs is available in config register 0x108 of device
> > 0:5:0.
> 
> Yes, I already figured that. It works and I can cook a patch if you want.
> The patch is not completely straight forward, tough, because the device is
> not just a bridge but a multi-purpose devices.
> 
> > But I haven't figured how to find the bus number for the
> > second socket.  Looking at the AML for this machine and determining
> > how it detects these additional busses might provide some clues.  Feel
> > free to send me the acpidump output for this box.
> 
> I will send you ACPI data in private. However, the second PCI bus is
> simply "hardwired" in DSDT, i.e. there does not seem to be code in ACPI that
> would try to detect its presence.
> 
> I tend to believe that there is no bridge device that exposes the
> second PCI bus.

It seems so.  I haven't really found any hints in the CPU
documentation either.  It's probaby in some super-sekrit BIOS
developers guide that's only available under NDA :(.

Seems we have no choice other than relying on acpi.

> > > There's a small problem though: Some of the secondary PCI segments
> > > can show up as busses behind some kind of host bridge device dowstream
> > > of pci bus 0, too. These PCI busses would be attached twice, now. Thus we
> > > keep track of attached root segemnents in the ACPI code.
> > 
> > Yup.  Since we have code to detect additional host bridges some of
> > them may already have been attached.  And it would be a good way to
> > defend against ACPI lying to us.  However, please keep the ACPI hooks
> > out of the MI PCI code.  It'd be better if you used pci_attach_hook(),
> > which lives in arch/pci/pci_machdep.c.
> 
> Ok, will do that.
> 
> However, there might be a different solution:
> We could track attached busses (not just host bridge busses but all busses)
> in the pci code (or via MD code and hooks but I think that would be a
> generic thing) and do nothing in pciattach if a bus is discovered for
> the second time.
> 
> With this MD code could use whatever methods it wants to detect otherwise
> undiscovered host bridges. Possible methods might be magic numbers, hints
> from ACPI or even a linear scan of all possible bus numbers.
> 
> Does this sound like a possible solution?

It actually got me thinking about proper bus number resource
accounting again.  Something that's needed for properly hot-plugging
pci busses.  Below is a diff that does this.

With that code, your acpi code can just try to grab the bus number
using extent_alloc_region() and only attach a bus if that succeeds.


Index: dev/pci/pci.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/pci.c,v
retrieving revision 1.95
diff -u -p -r1.95 pci.c
--- dev/pci/pci.c       7 Sep 2012 19:26:48 -0000       1.95
+++ dev/pci/pci.c       13 Sep 2012 20:47:25 -0000
@@ -175,6 +175,7 @@ pciattach(struct device *parent, struct 
        sc->sc_ioex = pba->pba_ioex;
        sc->sc_memex = pba->pba_memex;
        sc->sc_pmemex = pba->pba_pmemex;
+       sc->sc_busex = pba->pba_busex;
        sc->sc_domain = pba->pba_domain;
        sc->sc_bus = pba->pba_bus;
        sc->sc_bridgetag = pba->pba_bridgetag;
@@ -182,7 +183,15 @@ pciattach(struct device *parent, struct 
        sc->sc_maxndevs = pci_bus_maxdevs(pba->pba_pc, pba->pba_bus);
        sc->sc_intrswiz = pba->pba_intrswiz;
        sc->sc_intrtag = pba->pba_intrtag;
+
+       /* Reserve our own bus number. */
+       if (sc->sc_busex)
+               extent_alloc_region(sc->sc_busex, sc->sc_bus, 1, EX_NOWAIT);
+
        pci_enumerate_bus(sc, pci_reserve_resources, NULL);
+       if (sc->sc_busex)
+               extent_print(sc->sc_busex);
+
        pci_enumerate_bus(sc, pci_count_vga, NULL);
        if (pci_enumerate_bus(sc, pci_primary_vga, NULL))
                pci_vga_pci = sc;
@@ -401,6 +410,7 @@ pci_probe_device(struct pci_softc *sc, p
        pa.pa_ioex = sc->sc_ioex;
        pa.pa_memex = sc->sc_memex;
        pa.pa_pmemex = sc->sc_pmemex;
+       pa.pa_busex = sc->sc_busex;
        pa.pa_domain = sc->sc_domain;
        pa.pa_bus = bus;
        pa.pa_device = device;
@@ -749,12 +759,16 @@ pci_reserve_resources(struct pci_attach_
 {
        pci_chipset_tag_t pc = pa->pa_pc;
        pcitag_t tag = pa->pa_tag;
-       pcireg_t bhlc, blr, type;
+       pcireg_t bhlc, blr, type, bir;
        bus_addr_t base, limit;
        bus_size_t size;
        int reg, reg_start, reg_end;
+       int bus, dev, func;
+       int sec, sub;
        int flags;
 
+       pci_decompose_tag(pc, tag, &bus, &dev, &func);
+
        bhlc = pci_conf_read(pc, tag, PCI_BHLC_REG);
        switch (PCI_HDRTYPE_TYPE(bhlc)) {
        case 0:
@@ -795,8 +809,8 @@ pci_reserve_resources(struct pci_attach_
 #endif
                        if (pa->pa_memex && extent_alloc_region(pa->pa_memex,
                            base, size, EX_NOWAIT)) {
-                               printf("mem address conflict 0x%x/0x%x\n",
-                                   base, size);
+                               printf("%d:%d:%d: mem address conflict 
0x%x/0x%x\n",
+                                   bus, dev, func, base, size);
                                pci_conf_write(pc, tag, reg, 0);
                                if (type & PCI_MAPREG_MEM_TYPE_64BIT)
                                        pci_conf_write(pc, tag, reg + 4, 0);
@@ -805,8 +819,8 @@ pci_reserve_resources(struct pci_attach_
                case PCI_MAPREG_TYPE_IO:
                        if (pa->pa_ioex && extent_alloc_region(pa->pa_ioex,
                            base, size, EX_NOWAIT)) {
-                               printf("io address conflict 0x%x/0x%x\n",
-                                   base, size);
+                               printf("%d:%d:%d: io address conflict 
0x%x/0x%x\n",
+                                   bus, dev, func, base, size);
                                pci_conf_write(pc, tag, reg, 0);
                        }
                        break;
@@ -832,8 +846,8 @@ pci_reserve_resources(struct pci_attach_
                size = 0;
        if (pa->pa_ioex && base > 0 && size > 0) {
                if (extent_alloc_region(pa->pa_ioex, base, size, EX_NOWAIT)) {
-                       printf("bridge io address conflict 0x%x/0x%x\n",
-                              base, size);
+                       printf("%d:%d:%d: bridge io address conflict 
0x%x/0x%x\n",
+                           bus, dev, func, base, size);
                        blr &= 0xffff0000;
                        blr |= 0x000000f0;
                        pci_conf_write(pc, tag, PPB_REG_IOSTATUS, blr);
@@ -850,8 +864,8 @@ pci_reserve_resources(struct pci_attach_
                size = 0;
        if (pa->pa_memex && base > 0 && size > 0) {
                if (extent_alloc_region(pa->pa_memex, base, size, EX_NOWAIT)) {
-                       printf("bridge mem address conflict 0x%x/0x%x\n",
-                              base, size);
+                       printf("%d:%d:%d: bridge mem address conflict 
0x%x/0x%x\n",
+                           bus, dev, func, base, size);
                        pci_conf_write(pc, tag, PPB_REG_MEM, 0x0000fff0);
                }
        }
@@ -866,15 +880,27 @@ pci_reserve_resources(struct pci_attach_
                size = 0;
        if (pa->pa_pmemex && base > 0 && size > 0) {
                if (extent_alloc_region(pa->pa_pmemex, base, size, EX_NOWAIT)) {
-                       printf("bridge mem address conflict 0x%x/0x%x\n",
-                              base, size);
+                       printf("%d:%d:%d: bridge mem address conflict 
0x%x/0x%x\n",
+                           bus, dev, func, base, size);
                        pci_conf_write(pc, tag, PPB_REG_PREFMEM, 0x0000fff0);
                }
        } else if (pa->pa_memex && base > 0 && size > 0) {
                if (extent_alloc_region(pa->pa_memex, base, size, EX_NOWAIT)) {
-                       printf("bridge mem address conflict 0x%x/0x%x\n",
-                              base, size);
+                       printf("%d:%d:%d: bridge mem address conflict 
0x%x/0x%x\n",
+                           bus, dev, func, base, size);
                        pci_conf_write(pc, tag, PPB_REG_PREFMEM, 0x0000fff0);
+               }
+       }
+
+       /* Figure out the bus range handled by the bridge. */
+       bir = pci_conf_read(pc, tag, PPB_REG_BUSINFO);
+       sec = PPB_BUSINFO_SECONDARY(bir);
+       sub = PPB_BUSINFO_SUBORDINATE(bir);
+       if (pa->pa_busex && sub >= sec) {
+               if (extent_alloc_region(pa->pa_busex, sec, sub - sec + 1,
+                   EX_NOWAIT)) {
+                       printf("%d:%d:%d: bridge bus conflict %d-%d\n",
+                           bus, dev, func, sec, sub);
                }
        }
 
Index: dev/pci/pcivar.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/pcivar.h,v
retrieving revision 1.66
diff -u -p -r1.66 pcivar.h
--- dev/pci/pcivar.h    30 May 2011 19:09:46 -0000      1.66
+++ dev/pci/pcivar.h    13 Sep 2012 20:47:25 -0000
@@ -95,6 +95,7 @@ struct pcibus_attach_args {
        struct extent   *pba_ioex;
        struct extent   *pba_memex;
        struct extent   *pba_pmemex;
+       struct extent   *pba_busex;
 
        int             pba_domain;     /* PCI domain */
        int             pba_bus;        /* PCI bus number */
@@ -127,6 +128,7 @@ struct pci_attach_args {
        struct extent   *pa_ioex;
        struct extent   *pa_memex;
        struct extent   *pa_pmemex;
+       struct extent   *pa_busex;
 
        u_int           pa_domain;
        u_int           pa_bus;
@@ -187,6 +189,7 @@ struct pci_softc {
        struct extent *sc_ioex;
        struct extent *sc_memex;
        struct extent *sc_pmemex;
+       struct extent *sc_busex;
        LIST_HEAD(, pci_dev) sc_devs;
        int sc_domain, sc_bus, sc_maxndevs;
        pcitag_t *sc_bridgetag;
Index: arch/i386/i386/mainbus.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/mainbus.c,v
retrieving revision 1.48
diff -u -p -r1.48 mainbus.c
--- arch/i386/i386/mainbus.c    3 Nov 2010 10:15:23 -0000       1.48
+++ arch/i386/i386/mainbus.c    13 Sep 2012 20:47:25 -0000
@@ -241,6 +241,7 @@ mainbus_attach(struct device *parent, st
                mba.mba_pba.pba_dmat = &pci_bus_dma_tag;
                mba.mba_pba.pba_ioex = pciio_ex;
                mba.mba_pba.pba_memex = pcimem_ex;
+               mba.mba_pba.pba_busex = pcibus_ex;
                mba.mba_pba.pba_domain = pci_ndomains++;
                mba.mba_pba.pba_bus = 0;
                config_found(self, &mba.mba_pba, mainbus_print);
Index: arch/i386/pci/pci_machdep.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/pci/pci_machdep.c,v
retrieving revision 1.69
diff -u -p -r1.69 pci_machdep.c
--- arch/i386/pci/pci_machdep.c 7 Sep 2012 19:23:53 -0000       1.69
+++ arch/i386/pci/pci_machdep.c 13 Sep 2012 20:47:25 -0000
@@ -870,6 +870,7 @@ pci_intr_disestablish(pci_chipset_tag_t 
 
 struct extent *pciio_ex;
 struct extent *pcimem_ex;
+struct extent *pcibus_ex;
 
 void
 pci_init_extents(void)
@@ -920,6 +921,11 @@ pci_init_extents(void)
                /* Take out the video buffer area and BIOS areas. */
                extent_alloc_region(pcimem_ex, IOM_BEGIN, IOM_SIZE,
                    EX_CONFLICTOK | EX_NOWAIT);
+       }
+
+       if (pcibus_ex == NULL) {
+               pcibus_ex = extent_create("pcibus", 0, 0xff, M_DEVBUF,
+                   NULL, 0, EX_NOWAIT);
        }
 }
 
Index: arch/i386/pci/pci_machdep.h
===================================================================
RCS file: /cvs/src/sys/arch/i386/pci/pci_machdep.h,v
retrieving revision 1.24
diff -u -p -r1.24 pci_machdep.h
--- arch/i386/pci/pci_machdep.h 7 Sep 2012 19:23:53 -0000       1.24
+++ arch/i386/pci/pci_machdep.h 13 Sep 2012 20:47:25 -0000
@@ -82,6 +82,7 @@ int           pci_mode_detect(void);
 
 extern struct extent *pciio_ex;
 extern struct extent *pcimem_ex;
+extern struct extent *pcibus_ex;
 void           pci_init_extents(void);
 
 /*

Reply via email to