Greetings!

We've been receiving this error for a while on our 64-core Interlagos AMD machines:

****************************************************************************
* hwloc has encountered what looks like an error from the operating system.
*
* Socket (P#2 cpuset 0x0000ffff,0x0) intersects with NUMANode (P#3 cpuset 0x0000ff00,0xff000000) without inclusion!
* Error occurred in topology.c line 940
*
* Please report this error message to the hwloc user's mailing list,
* along with the output+tarball generated by the hwloc-gather-topology script.
****************************************************************************

I found some posts in the hwloc list archives saying this is caused by a buggy AMD platform and that the impact should be limited to hwloc missing L3 cache info (thanks Brice). If that's the case and the processor representation is otherwise correct, then I am sure we can live with it, but I still wanted to check with the list: (1) is this really harmless, and (2) are there any known solutions other than upgrading the BIOS/kernel?
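
To make the error message easier to read, here is a minimal Python sketch that decodes the two cpusets quoted above and shows where they overlap. It assumes the strings are comma-separated 32-bit hex words with the most significant word first, and it does not handle infinite bitmaps. The helper names parse_cpuset and intersects_without_inclusion are my own, not part of hwloc.

```python
def parse_cpuset(text):
    """Decode a finite hwloc-style bitmap string into a set of PU indexes.

    Assumption: comma-separated 32-bit hex words, most significant first,
    e.g. "0x0000ffff,0x0" means bits 32..47 are set.
    """
    mask = 0
    for word in text.split(","):
        mask = (mask << 32) | int(word, 16)
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def intersects_without_inclusion(a, b):
    """True when the sets overlap but neither one contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


# Values copied from the error message above.
socket_p2 = parse_cpuset("0x0000ffff,0x0")
numa_p3 = parse_cpuset("0x0000ff00,0xff000000")

print("Socket P#2 PUs:  ", sorted(socket_p2))
print("NUMANode P#3 PUs:", sorted(numa_p3))
print("shared PUs:      ", sorted(socket_p2 & numa_p3))
print("conflict:        ", intersects_without_inclusion(socket_p2, numa_p3))
```

Decoded this way, Socket P#2 covers PUs 32-47 while NUMANode P#3 covers PUs 24-31 and 40-47, so they share only PUs 40-47. That matches the lstopo output below, where NUMANode P#3 holds PUs 24-31 and 40-47.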

The hwloc-gather-topology output is also attached.

Our schedulers (Torque/Moab) and MPI stacks rely heavily on hwloc, and I need to make sure this is not a critical issue, so any suggestions will help.

Thank you!
-Mehmet


Machine (P#0 total=134199212KB DMIProductName="Altus 1804i" DMIProductVersion=" " DMIProductSerial=P1724391 DMIProductUUID=1AA86536-A5AE-E211-9A76-EE604CC27BCE DMIBoardVendor=Supermicro DMIBoardName=H8QG6 DMIBoardVersion=1234567890 DMIBoardSerial=WM2BS70921 DMIBoardAssetTag=" " DMIChassisVendor=Supermicro DMIChassisType=17 DMIChassisVersion=1234567890 DMIChassisSerial=1234567890. DMIChassisAssetTag=" " DMIBIOSVendor="American Megatrends Inc." DMIBIOSVersion="3.0b      " DMIBIOSDate=02/01/2013 DMISysVendor="Penguin Computing" Backend=Linux LinuxCgroup=/)
  Group0 L#0 (total=67106732KB)
    NUMANode L#0 (P#1 local=33552300KB total=33552300KB)
      Socket L#0 (P#0 CPUModel="AMD Opteron(tm) Processor 6378                 ")
        L3Cache L#0 (size=6144KB linesize=64 ways=64)
          L2Cache L#0 (size=2048KB linesize=64 ways=16)
            L1iCache L#0 (size=64KB linesize=64 ways=2)
              L1dCache L#0 (size=16KB linesize=64 ways=4)
                Core L#0 (P#0)
                  PU L#0 (P#0)
              L1dCache L#1 (size=16KB linesize=64 ways=4)
                Core L#1 (P#1)
                  PU L#1 (P#1)
          L2Cache L#1 (size=2048KB linesize=64 ways=16)
            L1iCache L#1 (size=64KB linesize=64 ways=2)
              L1dCache L#2 (size=16KB linesize=64 ways=4)
                Core L#2 (P#2)
                  PU L#2 (P#2)
              L1dCache L#3 (size=16KB linesize=64 ways=4)
                Core L#3 (P#3)
                  PU L#3 (P#3)
          L2Cache L#2 (size=2048KB linesize=64 ways=16)
            L1iCache L#2 (size=64KB linesize=64 ways=2)
              L1dCache L#4 (size=16KB linesize=64 ways=4)
                Core L#4 (P#4)
                  PU L#4 (P#4)
              L1dCache L#5 (size=16KB linesize=64 ways=4)
                Core L#5 (P#5)
                  PU L#5 (P#5)
          L2Cache L#3 (size=2048KB linesize=64 ways=16)
            L1iCache L#3 (size=64KB linesize=64 ways=2)
              L1dCache L#6 (size=16KB linesize=64 ways=4)
                Core L#6 (P#6)
                  PU L#6 (P#6)
              L1dCache L#7 (size=16KB linesize=64 ways=4)
                Core L#7 (P#7)
                  PU L#7 (P#7)
        L3Cache L#1 (size=6144KB linesize=64 ways=64)
          L2Cache L#4 (size=2048KB linesize=64 ways=16)
            L1iCache L#4 (size=64KB linesize=64 ways=2)
              L1dCache L#8 (size=16KB linesize=64 ways=4)
                Core L#8 (P#0)
                  PU L#8 (P#8)
              L1dCache L#9 (size=16KB linesize=64 ways=4)
                Core L#9 (P#1)
                  PU L#9 (P#9)
          L2Cache L#5 (size=2048KB linesize=64 ways=16)
            L1iCache L#5 (size=64KB linesize=64 ways=2)
              L1dCache L#10 (size=16KB linesize=64 ways=4)
                Core L#10 (P#2)
                  PU L#10 (P#10)
              L1dCache L#11 (size=16KB linesize=64 ways=4)
                Core L#11 (P#3)
                  PU L#11 (P#11)
          L2Cache L#6 (size=2048KB linesize=64 ways=16)
            L1iCache L#6 (size=64KB linesize=64 ways=2)
              L1dCache L#12 (size=16KB linesize=64 ways=4)
                Core L#12 (P#4)
                  PU L#12 (P#12)
              L1dCache L#13 (size=16KB linesize=64 ways=4)
                Core L#13 (P#5)
                  PU L#13 (P#13)
          L2Cache L#7 (size=2048KB linesize=64 ways=16)
            L1iCache L#7 (size=64KB linesize=64 ways=2)
              L1dCache L#14 (size=16KB linesize=64 ways=4)
                Core L#14 (P#6)
                  PU L#14 (P#14)
              L1dCache L#15 (size=16KB linesize=64 ways=4)
                Core L#15 (P#7)
                  PU L#15 (P#15)
      L3Cache L#2 (size=6144KB linesize=64 ways=64)
        L2Cache L#8 (size=2048KB linesize=64 ways=16)
          L1iCache L#8 (size=64KB linesize=64 ways=2)
            L1dCache L#16 (size=16KB linesize=64 ways=4)
              Core L#16 (P#0)
                PU L#16 (P#16)
            L1dCache L#17 (size=16KB linesize=64 ways=4)
              Core L#17 (P#1)
                PU L#17 (P#17)
        L2Cache L#9 (size=2048KB linesize=64 ways=16)
          L1iCache L#9 (size=64KB linesize=64 ways=2)
            L1dCache L#18 (size=16KB linesize=64 ways=4)
              Core L#18 (P#2)
                PU L#18 (P#18)
            L1dCache L#19 (size=16KB linesize=64 ways=4)
              Core L#19 (P#3)
                PU L#19 (P#19)
        L2Cache L#10 (size=2048KB linesize=64 ways=16)
          L1iCache L#10 (size=64KB linesize=64 ways=2)
            L1dCache L#20 (size=16KB linesize=64 ways=4)
              Core L#20 (P#4)
                PU L#20 (P#20)
            L1dCache L#21 (size=16KB linesize=64 ways=4)
              Core L#21 (P#5)
                PU L#21 (P#21)
        L2Cache L#11 (size=2048KB linesize=64 ways=16)
          L1iCache L#11 (size=64KB linesize=64 ways=2)
            L1dCache L#22 (size=16KB linesize=64 ways=4)
              Core L#22 (P#6)
                PU L#22 (P#22)
            L1dCache L#23 (size=16KB linesize=64 ways=4)
              Core L#23 (P#7)
                PU L#23 (P#23)
    NUMANode L#1 (P#3 local=33554432KB total=33554432KB)
      L3Cache L#3 (size=6144KB linesize=64 ways=64)
        L2Cache L#12 (size=2048KB linesize=64 ways=16)
          L1iCache L#12 (size=64KB linesize=64 ways=2)
            L1dCache L#24 (size=16KB linesize=64 ways=4)
              Core L#24 (P#0)
                PU L#24 (P#24)
            L1dCache L#25 (size=16KB linesize=64 ways=4)
              Core L#25 (P#1)
                PU L#25 (P#25)
        L2Cache L#13 (size=2048KB linesize=64 ways=16)
          L1iCache L#13 (size=64KB linesize=64 ways=2)
            L1dCache L#26 (size=16KB linesize=64 ways=4)
              Core L#26 (P#2)
                PU L#26 (P#26)
            L1dCache L#27 (size=16KB linesize=64 ways=4)
              Core L#27 (P#3)
                PU L#27 (P#27)
        L2Cache L#14 (size=2048KB linesize=64 ways=16)
          L1iCache L#14 (size=64KB linesize=64 ways=2)
            L1dCache L#28 (size=16KB linesize=64 ways=4)
              Core L#28 (P#4)
                PU L#28 (P#28)
            L1dCache L#29 (size=16KB linesize=64 ways=4)
              Core L#29 (P#5)
                PU L#29 (P#29)
        L2Cache L#15 (size=2048KB linesize=64 ways=16)
          L1iCache L#15 (size=64KB linesize=64 ways=2)
            L1dCache L#30 (size=16KB linesize=64 ways=4)
              Core L#30 (P#6)
                PU L#30 (P#30)
            L1dCache L#31 (size=16KB linesize=64 ways=4)
              Core L#31 (P#7)
                PU L#31 (P#31)
      L3Cache L#4 (size=6144KB linesize=64 ways=64)
        L2Cache L#16 (size=2048KB linesize=64 ways=16)
          L1iCache L#16 (size=64KB linesize=64 ways=2)
            L1dCache L#32 (size=16KB linesize=64 ways=4)
              Core L#32 (P#0)
                PU L#32 (P#40)
            L1dCache L#33 (size=16KB linesize=64 ways=4)
              Core L#33 (P#1)
                PU L#33 (P#41)
        L2Cache L#17 (size=2048KB linesize=64 ways=16)
          L1iCache L#17 (size=64KB linesize=64 ways=2)
            L1dCache L#34 (size=16KB linesize=64 ways=4)
              Core L#34 (P#2)
                PU L#34 (P#42)
            L1dCache L#35 (size=16KB linesize=64 ways=4)
              Core L#35 (P#3)
                PU L#35 (P#43)
        L2Cache L#18 (size=2048KB linesize=64 ways=16)
          L1iCache L#18 (size=64KB linesize=64 ways=2)
            L1dCache L#36 (size=16KB linesize=64 ways=4)
              Core L#36 (P#4)
                PU L#36 (P#44)
            L1dCache L#37 (size=16KB linesize=64 ways=4)
              Core L#37 (P#5)
                PU L#37 (P#45)
        L2Cache L#19 (size=2048KB linesize=64 ways=16)
          L1iCache L#19 (size=64KB linesize=64 ways=2)
            L1dCache L#38 (size=16KB linesize=64 ways=4)
              Core L#38 (P#6)
                PU L#38 (P#46)
            L1dCache L#39 (size=16KB linesize=64 ways=4)
              Core L#39 (P#7)
                PU L#39 (P#47)
  Group0 L#1 (total=67092480KB)
    NUMANode L#2 (P#4 local=33554432KB total=33554432KB)
      L3Cache L#5 (size=6144KB linesize=64 ways=64)
        L2Cache L#20 (size=2048KB linesize=64 ways=16)
          L1iCache L#20 (size=64KB linesize=64 ways=2)
            L1dCache L#40 (size=16KB linesize=64 ways=4)
              Core L#40 (P#0)
                PU L#40 (P#32)
            L1dCache L#41 (size=16KB linesize=64 ways=4)
              Core L#41 (P#1)
                PU L#41 (P#33)
        L2Cache L#21 (size=2048KB linesize=64 ways=16)
          L1iCache L#21 (size=64KB linesize=64 ways=2)
            L1dCache L#42 (size=16KB linesize=64 ways=4)
              Core L#42 (P#2)
                PU L#42 (P#34)
            L1dCache L#43 (size=16KB linesize=64 ways=4)
              Core L#43 (P#3)
                PU L#43 (P#35)
        L2Cache L#22 (size=2048KB linesize=64 ways=16)
          L1iCache L#22 (size=64KB linesize=64 ways=2)
            L1dCache L#44 (size=16KB linesize=64 ways=4)
              Core L#44 (P#4)
                PU L#44 (P#36)
            L1dCache L#45 (size=16KB linesize=64 ways=4)
              Core L#45 (P#5)
                PU L#45 (P#37)
        L2Cache L#23 (size=2048KB linesize=64 ways=16)
          L1iCache L#23 (size=64KB linesize=64 ways=2)
            L1dCache L#46 (size=16KB linesize=64 ways=4)
              Core L#46 (P#6)
                PU L#46 (P#38)
            L1dCache L#47 (size=16KB linesize=64 ways=4)
              Core L#47 (P#7)
                PU L#47 (P#39)
    NUMANode L#3 (P#6 local=33538048KB total=33538048KB)
      Socket L#1 (P#3 CPUModel="AMD Opteron(tm) Processor 6378                 ")
        L3Cache L#6 (size=6144KB linesize=64 ways=64)
          L2Cache L#24 (size=2048KB linesize=64 ways=16)
            L1iCache L#24 (size=64KB linesize=64 ways=2)
              L1dCache L#48 (size=16KB linesize=64 ways=4)
                Core L#48 (P#0)
                  PU L#48 (P#48)
              L1dCache L#49 (size=16KB linesize=64 ways=4)
                Core L#49 (P#1)
                  PU L#49 (P#49)
          L2Cache L#25 (size=2048KB linesize=64 ways=16)
            L1iCache L#25 (size=64KB linesize=64 ways=2)
              L1dCache L#50 (size=16KB linesize=64 ways=4)
                Core L#50 (P#2)
                  PU L#50 (P#50)
              L1dCache L#51 (size=16KB linesize=64 ways=4)
                Core L#51 (P#3)
                  PU L#51 (P#51)
          L2Cache L#26 (size=2048KB linesize=64 ways=16)
            L1iCache L#26 (size=64KB linesize=64 ways=2)
              L1dCache L#52 (size=16KB linesize=64 ways=4)
                Core L#52 (P#4)
                  PU L#52 (P#52)
              L1dCache L#53 (size=16KB linesize=64 ways=4)
                Core L#53 (P#5)
                  PU L#53 (P#53)
          L2Cache L#27 (size=2048KB linesize=64 ways=16)
            L1iCache L#27 (size=64KB linesize=64 ways=2)
              L1dCache L#54 (size=16KB linesize=64 ways=4)
                Core L#54 (P#6)
                  PU L#54 (P#54)
              L1dCache L#55 (size=16KB linesize=64 ways=4)
                Core L#55 (P#7)
                  PU L#55 (P#55)
        L3Cache L#7 (size=6144KB linesize=64 ways=64)
          L2Cache L#28 (size=2048KB linesize=64 ways=16)
            L1iCache L#28 (size=64KB linesize=64 ways=2)
              L1dCache L#56 (size=16KB linesize=64 ways=4)
                Core L#56 (P#0)
                  PU L#56 (P#56)
              L1dCache L#57 (size=16KB linesize=64 ways=4)
                Core L#57 (P#1)
                  PU L#57 (P#57)
          L2Cache L#29 (size=2048KB linesize=64 ways=16)
            L1iCache L#29 (size=64KB linesize=64 ways=2)
              L1dCache L#58 (size=16KB linesize=64 ways=4)
                Core L#58 (P#2)
                  PU L#58 (P#58)
              L1dCache L#59 (size=16KB linesize=64 ways=4)
                Core L#59 (P#3)
                  PU L#59 (P#59)
          L2Cache L#30 (size=2048KB linesize=64 ways=16)
            L1iCache L#30 (size=64KB linesize=64 ways=2)
              L1dCache L#60 (size=16KB linesize=64 ways=4)
                Core L#60 (P#4)
                  PU L#60 (P#60)
              L1dCache L#61 (size=16KB linesize=64 ways=4)
                Core L#61 (P#5)
                  PU L#61 (P#61)
          L2Cache L#31 (size=2048KB linesize=64 ways=16)
            L1iCache L#31 (size=64KB linesize=64 ways=2)
              L1dCache L#62 (size=16KB linesize=64 ways=4)
                Core L#62 (P#6)
                  PU L#62 (P#62)
              L1dCache L#63 (size=16KB linesize=64 ways=4)
                Core L#63 (P#7)
                  PU L#63 (P#63)
  Bridge Host->PCI L#0 (P#0 buses=0000:[00-03])
    Bridge PCI->PCI (P#32 busid=0000:00:02.0 id=1002:5a16 class=0604(PCI_B) link=4.00GB/s buses=0000:[03-03])
      PCI 15b3:673c (P#12288 busid=0000:03:00.0 class=0c06(IB) link=4.00GB/s)
        Network L#0 (Address=80:00:00:48:fe:80:00:00:00:00:00:00:00:02:c9:03:00:2a:c8:8b Port=1) "ib0"
        OpenFabrics L#1 (NodeGUID=0002:c903:002a:c88a SysImageGUID=0002:c903:002a:c88d Port1State=4 Port1LID=0x1d0 Port1LMC=0 Port1GID0=fe80:0000:0000:0000:0002:c903:002a:c88b) "mlx4_0"
    Bridge PCI->PCI (P#208 busid=0000:00:0d.0 id=1002:5a1e class=0604(PCI_B) link=1.00GB/s buses=0000:[02-02])
      PCI 8086:10c9 (P#8192 busid=0000:02:00.0 class=0200(Ether) link=1.00GB/s)
        Network L#2 (Address=00:25:90:59:a1:d2) "eth0"
      PCI 8086:10c9 (P#8193 busid=0000:02:00.1 class=0200(Ether) link=1.00GB/s)
        Network L#3 (Address=00:25:90:59:a1:d3) "eth1"
    PCI 1002:4394 (P#272 busid=0000:00:11.0 class=0106(SATA))
      Block L#4 "sda"
    Bridge PCI->PCI (P#324 busid=0000:00:14.4 id=1002:4384 class=0604(PCI_B) buses=0000:[01-01])
      PCI 102b:0532 (P#4160 busid=0000:01:04.0 class=0300(VGA))
depth 0:           1 Machine (type #1)
 depth 1:          2 Group0 (type #7)
  depth 2:         4 NUMANode (type #2)
   depth 3:        2 Socket (type #3)
    depth 4:       8 L3Cache (type #4)
     depth 5:      32 L2Cache (type #4)
      depth 6:     32 L1iCache (type #4)
       depth 7:    64 L1dCache (type #4)
        depth 8:   64 Core (type #5)
         depth 9:  64 PU (type #6)
Special depth -3:  4 Bridge (type #9)
Special depth -4:  5 PCI Device (type #10)
Special depth -5:  5 OS Device (type #11)
latency matrix between NUMANodes (depth 2) by logical indexes:
  index     0     1     2     3
      0 1.000 1.600 2.200 2.200
      1 1.600 1.000 2.200 2.200
      2 2.200 2.200 1.000 1.600
      3 2.200 2.200 1.600 1.000
Topology not from this system

Attachment: interlagos_hwloc.tar.bz2
Description: BZip2 compressed data
