Re: [hwloc-users] node configuration differs from hardware
Thanks very much. The G2 BIOS is the more recent one (AMI v3.50). I will upgrade and see how it goes.

Thanks again,
Craig

From: Brice Goglin <brice.gog...@inria.fr>
To: Craig Kapfer <c_kap...@yahoo.com>; Hardware locality user list <hwloc-us...@open-mpi.org>
Sent: Wednesday, May 28, 2014 5:16 PM
Subject: Re: [hwloc-users] node configuration differs from hardware

On 28/05/2014 15:46, Craig Kapfer wrote:
> Wait, I'm sorry, I must be missing something, please bear with me!
> [...]
> Output of lstopo from nodes of both BIOS versions seems to indicate
> that there are 4 sockets, but slurm is reporting on NUMA nodes, no?
> If not, which version of the BIOS is correct?

Ah right, I misread group 1. Group 1 reports 4 sockets = 4 NUMA nodes containing 16 cores each. That's wrong: there are 2 NUMA nodes in each socket and 8 cores in each NUMA node (instead of 1 NUMA node in each socket and 16 cores in each NUMA node).

Slurm is indeed saying something wrong. I wonder if it confuses NUMA nodes and sockets; I can't find anything like this on Google. On Intel that doesn't matter. On AMD it does.

Anyway, G2 is correct, so its BIOS may be less buggy than G1's. Which BIOS is more recent? Try updating the BIOS on one of the G1 machines to see if that fixes the issue.

Brice
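A quick way to check which of the two shapes a given node actually exposes is to ask hwloc directly. Below is a minimal sketch against the hwloc 1.x C API of the time (an illustration, not code from the thread): for each socket it counts the NUMA nodes and cores whose cpusets fall inside that socket. A correctly reported node (the G2 shape) should print 2 NUMA nodes and 16 cores per socket; the buggy G1 shape prints only 1 NUMA node per socket.

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;
      hwloc_obj_t socket = NULL;

      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);

      /* For each socket, count the NUMA nodes and cores whose cpusets
         fall inside it.  On a correctly reported Opteron 6200/6300 box
         (the G2 shape) this prints 2 NUMA nodes and 16 cores per
         socket; the buggy G1 shape prints 1 NUMA node per socket. */
      while ((socket = hwloc_get_next_obj_by_type(topo, HWLOC_OBJ_SOCKET,
                                                  socket)) != NULL) {
          int nnodes = hwloc_get_nbobjs_inside_cpuset_by_type(
                           topo, socket->cpuset, HWLOC_OBJ_NODE);
          int ncores = hwloc_get_nbobjs_inside_cpuset_by_type(
                           topo, socket->cpuset, HWLOC_OBJ_CORE);
          printf("Socket L#%u: %d NUMA node(s), %d core(s)\n",
                 socket->logical_index, nnodes, ncores);
      }

      hwloc_topology_destroy(topo);
      return 0;
  }

Compile with "gcc check_topo.c -lhwloc" (the file name is arbitrary) and compare the output on a G1 node against a G2 node.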
Re: [hwloc-users] node configuration differs from hardware
Wait, I'm sorry, I must be missing something, please bear with me!

> By the way, your discussion of groups 1 and 2 below is wrong. Group 2
> doesn't say that NUMA node == socket, and it doesn't report 8 sockets
> of 8 cores each. It reports 4 sockets containing 2 NUMA nodes each
> containing 8 cores each, and that's likely what you have here (AMD
> Opteron 6300 or 6200 processors?).

Output of lstopo from nodes of both BIOS versions seems to indicate that there are 4 sockets, but slurm is reporting on NUMA nodes, no? If not, which version of the BIOS is correct?

  SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw)

This message indicates that slurm believes the hardware actually has 8 sockets and 8 cores per socket, no?

Complete lstopo info attached for clarity for groups 1 and 2. If there is a problem with the BIOS I'd like to correct it, so please let me know if the BIOS is actually at fault here.

Thanks!
Craig

On Wednesday, May 28, 2014 4:01 PM, Brice Goglin <brice.gog...@inria.fr> wrote:

On 28/05/2014 14:57, Craig Kapfer wrote:
> Hmm ... the slurm config defines that all nodes have 4 sockets with
> 16 cores per socket (which corresponds to the hardware--all nodes
> are the same). [...] But we get this error--so I suspect it's a
> parsing error on the slurm side?

No, it's slurm properly reading info from hwloc, but that info doesn't match the actual hardware because the BIOS is buggy.

Brice

Attached lstopo output (group 1):

Machine (128GB)
  NUMANode L#0 (P#0 32GB) + Socket L#0
    L3 L#0 (6144KB)
      L2 L#0 (2048KB) + L1i L#0 (64KB)
        L1d L#0 (16KB) + Core L#0 + PU L#0 (P#0)
        L1d L#1 (16KB) + Core L#1 + PU L#1 (P#1)
      L2 L#1 (2048KB) + L1i L#1 (64KB)
        L1d L#2 (16KB) + Core L#2 + PU L#2 (P#2)
        L1d L#3 (16KB) + Core L#3 + PU L#3 (P#3)
      L2 L#2 (2048KB) + L1i L#2 (64KB)
        L1d L#4 (16KB) + Core L#4 + PU L#4 (P#4)
        L1d L#5 (16KB) + Core L#5 + PU L#5 (P#5)
      L2 L#3 (2048KB) + L1i L#3 (64KB)
        L1d L#6 (16KB) + Core L#6 + PU L#6 (P#6)
        L1d L#7 (16KB) + Core L#7 + PU L#7 (P#7)
    L3 L#1 (6144KB)
      L2 L#4 (2048KB) + L1i L#4 (64KB)
        L1d L#8 (16KB) + Core L#8 + PU L#8 (P#8)
        L1d L#9 (16KB) + Core L#9 + PU L#9 (P#9)
      L2 L#5 (2048KB) + L1i L#5 (64KB)
        L1d L#10 (16KB) + Core L#10 + PU L#10 (P#10)
        L1d L#11 (16KB) + Core L#11 + PU L#11 (P#11)
      L2 L#6 (2048KB) + L1i L#6 (64KB)
        L1d L#12 (16KB) + Core L#12 + PU L#12 (P#12)
        L1d L#13 (16KB) + Core L#13 + PU L#13 (P#13)
      L2 L#7 (2048KB) + L1i L#7 (64KB)
        L1d L#14 (16KB) + Core L#14 + PU L#14 (P#14)
        L1d L#15 (16KB) + Core L#15 + PU L#15 (P#15)
  NUMANode L#1 (P#2 32GB) + Socket L#1
    L3 L#2 (6144KB)
      L2 L#8 (2048KB) + L1i L#8 (64KB)
        L1d L#16 (16KB) + Core L#16 + PU L#16 (P#16)
        L1d L#17 (16KB) + Core L#17 + PU L#17 (P#17)
      L2 L#9 (2048KB) + L1i L#9 (64KB)
        L1d L#18 (16KB) + Core L#18 + PU L#18 (P#18)
        L1d L#19 (16KB) + Core L#19 + PU L#19 (P#19)
      L2 L#10 (2048KB) + L1i L#10 (64KB)
        L1d L#20 (16KB) + Core L#20 + PU L#20 (P#20)
        L1d L#21 (16KB) + Core L#21 + PU L#21 (P#21)
      L2 L#11 (2048KB) + L1i L#11 (64KB)
        L1d L#22 (16KB) + Core L#22 + PU L#22 (P#22)
        L1d L#23 (16KB) + Core L#23 + PU L#23 (P#23)
    L3 L#3 (6144KB)
      L2 L#12 (2048KB) + L1i L#12 (64KB)
        L1d L#24 (16KB) + Core L#24 + PU L#24 (P#24)
        L1d L#25 (16KB) + Core L#25 + PU L#25 (P#25)
      L2 L#13 (2048KB) + L1i L#13 (64KB)
        L1d L#26 (16KB) + Core L#26 + PU L#26 (P#26)
        L1d L#27 (16KB) + Core L#27 + PU L#27 (P#27)
      L2 L#14 (2048KB) + L1i L#14 (64KB)
        L1d L#28 (16KB) + Core L#28 + PU L#28 (P#28)
        L1d L#29 (16KB) + Core L#29 + PU L#29 (P#29)
      L2 L#15 (2048KB) + L1i L#15 (64KB)
        L1d L#30 (16KB) + Core L#30 + PU L#30 (P#30)
        L1d L#31 (16KB) + Core L#31 + PU L#31 (P#31)
  NUMANode L#2 (P#4 32GB) + Socket L#2
    L3 L#4 (6144KB)
      L2 L#16 (2048KB) + L1i L#16 (64KB)
        L1d L#32 (16KB) + Core L#32 + PU L#32 (P#32)
        L1d L#33 (16KB) + Core L#33 + PU L#33 (P#33)
      L2 L#17 (2048KB) + L1i L#17 (64KB)
        L1d L#34 (16KB) + Core L#34 + PU L#34 (P#34)
        L1d L#35 (16KB) + Core L#35 + PU L#35 (P#35)
      L2 L#18 (2048KB) + L1i L#18 (64KB)
        L1d L#36 (16KB) + Core L#36 + PU L#36 (P#36)
        L1d L#37 (16KB) + Core L#37 + PU L#37 (P#37)
      L2 L#19 (2048KB) + L1i L#19 (64KB)
        L1d L#38 (16KB) + Core L#38 + PU L#38 (P#38)
        L1d L#39 (16KB) + Core L#39 + PU L#39 (P#39)
    L3 L#5 (6144KB)
      L2 L#20 (2048KB) + L1i L#20 (64KB)
        L1d L#40 (16KB) + Core L#40 + PU L#40 (P#40)
        L1d L#41 (16KB) + Core L#41 + PU L#41 (P#41)
      L2 L
Re: [hwloc-users] node configuration differs from hardware
Hmm ... the slurm config defines that all nodes have 4 sockets with 16 cores per socket (which corresponds to the hardware--all nodes are the same). The slurm node config is as follows:

  NodeName=n[001-008] RealMemory=258452 Sockets=4 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN Port=[17001-17008]

But we get this error--so I suspect it's a parsing error on the slurm side?

  May 27 11:53:04 n001 slurmd[3629]: Node configuration differs from hardware: CPUs=64:64(hw) Boards=1:1(hw) SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw) ThreadsPerCore=1:1(hw)

Craig

On Wednesday, May 28, 2014 3:20 PM, Brice Goglin <brice.gog...@inria.fr> wrote:

On 28/05/2014 14:13, Craig Kapfer wrote:
> Interesting, quite right, thank you very much. Yes, these are AMD
> 6300 series. Same kernel, but these boxes seem to have different
> BIOS versions, direct from the factory, delivered in the same
> physical enclosure even! Some are AMI 3.5 and some are 3.0.
>
> So slurm is then incorrectly parsing correct output from lstopo to
> generate this message?

It's saying "there are 8 sockets with 8 cores each in hw, instead of 4 sockets with 16 cores each in the config"? My feeling is that Slurm just has a (valid) config that says group 2 while it was running on group 1 in this case.

Brice
Re: [hwloc-users] node configuration differs from hardware
Interesting, quite right, thank you very much. Yes, these are AMD 6300 series. Same kernel, but these boxes seem to have different BIOS versions, direct from the factory, delivered in the same physical enclosure even! Some are AMI 3.5 and some are 3.0.

So slurm is then incorrectly parsing correct output from lstopo to generate this message?

  May 27 11:53:04 n001 slurmd[3629]: Node configuration differs from hardware: CPUs=64:64(hw) Boards=1:1(hw) SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw) ThreadsPerCore=1:1(hw)

Thanks much,
Craig

On Wednesday, May 28, 2014 1:39 PM, Brice Goglin <brice.gog...@inria.fr> wrote:

Aside from the BIOS config, are you sure that you have the exact same BIOS *version* in each node? (You can check in /sys/class/dmi/id/bios_*.) Same Linux kernel too?

Also, recently we've seen somebody fix such problems by unplugging and replugging some CPUs on the motherboard. Seems crazy, but it happened for real...

By the way, your discussion of groups 1 and 2 below is wrong. Group 2 doesn't say that NUMA node == socket, and it doesn't report 8 sockets of 8 cores each. It reports 4 sockets containing 2 NUMA nodes each containing 8 cores each, and that's likely what you have here (AMD Opteron 6300 or 6200 processors?).

Brice

On 28/05/2014 12:27, Craig Kapfer wrote:
> We have a bunch of 64-core (quad-socket, 16 cores/socket) AMD servers
> and some of them are reporting the following error from slurm, which
> I gather gets its info from hwloc [...]
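For comparing firmware across nodes, the DMI fields Brice points at can be read straight out of sysfs; from a shell, "cat /sys/class/dmi/id/bios_*" does it. The small C sketch below does the same thing, shown purely for illustration; the file names are the standard Linux DMI sysfs entries:

  #include <stdio.h>

  /* Print the DMI BIOS fields exposed by the Linux kernel under
     /sys/class/dmi/id/.  Handy for comparing firmware across a set of
     nodes without rebooting into the BIOS setup screens. */
  int main(void)
  {
      const char *files[] = {
          "/sys/class/dmi/id/bios_vendor",
          "/sys/class/dmi/id/bios_version",
          "/sys/class/dmi/id/bios_date",
      };
      char buf[256];

      for (size_t i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
          FILE *f = fopen(files[i], "r");
          if (f == NULL) {
              fprintf(stderr, "cannot open %s\n", files[i]);
              continue;
          }
          if (fgets(buf, sizeof(buf), f) != NULL)
              printf("%s: %s", files[i], buf); /* buf keeps its newline */
          fclose(f);
      }
      return 0;
  }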
[hwloc-users] node configuration differs from hardware
We have a bunch of 64-core (quad-socket, 16 cores/socket) AMD servers, and some of them are reporting the following error from slurm, which I gather gets its info from hwloc:

  May 27 11:53:04 n001 slurmd[3629]: Node configuration differs from hardware: CPUs=64:64(hw) Boards=1:1(hw) SocketsPerBoard=4:8(hw) CoresPerSocket=16:8(hw) ThreadsPerCore=1:1(hw)

All nodes have the exact same CPUs, motherboards and OS (PXE booted from the same master image, even). The BIOS settings between nodes also look the same. The nodes differ only in the amount of memory and number of DIMMs.

There are two sets of nodes with different output from lstopo:

Group 1 (correct): reporting 4 sockets with 16 cores per socket
Group 2 (incorrect): reporting 8 sockets with 8 cores per socket

Group 2 seems to be (incorrectly?) taking NUMA nodes as sockets. The output of lstopo is slightly different in the two groups; note the extra Socket layer for group 2:

Group 1:

  Machine (128GB)
    NUMANode L#0 (P#0 32GB) + Socket L#0
      # 16 cores listed
    NUMANode L#1 (P#2 32GB) + Socket L#1
      # 16 cores listed
    etc.

Group 2:

  Machine (256GB)
    Socket L#0 (64GB)
      NUMANode L#0 (P#0 32GB) + L3 L#0 (6144KB)
        # 8 cores listed
      NUMANode L#1 (P#1 32GB) + L3 L#1 (6144KB)
        # 8 cores listed
    Socket L#1 (64GB)
      NUMANode L#2 (P#2 32GB) + L3 L#2 (6144KB)
        # 8 cores listed
    etc.

The group 2 reporting doesn't match our hardware, at least as far as sockets and cores per socket go--is there a reason other than the memory configuration that could cause this?

Thanks,
Craig
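For context on where slurmd's "(hw)" numbers come from: they are derived from the hwloc topology, conceptually like the counting sketch below (hwloc 1.x C API of that era; an illustration, not Slurm's actual code). On a correctly reported quad-socket node of this kind it should print 4 sockets, 8 NUMA nodes, 64 cores and 64 PUs:

  #include <hwloc.h>
  #include <stdio.h>

  int main(void)
  {
      hwloc_topology_t topo;

      hwloc_topology_init(&topo);
      hwloc_topology_load(topo);

      /* The object counts slurmd compares against its static node
         definition.  A quad-socket Opteron 6300 node should report
         4 sockets, 8 NUMA nodes, 64 cores and 64 PUs (no SMT). */
      printf("sockets:    %d\n",
             hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET));
      printf("NUMA nodes: %d\n",
             hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_NODE));
      printf("cores:      %d\n",
             hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE));
      printf("PUs:        %d\n",
             hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_PU));

      hwloc_topology_destroy(topo);
      return 0;
  }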