Re: [hwloc-users] node configuration differs form hardware

2014-05-29 Thread Craig Kapfer
..@open-mpi.org> Sent: Wednesday, May 28, 2014 5:16 PM Subject: Re: [hwloc-users] node configuration differs form hardware On 28/05/2014 15:46, Craig Kapfer wrote: Wait, I'm sorry, I must be missing something, please bear with me! > > By the way, your discussion of groups 1 and 2

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Craig Kapfer
Wait, I'm sorry, I must be missing something, please bear with me! By the way, your discussion of groups 1 and 2 below is wrong. Group 2 doesn't say that NUMA node == socket, and it doesn't report 8 sockets of 8 cores each. It reports 4 sockets containing 2 NUMA nodes each containing 8 cores

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Kenneth A. Lloyd
[mailto:hwloc-users-boun...@open-mpi.org] On Behalf Of Brice Goglin Sent: Wednesday, May 28, 2014 7:01 AM To: Craig Kapfer; Hardware locality user list Subject: Re: [hwloc-users] node configuration differs form hardware On 28/05/2014 14:57, Craig Kapfer wrote: Hmm ... the slurm config

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
On 28/05/2014 14:57, Craig Kapfer wrote:
> Hmm ... the slurm config defines that all nodes have 4 sockets with 16 cores per socket (which corresponds to the hardware--all nodes are the same). Slurm node config is as follows:
>
> NodeName=n[001-008] RealMemory=258452 Sockets=4

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Craig Kapfer
Hmm ... the slurm config defines that all nodes have 4 sockets with 16 cores per socket (which corresponds to the hardware--all nodes are the same). Slurm node config is as follows:
NodeName=n[001-008] RealMemory=258452 Sockets=4 CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN
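As a rough illustration of what the node definition quoted above implies, the following Python sketch splits the line into key/value pairs and multiplies sockets by cores per socket. The helper names `parse_slurm_node_line` and `total_cores` are hypothetical and not part of Slurm or hwloc; this only checks the arithmetic and says nothing about how Slurm itself parses its configuration.

```python
def parse_slurm_node_line(line):
    """Split a whitespace-separated 'Key=Value' node definition into a dict."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that contain no '='
            fields[key] = value
    return fields


def total_cores(fields):
    """Cores per node implied by the Sockets and CoresPerSocket fields."""
    return int(fields["Sockets"]) * int(fields["CoresPerSocket"])


# Node definition as quoted in the message above.
line = ("NodeName=n[001-008] RealMemory=258452 Sockets=4 "
        "CoresPerSocket=16 ThreadsPerCore=1 State=UNKNOWN")
node = parse_slurm_node_line(line)
print(node["NodeName"], total_cores(node))
```

With the values quoted in the thread, 4 sockets of 16 cores give 64 cores per node, the same total as the 4 sockets x 2 NUMA nodes x 8 cores layout mentioned elsewhere in the discussion.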

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
On 28/05/2014 14:13, Craig Kapfer wrote:
> Interesting, quite right, thank you very much. Yes these are AMD 6300 series. Same kernel but these boxes seem to have different BIOS versions, direct from the factory, delivered in the same physical enclosure even! Some are AMI 3.5 and some

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Craig Kapfer
Interesting, quite right, thank you very much. Yes these are AMD 6300 series. Same kernel but these boxes seem to have different BIOS versions, direct from the factory, delivered in the same physical enclosure even! Some are AMI 3.5 and some are 3.0. So slurm is then incorrectly parsing

Re: [hwloc-users] node configuration differs form hardware

2014-05-28 Thread Brice Goglin
Aside from the BIOS config, are you sure that you have the exact same BIOS *version* in each node? (You can check in /sys/class/dmi/id/bios_*.) Same Linux kernel too? Also, recently we've seen somebody fix such problems by unplugging and replugging some CPUs on the motherboard. Seems crazy but it
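The BIOS check suggested above could be scripted roughly as follows. This is a minimal sketch, assuming a Linux node that exposes DMI data under /sys/class/dmi/id; the function name `read_bios_info` is hypothetical, and which `bios_*` files exist depends on the kernel and platform.

```python
import glob
import os


def read_bios_info(dmi_dir="/sys/class/dmi/id"):
    """Return {file name: contents} for every bios_* entry in dmi_dir."""
    info = {}
    for path in sorted(glob.glob(os.path.join(dmi_dir, "bios_*"))):
        name = os.path.basename(path)
        try:
            with open(path) as f:
                info[name] = f.read().strip()
        except OSError:
            # Entry exists but could not be read (for example, permissions).
            info[name] = None
    return info


# Prints nothing on systems without the sysfs DMI directory.
for name, value in read_bios_info().items():
    print(f"{name}: {value}")
```

Running this on each node and comparing the output is one way to spot nodes whose BIOS version differs from the rest.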