Hi Brice,
I recently gave an internal presentation on hwloc. It was a success and people liked it. One colleague tried it on an 8-socket box, and we found that the memory was installed in the wrong slots, resulting in a very strange NUMA configuration.
There was some discussion about hwloc-distrib --among.
If I understand it correctly, --among accepts one of
{pu,core,socket,node,machine}
Should it also support an option of the form socket:0? I have tried it, but it
does not work for me.
I do not understand the results:
=======================================================
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among machine 4)
0,2,1,3
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among numa 4)
0,2,1,3
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among socket 4)
0,2,1,3
This seems to be OK.
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among core 4)
0,2,4,6
Among Socket:1 ??
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among pu 4)
0,8,2,10
Among Core:0 and Core:1 ??
$ lstopo --physical
Machine (12GB)
  NUMANode p#0 (6144MB) + Socket p#1 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#0
      PU p#8
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#2
      PU p#10
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#4
      PU p#12
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#6
      PU p#14
  NUMANode p#1 (6134MB) + Socket p#0 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#1
      PU p#9
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#3
      PU p#11
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#5
      PU p#13
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#7
      PU p#15
========================================================
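To double-check my reading of the proclists above, here is a small Python sketch. It is my own helper, not part of hwloc: the names TOPOLOGY, locate and summarize are made up for illustration, and the table is transcribed by hand from the lstopo --physical output above. It maps each physical PU index to its socket and core.

```python
# Physical indexes transcribed by hand from the `lstopo --physical` output
# above: socket -> core -> list of PUs. This is an assumption-laden copy of
# that listing, not something queried from hwloc.
TOPOLOGY = {
    1: {0: [0, 8], 1: [2, 10], 9: [4, 12], 10: [6, 14]},  # Socket p#1 (NUMANode p#0)
    0: {0: [1, 9], 1: [3, 11], 9: [5, 13], 10: [7, 15]},  # Socket p#0 (NUMANode p#1)
}


def locate(pu):
    """Return (socket, core) physical indexes for a physical PU index."""
    for socket, cores in TOPOLOGY.items():
        for core, pus in cores.items():
            if pu in pus:
                return socket, core
    raise ValueError(f"unknown PU p#{pu}")


def summarize(proclist):
    """Parse a comma-separated proclist into a list of (socket, core) pairs."""
    return [locate(int(p)) for p in proclist.split(",")]


if __name__ == "__main__":
    # The three distinct proclists reported by hwloc-calc above.
    for label, proclist in [
        ("machine/numa/socket", "0,2,1,3"),
        ("core", "0,2,4,6"),
        ("pu", "0,8,2,10"),
    ]:
        placement = summarize(proclist)
        sockets = sorted({socket for socket, _ in placement})
        print(f"{label}: {proclist} -> sockets {sockets}, (socket, core) {placement}")
```

With this table, the "core" and "pu" proclists land entirely on Socket p#1, which is what puzzles me.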
Could you explain the usage model for --among? Which arguments are supported
and what effect do they have?
I have also attached the output of hwloc-gather-topology.sh for an 8-socket system
with two NUMA nodes. One NUMA node has 7 sockets associated with it, whereas
the other NUMA node has just one socket connected to it.
I have tried various --among and --ignore options to distribute 8
parallel jobs on the box so that each job runs on its own socket, but I was not
able to achieve this.
Could you please try it? What command should I use? Or is it perhaps a bug?
I have used 1.1rc2.
=============8 socket system=======================
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore machine 8)
0,1,16,24,32,40,48,56
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore numa 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore socket 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore core 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore pu 8)
0,16,24,32,8,9,10,11
================================================
Please note that Socket#1 is never chosen. Could you please help me with this?
Thanks a lot!
Jirka
hp-dl980g7-01.tar.bz2
Description: application/bzip-compressed-tar
