Hi Brice,

I gave an internal presentation on hwloc. It was a success, and people liked it. One colleague tried it on an 8-socket box, and we found that memory had been installed in the wrong slots, resulting in a very strange NUMA configuration.

There was some discussion about hwloc-distrib --among. If I understand it correctly, --among accepts one of {pu,core,socket,node,machine}. Should it also support an option of the form socket:0? I have tried that, but it does not work for me.

I do not understand these results:

=======================================================
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among machine 4)
0,2,1,3
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among numa 4)
0,2,1,3
$ hwloc-calc --po --proclist $(hwloc-distrib --single --among socket 4)
0,2,1,3

This seems to be OK.

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among core 4)
0,2,4,6

Among Socket:1 ??

$ hwloc-calc --po --proclist $(hwloc-distrib --single --among pu 4)
0,8,2,10

Among Core:0 and Core:1 ??

$ lstopo --physical
Machine (12GB)
  NUMANode p#0 (6144MB) + Socket p#1 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#0
      PU p#8
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#2
      PU p#10
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#4
      PU p#12
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#6
      PU p#14
  NUMANode p#1 (6134MB) + Socket p#0 + L3 (12MB)
    L2 (256KB) + L1 (32KB) + Core p#0
      PU p#1
      PU p#9
    L2 (256KB) + L1 (32KB) + Core p#1
      PU p#3
      PU p#11
    L2 (256KB) + L1 (32KB) + Core p#9
      PU p#5
      PU p#13
    L2 (256KB) + L1 (32KB) + Core p#10
      PU p#7
      PU p#15
========================================================

Could you explain the usage model for --among? Which arguments are supported, and what effect do they have?

I have also attached the output of hwloc-gather-topology.sh for an 8-socket system with two NUMA nodes. One NUMA node has 7 sockets associated with it, whereas the other NUMA node has just one socket connected to it. I have tried various --among and --ignore options to distribute 8 parallel jobs on the box so that each job runs on its own socket, but I was not able to achieve this. Could you please try it? What command should I use? Or is it perhaps a bug?

I am using hwloc 1.1rc2.

=============8 socket system=======================
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore machine 8)
0,1,16,24,32,40,48,56
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore numa 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore socket 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore core 8)
0,16,24,32,8,9,10,11
[root@hp-dl980g7-01 utils]# ./hwloc-calc --po --proclist $(./hwloc-distrib --single --ignore pu 8)
0,16,24,32,8,9,10,11
================================================

Please notice that Socket#1 is never chosen. Could you please help me with this?

Thanks a lot!
Jirka

hp-dl980g7-01.tar.bz2
Description: application/bzip-compressed-tar