Hi, Brice. I installed the latest hwloc-1.4.1. Here is the output of lstopo -p.
[root@node03 bin]# ./lstopo -p
Machine (126GB)
  Socket P#0 (32GB)
    NUMANode P#0 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12
    NUMANode P#1 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#16
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#20
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#24
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#28
  Socket P#3 (32GB)
    NUMANode P#6 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#1
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#5
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#9
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#13
    NUMANode P#7 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#17
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#21
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#25
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#29
  Socket P#2 (32GB)
    NUMANode P#4 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#2
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#6
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#10
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#14
    NUMANode P#5 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#18
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#22
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#26
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#30
  Socket P#1 (32GB)
    NUMANode P#2 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#3
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#7
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#11
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#15
    NUMANode P#3 (16GB) + L3 (5118KB)
      L2 (512KB) + L1 (64KB) + Core P#0 + PU P#19
      L2 (512KB) + L1 (64KB) + Core P#1 + PU P#23
      L2 (512KB) + L1 (64KB) + Core P#2 + PU P#27
      L2 (512KB) + L1 (64KB) + Core P#3 + PU P#31
  HostBridge P#0
    PCIBridge
      PCI 14e4:1639
        Net "eth0"
      PCI 14e4:1639
        Net "eth1"
    PCIBridge
      PCI 14e4:1639
        Net "eth2"
      PCI 14e4:1639
        Net "eth3"
    PCIBridge
      PCIBridge
        PCIBridge
          PCI 1000:0072
            Block "sdb"
            Block "sda"
    PCI 1002:4390
      Block "sr0"
    PCIBridge
      PCI 102b:0532
  HostBridge P#1
    PCIBridge
      PCI 15b3:6274
        Net "ib0"
        OpenFabrics "mthca0"

Tetsuya Mishima

> Can you send the output of lstopo -p ? (you'll have to install hwloc)
> Brice
>
>
> tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi,
>
> I updated openmpi from version 1.5.4 to 1.5.5.
> Since then, my application runs considerably slower than before,
> due to wrong core bindings. As far as I checked, openmpi-1.5.4
> gives correct core bindings for my Magny-Cours based machine.
>
> 1) my script is as follows:
> export OMP_NUM_THREADS=4
> mpirun -machinefile pbs_hosts \
>        -np 8 \
>        -x OMP_NUM_THREADS \
>        -bind-to-core \
>        -cpus-per-proc ${OMP_NUM_THREADS} \
>        -report-bindings \
>        ./Solver
>
> 2) binding reports are as follows:
> openmpi-1.5.4:
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],3] to cpus 22220000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],4] to cpus 4444
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],5] to cpus 44440000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],6] to cpus 8888
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],7] to cpus 88880000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],0] to cpus 1111
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],1] to cpus 11110000
> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child [[55518,1],2] to cpus 2222
> openmpi-1.5.5:
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],3] to cpus f000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],4] to cpus 000f0000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],5] to cpus 00f00000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],6] to cpus 0f000000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],7] to cpus f0000000
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],0] to cpus 000f
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],1] to cpus 00f0
> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child [[40566,1],2] to cpus 0f00
>
> 3) node03 has 32 cores: 4 Magny-Cours CPUs with 8 cores each.
>
> Regards,
> Tetsuya Mishima
>
>
>
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
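
For reference, the "cpus" masks in the binding reports can be cross-checked against the lstopo -p output with a few lines of Python. This is only a rough sketch, and it rests on an assumption that is not confirmed anywhere in this thread: that each mask is a hexadecimal bitmask in which bit i stands for physical PU P#i. If the masks use logical indexes instead, the result would read differently. The names NUMA_BASE_PU, PU_TO_NUMA, pus_in_mask and numa_nodes are made up for this illustration.

```python
# Assumption (not confirmed in this thread): a "cpus" value is a hex
# bitmask whose bit i corresponds to physical PU P#i from `lstopo -p`.

# Lowest PU of each NUMA node (physical numbers), read off the lstopo
# output above. Each node holds four PUs spaced 4 apart, for example
# NUMANode P#0 holds PU P#0, P#4, P#8 and P#12.
NUMA_BASE_PU = {0: 0, 1: 16, 6: 1, 7: 17, 4: 2, 5: 18, 2: 3, 3: 19}

# Expand to a full PU -> NUMA node table (32 entries).
PU_TO_NUMA = {
    base + 4 * k: node
    for node, base in NUMA_BASE_PU.items()
    for k in range(4)
}


def pus_in_mask(mask_hex):
    """Return the PU numbers whose bits are set in a hex mask string."""
    value = int(mask_hex, 16)
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


def numa_nodes(mask_hex):
    """Return the sorted NUMA node numbers touched by a hex mask string."""
    return sorted({PU_TO_NUMA[pu] for pu in pus_in_mask(mask_hex)})


# Masks of MPI ranks 0 to 7 as reported by each Open MPI version.
masks_154 = ["1111", "11110000", "2222", "22220000",
             "4444", "44440000", "8888", "88880000"]
masks_155 = ["000f", "00f0", "0f00", "f000",
             "000f0000", "00f00000", "0f000000", "f0000000"]

for label, masks in (("1.5.4", masks_154), ("1.5.5", masks_155)):
    for rank, mask in enumerate(masks):
        print(label, "rank", rank, "PUs", pus_in_mask(mask),
              "NUMA nodes", numa_nodes(mask))
```

Under that assumption, every 1.5.4 mask stays inside a single NUMA node (for example 1111 is PUs 0, 4, 8, 12, all in NUMANode P#0), while every 1.5.5 mask covers four consecutive PU numbers that belong to four different NUMA nodes, one on each socket (for example 000f is PUs 0, 1, 2, 3 in NUMANode P#0, P#6, P#4 and P#2). That would be consistent with the slowdown described in the original report.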