On 11/04/2012 09:06, tmish...@jcity.maeda.co.jp wrote:
> Hi, Brice.
>
> I installed the latest hwloc-1.4.1.
> Here is the output of lstopo -p.
>
> [root@node03 bin]# ./lstopo -p
> Machine (126GB)
>   Socket P#0 (32GB)
>     NUMANode P#0 (16GB) + L3 (5118KB)
>       L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
>       L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
>       L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
>       L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12

OK, then the cpuset of this NUMA node is 1111 (PUs P#0, P#4, P#8 and P#12 set bits 0, 4, 8 and 12).

>> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
>> [[55518,1],0] to cpus 1111

So Open MPI 1.5.4 is correct.

>> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
>> [[40566,1],0] to cpus 000f

And Open MPI 1.5.5 is indeed wrong.

Random guess: 000f is the bitmask made of hwloc *logical* indexes. hwloc cpusets (used for binding) are internally made of hwloc *physical* indexes (1111 here).

Jeff, Ralph:
How does OMPI 1.5.5 build hwloc cpusets for binding? Are you doing bitmap operations on hwloc object cpusets? If yes, I don't know what's going wrong here. If no, are you building hwloc cpusets manually by setting individual bits from object indexes? If yes, you must use *physical* indexes to do so.

Brice
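
P.S. To make the logical-versus-physical guess concrete, here is a small standalone sketch in plain Python (it does not call hwloc). The physical indexes are the PU P# values from the lstopo output above; the logical indexes 0-3 are an assumption, namely that these four PUs are the first four in hwloc's logical ordering. The `bitmask` helper is hypothetical and only illustrates how setting one bit per index gives the two masks seen in the logs.

```python
# Physical (OS) PU indexes of NUMANode P#0, taken from `lstopo -p` above.
physical_indexes = [0, 4, 8, 12]

# Assumed hwloc logical indexes of the same four PUs (assumption: they are
# the first four PUs in hwloc's logical ordering).
logical_indexes = [0, 1, 2, 3]


def bitmask(indexes):
    """Hypothetical helper: set one bit per index and return the mask."""
    mask = 0
    for i in indexes:
        mask |= 1 << i
    return mask


physical_mask = bitmask(physical_indexes)
logical_mask = bitmask(logical_indexes)

print(f"physical: {physical_mask:04x}")  # physical: 1111
print(f"logical:  {logical_mask:04x}")   # logical:  000f
```

If the 1.5.5 code sets bits by hand from logical indexes, it would produce the second mask, which matches the 000f in the log.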