Interesting. Jeff and I had discussed that very problem not that long ago, and I could swear he fixed it - but I don't see the CMR for that code. He's on vacation this week, so I'll wait for his return to look at it.
Thanks!
Ralph

On Apr 11, 2012, at 2:36 AM, Brice Goglin wrote:

> A quick look at the code seems to confirm my feeling. get/set_module()
> callbacks manipulate arrays of logical indexes, and they do not convert
> them back to physical indexes before binding.
>
> Here's a quick patch that may help. Only compile tested...
>
> Brice
>
>
>
> On 11/04/2012 09:49, Brice Goglin wrote:
>> On 11/04/2012 09:06, tmish...@jcity.maeda.co.jp wrote:
>>> Hi, Brice.
>>>
>>> I installed the latest hwloc-1.4.1.
>>> Here is the output of lstopo -p.
>>>
>>> [root@node03 bin]# ./lstopo -p
>>> Machine (126GB)
>>>   Socket P#0 (32GB)
>>>     NUMANode P#0 (16GB) + L3 (5118KB)
>>>       L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0
>>>       L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4
>>>       L2 (512KB) + L1 (64KB) + Core P#2 + PU P#8
>>>       L2 (512KB) + L1 (64KB) + Core P#3 + PU P#12
>> Ok then the cpuset of this numanode is 1111.
>>
>>>> [node03.cluster:21706] [[55518,0],0] odls:default:fork binding child
>>>> [[55518,1],0] to cpus 1111
>> So openmpi 1.5.4 is correct.
>>
>>>> [node03.cluster:04706] [[40566,0],0] odls:default:fork binding child
>>>> [[40566,1],0] to cpus 000f
>> And openmpi 1.5.5 is indeed wrong.
>>
>> Random guess: 000f is the bitmask made of hwloc *logical* indexes. hwloc
>> cpusets (used for binding) are internally made of hwloc *physical*
>> indexes (1111 here).
>>
>> Jeff, Ralph:
>> How does OMPI 1.5.5 build hwloc cpusets for binding? Are you doing
>> bitmap operations on hwloc object cpusets?
>> If yes, I don't know what's going wrong here.
>> If no, are you building hwloc cpusets manually by setting individual
>> bits from object indexes? If yes, you must use *physical* indexes to do so.
>>
>> Brice
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> <try.patch>
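
For anyone following the thread, the logical-versus-physical mismatch Brice describes can be reproduced with plain integer bitmasks. This is only a rough sketch and does not use hwloc or Open MPI code: the mapping table is copied by hand from the lstopo -p output quoted above, it assumes the four PUs of NUMANode P#0 carry logical indexes 0 to 3, and the names LOGICAL_TO_PHYSICAL, mask_from_indexes and format_mask are made up for illustration.

```python
# Hypothetical table, copied by hand from the quoted lstopo -p output:
# the four PUs of NUMANode P#0 are assumed to have logical indexes 0..3
# and physical (OS) indexes 0, 4, 8 and 12.
LOGICAL_TO_PHYSICAL = {0: 0, 1: 4, 2: 8, 3: 12}


def mask_from_indexes(indexes):
    """Return an integer bitmask with one bit set per index."""
    mask = 0
    for index in indexes:
        mask |= 1 << index
    return mask


def format_mask(mask):
    """Format a mask as four hex digits, like the odls debug output."""
    return f"{mask:04x}"


logical = sorted(LOGICAL_TO_PHYSICAL)

# Setting bits straight from logical indexes gives the 1.5.5 result.
wrong_mask = mask_from_indexes(logical)

# Converting each logical index to its physical index first gives the
# mask that matches the NUMA node's cpuset.
right_mask = mask_from_indexes(LOGICAL_TO_PHYSICAL[i] for i in logical)

print(format_mask(wrong_mask))  # 000f
print(format_mask(right_mask))  # 1111
```

The first mask matches what 1.5.5 reports and the second matches 1.5.4, which is consistent with Brice's guess that the conversion back to physical indexes is missing before binding.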