----- "Ralph Castain" <r...@open-mpi.org> wrote:

> Could you check this? You can run a trivial job using the -npernode x 
> option, where x matched the #cores you were allocated on the nodes.
> If you do this, do we bind to the correct cores?

Nope, I'm afraid it doesn't. I submitted a job asking
for 4 cores on one node and was allocated cores 0-3 in
the cpuset.
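
As an aside, a quick way to see which cores a process is allowed to run on from inside the job is sketched below. This assumes Python 3 on Linux (os.sched_getaffinity is not available on every platform), and the helper name allowed_cpus is only for illustration. For an unbound process inside the cpuset I would expect it to report the cpuset's cores.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted CPUs that pid may run on (0 means the calling process)."""
    # Assumption: os.sched_getaffinity only exists on some platforms (e.g. Linux).
    if not hasattr(os, "sched_getaffinity"):
        return None
    return sorted(os.sched_getaffinity(pid))


print(allowed_cpus())
```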

Grep'ing the strace output for anything mentioning affinity shows:

[csamuel@tango027 CPI]$ fgrep affinity cpi-trace.txt
11412 execve("/usr/local/openmpi/1.3.3-gcc/bin/mpiexec", ["mpiexec", "--mca", 
"paffinity", "linux", "-npernode", "4", "/home/csamuel/Sources/Tests/CPI/"...], 
[/* 56 vars */]) = 0
11412 sched_getaffinity(0, 128,  { f }) = 8
11412 sched_setaffinity(0, 8,  { 0 })   = -1 EFAULT (Bad address)
11416 sched_getaffinity(0, 128,  <unfinished ...>
11416 <... sched_getaffinity resumed>  { f }) = 8
11416 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
11416 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
11414 sched_getaffinity(0, 128,  <unfinished ...>
11414 <... sched_getaffinity resumed>  { f }) = 8
11414 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
11414 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
11413 sched_getaffinity(0, 128,  <unfinished ...>
11413 <... sched_getaffinity resumed>  { f }) = 8
11413 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
11413 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
11415 sched_getaffinity(0, 128,  <unfinished ...>
11415 <... sched_getaffinity resumed>  { f }) = 8
11415 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
11415 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
11413 sched_getaffinity(11413, 8,  <unfinished ...>
11415 sched_getaffinity(11415, 8,  <unfinished ...>
11413 <... sched_getaffinity resumed>  { f }) = 8
11415 <... sched_getaffinity resumed>  { f }) = 8
11414 sched_getaffinity(11414, 8,  <unfinished ...>
11414 <... sched_getaffinity resumed>  { f }) = 8
11416 sched_getaffinity(11416, 8,  <unfinished ...>
11416 <... sched_getaffinity resumed>  { f }) = 8

I can confirm that the binding hasn't worked by checking what
plpa-taskset says about a process (for example 11414):

[root@tango027 plpa-taskset]# ./plpa-taskset -cp 11414
pid 11414's current affinity list: 0-3
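
For anyone reading the trace: as far as I understand strace's output, the { f } returned by sched_getaffinity is the affinity mask in hex, so 0xf means cores 0-3, which matches what plpa-taskset reports. A minimal Python sketch of that decoding follows. The helper names are my own and are not part of PLPA or Open MPI.

```python
def mask_to_cpus(mask_hex):
    """Return the sorted CPU indices whose bits are set in a hex affinity mask."""
    mask = int(mask_hex, 16)
    cpus = []
    bit = 0
    while mask:
        if mask & 1:
            cpus.append(bit)
        mask >>= 1
        bit += 1
    return cpus


def cpus_to_list(cpus):
    """Format CPU indices as a taskset-style list such as '0-3,6'."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run of consecutive CPUs
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


print(cpus_to_list(mask_to_cpus("f")))  # the mask seen in the trace
print(cpus_to_list(mask_to_cpus("1")))  # a mask with only core 0 set
```

If each rank had been bound to its own core, I would expect each process to show a single-core mask rather than f.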

According to the manual page:

       EFAULT A supplied memory address was invalid.

This is on a dual-socket, quad-core AMD Shanghai system
running the 2.6.28.9 kernel (I haven't had a chance to
upgrade recently).

Will do some more poking around after lunch.

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
