Hi all,

Not sure if this is an Open MPI query or a PLPA query,
but given that PLPA seems to have some support for it
already I thought I'd start here. :-)

We run a quad core Opteron cluster with Torque 2.3.x
which uses the kernel's cpuset support to constrain
a job to just the cores it has been allocated.

However, we occasionally see that when a job has been
allocated multiple cores on the same node, two
compute-bound MPI processes in the job get scheduled
onto the same core (obviously a kernel issue).

So CPU affinity would be an obvious solution, but it
needs to be done with reference to the cores that are
available to the job in its cpuset.
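
To illustrate the idea, here is a rough sketch in Python
(not what Open MPI or PLPA actually do, just the logic I
have in mind). It assumes Linux, where the affinity mask
returned by the kernel is already limited to the cores in
the cpuset. The names pick_core and bind_to_allowed_core
are made up for this example, and the node-local rank is
assumed to be supplied by the launcher.

```python
import os


def pick_core(allowed, local_rank):
    """Pick one core ID from the allowed set for a node-local rank.

    Sorting makes the choice deterministic; the modulo wraps around
    if there are more local ranks than allowed cores.
    """
    cores = sorted(allowed)
    if not cores:
        raise ValueError("no cores available in the affinity mask")
    return cores[local_rank % len(cores)]


def bind_to_allowed_core(local_rank, pid=0):
    """Bind process `pid` (0 = the caller) to one of its allowed cores.

    os.sched_getaffinity() only reports cores the kernel permits for
    the process, so inside a cpuset the chosen core is always one of
    the cores allocated to the job.
    """
    allowed = os.sched_getaffinity(pid)  # Linux only
    core = pick_core(allowed, local_rank)
    os.sched_setaffinity(pid, {core})
    return core
```

With a cpuset of cores 4-7, local ranks 0 to 3 would end up
on cores 4, 5, 6 and 7 respectively, rather than on cores
0-3 which the job is not allowed to use. Where the local
rank comes from is left open here; an environment variable
set by the launcher would be one option, if available.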

This information is already retrievable by PLPA (for
instance "plpa-taskset -cp $$" will retrieve the cores
allocated to the shell you run the command from) but
I'm not sure if Open MPI makes use of this when binding
CPUs using the Linux paffinity MCA parameter?

Our testing (with 1.3.2) seems to show it doesn't, and
I don't think there are any significant differences with
the snapshots in 1.4.

Am I correct in this?  If so, are there any plans to
make it do this?

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
