Interesting. No, we don't take PLPA cpu sets into account when retrieving the allocation.

Just to be clear: from an OMPI perspective, I don't think this is an issue of binding, but rather an issue of allocation. If we knew we had been allocated only a certain number of cores on a node, then we would only map that many procs to the node. When we subsequently "bind", we should then bind those procs to the correct cores (I think).
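To make the allocation-versus-binding distinction concrete, here is a minimal sketch in Python. It is not OMPI's mapper or PLPA code; `plan_bindings` and its arguments are hypothetical names used only for illustration. It caps the number of local procs at the number of allocated cores, then assigns each local rank to one of those cores.

```python
import os


def plan_bindings(allowed_cores, requested_procs):
    """Map local ranks to cores, using only the cores we were allocated.

    Hypothetical helper for illustration; not part of Open MPI or PLPA.
    """
    cores = sorted(allowed_cores)
    # Allocation step: never map more procs than allocated cores.
    nprocs = min(requested_procs, len(cores))
    # Binding step: local rank i goes to the i-th allowed core,
    # which is not necessarily core number i.
    return {rank: cores[rank] for rank in range(nprocs)}


if __name__ == "__main__":
    # Linux only: the cores this process may run on (reflects its cpuset).
    allowed = os.sched_getaffinity(0)
    print(plan_bindings(allowed, requested_procs=len(allowed)))
```

With a cpuset of cores 2, 5 and 7 and four requested procs, this plans three procs bound to cores 2, 5 and 7 rather than to cores 0 through 3.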

Could you check this? You can run a trivial job using the -npernode x option, where x matches the number of cores you were allocated on each node.

If you do this, do we bind to the correct cores?
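One possible way to run that check is a small script that each process executes and that reports the cores it is allowed to run on. This is only a sketch: the file name `report_affinity.py` and the helper `format_report` are made up for illustration, it assumes Linux with Python 3, and it assumes the launcher exports `OMPI_COMM_WORLD_RANK` to each process (it falls back to "?" if not).

```python
import os
import socket


def format_report(rank, host, cpus):
    """Build a one-line summary of the cores a process may run on."""
    cores = ",".join(str(c) for c in sorted(cpus))
    return f"rank {rank} on {host}: allowed cores {cores}"


if __name__ == "__main__":
    # Assumption: the launcher sets OMPI_COMM_WORLD_RANK for each process.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")
    # Linux only: affinity mask of the calling process.
    print(format_report(rank, socket.gethostname(), os.sched_getaffinity(0)))
```

Launched as, for example, `mpirun -npernode 4 python report_affinity.py` inside the Torque job with binding enabled, each rank on a node should report a single core, all of them distinct and all inside the job's cpuset.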

If we do, then that would confirm that we just aren't picking up the right number of cores allocated to us. If we don't, then this is a PLPA issue where it isn't binding to the right core.

Thanks
Ralph

On Jul 15, 2009, at 12:28 AM, Chris Samuel wrote:

Hi all,

Not sure if this is an OpenMPI query or a PLPA query,
but given that PLPA seems to have some support for it
already I thought I'd start here. :-)

We run a quad core Opteron cluster with Torque 2.3.x
which uses the kernel's cpuset support to constrain
a job to just the cores it has been allocated.

However, we occasionally see that when a job has been
allocated multiple cores on the same node, two of its
compute-bound MPI processes get scheduled onto the
same core (obviously a kernel issue).

So CPU affinity would be an obvious solution, but it
needs to be done with reference to the cores that are
available to it in its cpuset.

This information is already retrievable by PLPA (for
instance "plpa-taskset -cp $$" will retrieve the cores
allocated to the shell you run the command from) but
I'm not sure if OpenMPI makes use of this when binding
CPUs using the linux paffinity MCA parameter?
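As an illustration of what "the cores available in the cpuset" means in practice, the sketch below reads the allowed-core list for the current process without PLPA. It assumes a Linux kernel that exposes a `Cpus_allowed_list:` line in /proc/self/status; `parse_cpu_list` and `allowed_cores` are hypothetical helper names, not PLPA or OpenMPI functions.

```python
def parse_cpu_list(text):
    """Turn a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cores = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or stray comma
        if "-" in part:
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return cores


def allowed_cores(status_path="/proc/self/status"):
    """Return the cores this process may run on, or an empty set if unknown."""
    with open(status_path) as fh:
        for line in fh:
            # Assumption: the kernel reports this field for the process.
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    return set()
```

Any binding done inside the job would need to pick its target cores from this set rather than counting up from core 0.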

Our testing (with 1.3.2) seems to show it doesn't, and
I don't think there are any significant differences with
the snapshots in 1.4.

Am I correct in this? If so, are there any plans to
make it do this?

cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
