On 10/06/16 00:35, Jason Bacon wrote:

> I can imagine this issue going unnoticed most of the time, because it
> will only cause a problem when an OMPI job shares a node with another
> job using core binding, which is infrequent on our clusters.

To reiterate:

You *really* want to be using Linux cgroups in Slurm, then. That will
prevent this from happening, because processes will only ever be able
to bind to the cores they've been allocated.

IIRC Open MPI uses hwloc, so it will only see the cores that are in its
cgroup and will just try to use those; it'll ignore anything else.
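
If you want to see which cores a process is actually allowed to use,
here is a minimal sketch. It is not how Open MPI or hwloc do it
internally; it assumes Linux and Python 3, and the helper name
allowed_cores is just for illustration. As far as I know, inside a job
confined by a cpuset cgroup the affinity mask it reports should be
limited to the cores in that cgroup.

```python
import os


def allowed_cores():
    """Return the sorted CPU ids this process may run on (Linux only)."""
    # pid 0 means "the calling process"; the result is a set of CPU ids.
    return sorted(os.sched_getaffinity(0))


if __name__ == "__main__":
    cores = allowed_cores()
    print(f"{len(cores)} usable core(s): {cores}")
```

Running it once inside a job step and once in a login shell on the same
node is a quick way to compare the two views.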

All the best,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: [email protected] Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
