Hi - for the last couple of weeks, more or less since we did some kernel
updates, certain compute-intensive MPI jobs have been behaving oddly
speed-wise: parts of the calculation that should be quite fast sometimes (but
not consistently) take a long time, and re-running sometimes fixes the issue,
sometimes not. I'm starting to suspect core binding problems, which I worry
will be difficult to debug, so I'd like some feedback on whether my
observations really do suggest that something is wrong with the core binding.
I'm running the latest CentOS 6 kernel (2.6.32-696.30.1.el6.x86_64), OpenMPI
3.1.0, on a dual-CPU, 8-core + HT Intel Xeon node. The code is compiled with
ifort using "-mkl=sequential", and just to be certain OMP_NUM_THREADS=1, so
there should be no OpenMP parallelism.
The main question is: if I'm running 16 MPI tasks per node and look at the PSR
field from ps, should I see some simple sequence of numbers?
Here's the beginning of the per-core binding report I requested from mpirun
(--bind-to core):
[compute-7-2:31036] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]:
[BB/../../../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 1 bound to socket 1[core 8[hwt 0-1]]:
[../../../../../../../..][BB/../../../../../../..]
[compute-7-2:31036] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]:
[../BB/../../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 3 bound to socket 1[core 9[hwt 0-1]]:
[../../../../../../../..][../BB/../../../../../..]
[compute-7-2:31036] MCW rank 4 bound to socket 0[core 2[hwt 0-1]]:
[../../BB/../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 5 bound to socket 1[core 10[hwt 0-1]]:
[../../../../../../../..][../../BB/../../../../..]
[compute-7-2:31036] MCW rank 6 bound to socket 0[core 3[hwt 0-1]]:
[../../../BB/../../../..][../../../../../../../..]
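(For what it's worth, under the enumeration I'd naively assume for a box like
this - socket 0 cores as logical CPUs 0-7 with HT siblings 16-23, socket 1
cores as CPUs 8-15 with siblings 24-31 - the report above would correspond to
roughly

  rank 0 -> CPUs {0,16}, rank 1 -> {8,24}, rank 2 -> {1,17}, rank 3 -> {9,25}, ...

i.e. each rank pinned to one physical core plus its HT sibling. But I don't
know whether that's actually how the kernel numbers the hardware threads on
this node.)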
This is the PSR info from ps:
PID PSR TTY TIME CMD
31043 1 ? 00:00:34 vasp.para.intel
31045 2 ? 00:00:34 vasp.para.intel
31047 3 ? 00:00:34 vasp.para.intel
31049 4 ? 00:00:34 vasp.para.intel
31051 5 ? 00:00:34 vasp.para.intel
31055 7 ? 00:00:34 vasp.para.intel
31042 8 ? 00:00:34 vasp.para.intel
31046 10 ? 00:00:34 vasp.para.intel
31048 11 ? 00:00:34 vasp.para.intel
31052 13 ? 00:00:34 vasp.para.intel
31054 14 ? 00:00:34 vasp.para.intel
31053 22 ? 00:00:34 vasp.para.intel
31044 25 ? 00:00:34 vasp.para.intel
31050 28 ? 00:00:34 vasp.para.intel
31056 31 ? 00:00:34 vasp.para.intel
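(The ps invocation for the above is roughly

  ps -C vasp.para.intel -o pid,psr,tty,time,comm   # from memory; exact options may differ slightly

- I'm reconstructing the option string from memory.)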
Does this output look reasonable? For any sensible way I can think of to
enumerate the 32 virtual cores, those PSR values don't seem to correspond to
one MPI task per core. If ps isn't supposed to give meaningful output here,
given how Open MPI does its binding, is there another tool that can tell me
which cores a running job is actually running on/bound to?
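The only other things I've thought of looking at are the kernel's own topology
numbering and the affinity masks directly, something like

  # how the kernel pairs up HT siblings on this node
  cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
  # affinity list of one of the vasp processes (<PID> = one of the PIDs above)
  taskset -cp <PID>

but I'm not sure whether that's the right way to see what Open MPI actually
set.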
An additional bit of confusion is that "ps -mo pid,tid,fname,user,psr -p PID"
on one of those processes (which is supposed to be running without any
threaded parallelism) reports three separate TIDs (which I think correspond to
threads), with three different PSR values that appear stable during the run
but don't have any obvious relationship to one another (not P and P+1, or P
and P+8, or P and P+16).
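I haven't checked the per-thread affinity masks directly yet, but I'm assuming
something like

  # per-thread affinity for one of the vasp processes (PID is a placeholder)
  grep Cpus_allowed_list /proc/PID/task/*/status

would show whether those extra threads are bound anywhere sensible; I can post
that output if it would help.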
thanks,
Noam