Hi - for the last couple of weeks, more or less since we did some kernel 
updates, certain compute-intensive MPI jobs have been behaving oddly with 
respect to speed: parts that should be quite fast sometimes (but not 
consistently) take a long time, and re-running sometimes fixes the issue and 
sometimes doesn't.  I'm starting to suspect core binding problems, which I 
worry will be difficult to debug, so I was hoping to get some feedback on 
whether my observations do indeed suggest that something is wrong with the 
core binding.

I'm running the latest CentOS 6 kernel (2.6.32-696.30.1.el6.x86_64) and OpenMPI 
3.1.0 on dual-CPU nodes with 8-core + HT Intel Xeons (16 physical cores, 32 
hardware threads per node).  The code is compiled with ifort using 
"-mkl=sequential", and just to be certain OMP_NUM_THREADS=1, so there should be 
no OpenMP parallelism.

The main question: if I'm running 16 MPI tasks per node and look at the PSR 
field from ps, should I see some simple sequence of numbers, one per task?

Here's the beginning of the per-core binding report I requested from mpirun 
(--bind-to core):
[compute-7-2:31036] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: 
[BB/../../../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 1 bound to socket 1[core 8[hwt 0-1]]: 
[../../../../../../../..][BB/../../../../../../..]
[compute-7-2:31036] MCW rank 2 bound to socket 0[core 1[hwt 0-1]]: 
[../BB/../../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 3 bound to socket 1[core 9[hwt 0-1]]: 
[../../../../../../../..][../BB/../../../../../..]
[compute-7-2:31036] MCW rank 4 bound to socket 0[core 2[hwt 0-1]]: 
[../../BB/../../../../..][../../../../../../../..]
[compute-7-2:31036] MCW rank 5 bound to socket 1[core 10[hwt 0-1]]: 
[../../../../../../../..][../../BB/../../../../..]
[compute-7-2:31036] MCW rank 6 bound to socket 0[core 3[hwt 0-1]]: 
[../../../BB/../../../..][../../../../../../../..]

This is the PSR info from ps:
  PID PSR TTY          TIME CMD
31043   1 ?        00:00:34 vasp.para.intel
31045   2 ?        00:00:34 vasp.para.intel
31047   3 ?        00:00:34 vasp.para.intel
31049   4 ?        00:00:34 vasp.para.intel
31051   5 ?        00:00:34 vasp.para.intel
31055   7 ?        00:00:34 vasp.para.intel
31042   8 ?        00:00:34 vasp.para.intel
31046  10 ?        00:00:34 vasp.para.intel
31048  11 ?        00:00:34 vasp.para.intel
31052  13 ?        00:00:34 vasp.para.intel
31054  14 ?        00:00:34 vasp.para.intel
31053  22 ?        00:00:34 vasp.para.intel
31044  25 ?        00:00:34 vasp.para.intel
31050  28 ?        00:00:34 vasp.para.intel
31056  31 ?        00:00:34 vasp.para.intel

Does this output look reasonable?  For any sensible way I can think of to 
enumerate the 32 virtual cores, those PSR values don't seem to correspond to 
one MPI task per core.  If PSR isn't supposed to give meaningful output given 
how Open MPI does its binding, is there another tool that can tell me which 
cores a running job is actually running on / bound to?
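For concreteness, these are the sorts of checks I had in mind, assuming they 
report the actual binding mask rather than just the CPU a task last ran on:

  taskset -pc <pid>                          # affinity list of the process
  grep Cpus_allowed_list /proc/<pid>/status  # same information via /proc
  hwloc-ps                                   # hwloc's view of bound processes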

An additional bit of confusion is that "ps -mo pid,tid,fname,user,psr -p PID" 
on one of those processes (which is supposed to be running without any threaded 
parallelism) reports 3 separate TIDs (which I think correspond to threads) with 
3 different PSR values, which seem stable during the run but don't have any 
obvious relation to one another (not P and P+1, or P and P+8, or P and P+16).
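In case it's useful, this is roughly how I was planning to check the per-thread 
affinity next (assuming the TIDs ps reports are the same ones that appear under 
/proc/<pid>/task):

  for t in /proc/<pid>/task/*/status; do
      grep -H Cpus_allowed_list $t   # one line per thread, TID in the path
  done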


                                                                                
                thanks,
                Noam