ps -eaf --forest, or indeed pstree, is a good way to see what is going on.
'htop' is also a very useful utility.

It is also well worth running 'lstopo' to look at the layout of cores and
caches on your machines.
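
For example, something along these lines while xhpl is running (just a rough
sketch; the use of pgrep and the htop key binding are assumptions about your
environment rather than anything taken from your output):

  pstree -p $(pgrep -o xhpl)   # tree of the oldest xhpl process, with PIDs and threads
  ps -eaf --forest             # same idea for everything running on the node
  htop                         # per-core utilisation; 'H' toggles userland threads
  lstopo                       # hwloc's view of packages, cores and caches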

On Mon, 3 Aug 2020 at 09:40, John Duffy via users <users@lists.open-mpi.org>
wrote:

> Hi
>
> I’m experimenting with hybrid OpenMPI/OpenMP Linpack benchmarks on my
> small cluster, and I’m a bit confused as to how to invoke mpirun.
>
> I have compiled/linked HPL-2.3 with OpenMPI and libopenblas-openmp using
> the GCC -fopenmp option on Ubuntu 20.04 64-bit.
>
> With P=1 and Q=1 in HPL.dat, if I use…
>
> mpirun -x OMP_NUM_THREADS=4 xhpl
>
> top reports...
>
> top - 08:03:59 up 1 day, 0 min,  1 user,  load average: 2.25, 1.23, 0.88
> Tasks: 138 total,   2 running, 136 sleeping,   0 stopped,   0 zombie
>
> %Cpu(s): 77.1 us, 22.2 sy,  0.0 ni,  0.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> MiB Mem :   3793.3 total,    434.0 free,   2814.1 used,    545.2 buff/cache
> MiB Swap:      0.0 total,      0.0 free,      0.0 used.    919.9 avail Mem
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>
>    5787 john      20   0 2959408   2.6g   8128 R 354.0  69.1   2:10.43 xhpl
>    5789 john      20   0  263352   9960   7440 S  14.2   0.3   0:07.42 xhpl
>    5788 john      20   0  263352   9844   7320 S  13.9   0.3   0:07.19 xhpl
>    5790 john      20   0  263356   9896   7376 S  13.6   0.3   0:07.17 xhpl
>
>
> … which seems reasonable, but I don’t understand why there are 4 xhpl
> processes.
>
>
> In anticipation of adding more nodes, if I use…
>
> mpirun --host node1 --map-by ppr:1:node -x OMP_NUM_THREADS=4 xhpl
>
> top reports...
>
> top - 07:56:27 up 23:52,  1 user,  load average: 1.00, 0.98, 0.68
> Tasks: 133 total,   2 running, 131 sleeping,   0 stopped,   0 zombie
>
> %Cpu(s): 25.1 us,  0.0 sy,  0.0 ni, 74.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
> MiB Mem :   3793.3 total,    454.2 free,   2794.5 used,    544.7 buff/cache
> MiB Swap:      0.0 total,      0.0 free,      0.0 used.    939.9 avail Mem
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>
>    5770 john      20   0 2868700   2.5g   7668 R  99.7  68.7   5:20.37 xhpl
>
>
> … a single xhpl process (as expected), but with only 25% CPU utilisation
> and no other processes running on the other 3 cores. It would appear
> OpenBLAS is not utilising the 4 cores as expected.
>
>
> If I then scale it to 2 nodes, with P=1 and Q=2 in HPL.dat...
>
> mpirun --host node1,node2 --map-by ppr:1:node -x OMP_NUM_THREADS=4 xhpl
>
> … similarly, I get a single process on each node, with only 25% CPU
> utilisation.
>
>
> Any advice/suggestions on how to involve mpirun in a hybrid OpenMPI/OpenMP
> setup would be appreciated.
>
> Kind regards
>
