I wouldn’t say mpirun is “screwing up” the placement of the ranks, but it will
default to filling a node before starting to place procs on the next node. This
is done to optimize performance, as shared memory is faster than the inter-node
fabric. If you want the procs to instead “balance” across all available nodes,
then you need to tell mpirun that’s what you want.

Check “mpirun -h” to find the right option to get the desired behavior.
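
For example, with Open MPI’s mpirun something along these lines should place
the ranks round-robin across the nodes instead of filling each node first (a
sketch only; option names differ between MPI implementations and versions, so
confirm against your own “mpirun -h” or man page):

  # round-robin the ranks across nodes rather than filling each node first
  mpirun --map-by node -np 16 lmp_openmpi < in.vacf.2d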

> On Jun 21, 2016, at 6:49 AM, Peter Kjellström <c...@nsc.liu.se> wrote:
> 
> 
> On Sun, 19 Jun 2016 09:15:34 -0700
> Achi Hamza <h16m...@gmail.com> wrote:
> 
>> Hi everyone,
>> 
>> I set up a lab consisting of 5 simple nodes (1 head and 4 compute nodes);
>> I used SLURM 15.08.11, Open MPI 1.10.2, MPICH 3.2, FFTW 3.3.4 and
>> LAMMPS 16May2016.
> 
> Did you use OpenMPI _or_ MPICH?
> 
> Two observations on the below behavior:
> 
> * a 17-second runtime could indicate a very small problem that doesn't
>  scale past a few ranks.
> 
> * Your MPI launch could screw up the placement/pinning of ranks
> 
> /Peter K
> 
>> After successfully installing the above I ran some tests
>> using the examples shipped with LAMMPS. I got unrealistic results: the
>> execution time goes up as I increase the number of nodes!
>> 
>> mpirun -np 4 lmp_openmpi < in.vacf.2d
>> *Total wall time: 0:00:17*
>> 
>> mpirun -np 8 lmp_openmpi < in.vacf.2d
>> *Total wall time: 0:00:23*
>> 
>> mpirun -np 12 lmp_openmpi < in.vacf.2d
>> *Total wall time: 0:00:28*
>> 
>> mpirun -np 16 lmp_openmpi < in.vacf.2d
>> *Total wall time: 0:00:33*
>> 
>> 
>> Interestingly, the *srun* results are worse than with mpirun:
>> 
>> srun --mpi=pmi2 -n 16 lmp_openmpi < in.vacf.2d
>> 
>> *Total wall time: 0:05:54*
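
Regarding the placement/pinning observation above: Open MPI’s mpirun can also
report where each rank actually lands, which makes it easy to spot oversubscribed
cores or all the ranks piling onto one node. A minimal sketch, assuming the
Open MPI 1.10-era option name:

  # print the node/socket/core binding chosen for each rank
  mpirun --report-bindings -np 16 lmp_openmpi < in.vacf.2d

If several ranks turn out to share a core, or they all end up on the same node,
that alone could explain timings like the ones above.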
