Hi,

Thank you everyone for the feedback.
As Peter pointed out, these example problems are too small to scale across
nodes. Sorry for the naive thread :)
(An example of the node-balancing option Ralph mentioned is sketched below
the quoted thread.)

On 21 June 2016 at 15:05, Ralph Castain <r...@open-mpi.org> wrote:
>
> I wouldn’t say mpirun is “screwing up” the placement of the ranks, but it
> will default to filling a node before starting to place procs on the next
> node. This is done to optimize performance, as shared memory is faster than
> the inter-node fabric. If you want the procs to instead “balance” across all
> available nodes, then you need to tell mpirun that’s what you want.
>
> Check “mpirun -h” to find the right option to get the desired behavior.
>
> > On Jun 21, 2016, at 6:49 AM, Peter Kjellström <c...@nsc.liu.se> wrote:
> >
> > On Sun, 19 Jun 2016 09:15:34 -0700
> > Achi Hamza <h16m...@gmail.com> wrote:
> >
> >> Hi everyone,
> >>
> >> I set up a lab consisting of 5 simple nodes (1 head and 4 compute
> >> nodes); I used SLURM 15.08.11, Open MPI 1.10.2, MPICH 3.2, FFTW 3.3.4
> >> and LAMMPS 16May2016.
> >
> > Did you use Open MPI _or_ MPICH?
> >
> > Two observations on the behavior below:
> >
> > * A 17-second runtime could indicate a very small problem that doesn't
> >   scale past a few ranks.
> >
> > * Your MPI launch could screw up the placement/pinning of ranks.
> >
> > /Peter K
> >
> >> After successfully installing the above, I ran some tests using the
> >> existing examples of LAMMPS. I got unrealistic results: the execution
> >> time goes up as I increase the number of nodes!
> >>
> >> mpirun -np 4 lmp_openmpi < in.vacf.2d
> >> *Total wall time: 0:00:17*
> >>
> >> mpirun -np 8 lmp_openmpi < in.vacf.2d
> >> *Total wall time: 0:00:23*
> >>
> >> mpirun -np 12 lmp_openmpi < in.vacf.2d
> >> *Total wall time: 0:00:28*
> >>
> >> mpirun -np 16 lmp_openmpi < in.vacf.2d
> >> *Total wall time: 0:00:33*
> >>
> >> Interestingly, *srun* results are worse than mpirun:
> >>
> >> srun --mpi=pmi2 -n 16 lmp_openmpi < in.vacf.2d
> >>
> >> *Total wall time: 0:05:54*
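A minimal sketch of the balancing Ralph described, assuming this Open MPI
1.10 build supports mpirun's --map-by node and --report-bindings options
(check "mpirun -h" as he suggested):

# Round-robin the ranks across the allocated nodes instead of filling each
# node first, and print where every rank is bound so placement can be checked.
mpirun --map-by node --report-bindings -np 16 lmp_openmpi < in.vacf.2d

With --map-by node the ranks are spread one per node in turn rather than
packed onto the first node; --report-bindings only reports the resulting
placement and does not change it. For a problem this small, though, the
inter-node communication cost may still outweigh the extra nodes, so the
wall time may not improve even with balanced placement.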