Received from Andy Riebs on Tue, Sep 23, 2014 at 02:57:49PM EDT:
> On 9/23/2014 2:49 PM, Lev Givon wrote:
> >I have OpenMPI 1.8.2 compiled with PMI support enabled and slurm 2.6.5
> >installed on an 8-CPU machine running Ubuntu 14.04.1. I noticed that
> >attempting to run any program compiled against said OpenMPI installation
> >via srun using
> >
> >srun -n X mpiexec program
> >
> >with X > 1 is effectively equivalent to running
> >
> >mpiexec -np X program
> >
> >X times. Is this behavior expected? Running the program via sbatch only
> >causes 1 run over X MPI processes.
>
>
> Lev, if you drop "mpiexec" from your command line, you should see
> the desired behaviour, i.e.,
>
> $ srun -n X program
Doing so does launch the program only X times, but the communicator size seen
by each instance is 1, e.g., for the proverbial "Hello world" program, the
output

  Hello, world, I am 0 of 1 (myhost)

is generated X times.
Incidentally, I verified that OpenMPI was built against PMI successfully:

  $ ldd /opt/openmpi-1.8.2/bin/mpiexec | grep pmi
          libpmi.so.0 => /usr/lib/libpmi.so.0 (0x00002aed18f66000)
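One thing worth checking (an assumption about your site, since it depends on
how slurm itself is configured): even with PMI compiled in, srun only exports
the PMI environment to the tasks if it is told which PMI flavor to use, either
via MpiDefault in slurm.conf or on the command line, e.g.

```shell
# List the PMI plugins this slurm installation actually provides.
srun --mpi=list

# Ask srun to set up PMI2 so the X tasks join one X-rank MPI job
# instead of X singletons ("program" stands in for your binary).
srun --mpi=pmi2 -n X ./program
```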
> (Also, be sure to recognize the difference between "-n" and "-N"!)
--
Lev Givon
Bionet Group | Neurokernel Project
http://www.columbia.edu/~lev/
http://lebedov.github.io/
http://neurokernel.github.io/