What version of Slurm?
How many tasks/ranks in your job?
Can you run a non-MPI job of the same size (e.g. srun hostname)?

Quoting Ralph Castain <[email protected]>:
This sounds like something in Slurm - I don’t know how srun would know to emit a message if the app was failing to open a socket between its own procs.

Try starting the OMPI job with “mpirun” instead of srun and see if it has the same issue. If not, then that’s pretty convincing that it’s slurm.
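Ralph's A/B test can be sketched as follows (the binary path and task count are placeholders, not from the original report):

```shell
# Inside the same Slurm allocation, launch the job both ways.
# 1. Via srun with the PMI2 plugin (the failing path):
srun --mpi=pmi2 -n 64 ./osu_alltoall

# 2. Via Open MPI's own launcher, bypassing srun's task launch:
mpirun -np 64 ./osu_alltoall

# If only the srun case hits the socket timeout, that points at Slurm
# rather than the MPI library.
```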


On Sep 21, 2015, at 7:26 PM, Timothy Brown <[email protected]> wrote:


Hi Chris,


On Sep 21, 2015, at 7:36 PM, Christopher Samuel <[email protected]> wrote:


On 22/09/15 07:17, Timothy Brown wrote:

This is using mpiexec.hydra with slurm as the bootstrap.

Have you tried Intel MPI's native PMI start up mode?

You just need to set the environment variable I_MPI_PMI_LIBRARY to the
path to the Slurm libpmi.so file and then you should be able to use srun
to launch your job instead.


Yeap, to the same effect. Here's what it gives:

srun --mpi=pmi2 /lustre/janus_scratch/tibr1099/osu_impi/libexec/osu-micro-benchmarks//mpi/collective/osu_alltoall
srun: error: Task launch for 973564.0 failed on node node0453: Socket timed out on send/recv operation
srun: error: Application launch failed: Socket timed out on send/recv operation



More here:

http://slurm.schedmd.com/mpi_guide.html#intel_srun
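Chris's suggestion amounts to something like this (the libpmi.so path below is an assumption based on the Slurm prefix quoted later in this thread; check where your install actually puts it):

```shell
# Point Intel MPI at Slurm's PMI library so srun can launch it directly.
# ASSUMED path -- adjust to your Slurm installation:
export I_MPI_PMI_LIBRARY=/curc/slurm/slurm/current/lib/libpmi.so

# Then launch with srun instead of mpiexec.hydra:
srun -n 64 ./osu_alltoall
```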

If I switch to OpenMPI the error is:

Which version, and was it built with --with-slurm and (if your
version is not too ancient) --with-pmi=/path/to/slurm/install ?

Yeap. 1.8.5 (for 1.10 we're going to try and move everything to EasyBuild). Yes, we included PMI and the Slurm option. Our configure statement was:

module purge
module load slurm/slurm
module load gcc/5.1.0
./configure  \
 --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/gcc/5.1.0 \
 --with-threads=posix \
 --enable-mpi-thread-multiple \
 --with-slurm \
 --with-pmi=/curc/slurm/slurm/current/ \
 --enable-static \
 --enable-wrapper-rpath \
 --enable-sensors \
 --enable-mpi-ext=all \
 --with-verbs

It's got me scratching my head. I started off thinking it was an MPI issue and spent a while getting Intel's Hydra and Open MPI's oob to go over IB instead of gig-E. This increased the success rate, but we were still failing.

Tried out a pure PMI (version 1) code (init, rank, size, fini), which worked most of the time. That made me think it was MPI again! However, it fails often enough to say it's not MPI. The PMI v2 code I wrote gives the wrong results for rank and world size, so I'm sweeping that under the rug until I understand it!

Just wondering if anybody has seen anything like this. Am happy to share our conf file if that helps.

The only other thing I could possibly point a finger at (but don't believe is the cause) is that the Slurm masters (slurmctld) are only on gig-E.

I'm half thinking of opening a TT, but was hoping to gather more information first (increasing Slurm's logging is my only other idea).

Thanks for your thoughts Chris.

Timothy


--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support
