Hi Ralph,

> On Sep 21, 2015, at 8:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> This sounds like something in Slurm - I don’t know how srun would know to
> emit a message if the app was failing to open a socket between its own procs.
>
> Try starting the OMPI job with “mpirun” instead of srun and see if it has the
> same issue. If not, then that’s pretty convincing that it’s slurm.
Yes, I did this earlier and got multiple errors along the lines of:

mpiexec -n 6144 --mca oob_tcp_if_include ib0 /lustre/janus_scratch/tibr1099/osu/libexec/osu-micro-benchmarks/mpi/collective/osu_alltoall

[node0229:01054] [[33451,0],0]->[[33451,0],32] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 179]
[node0229:01054] [[33451,0],0]-[[33451,0],32] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],2] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 199]
[node0229:01054] [[33451,0],0]-[[33451,0],2] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],128] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 289]
[node0229:01054] [[33451,0],0]-[[33451,0],128] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],16] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 389]
[node0229:01054] [[33451,0],0]-[[33451,0],16] mca_oob_tcp_peer_send_handler: unable to send header
slurmstepd: error: *** JOB 973754 CANCELLED AT 2015-09-19T22:47:58 DUE TO TIME LIMIT on node0229 ***

Regards,

Timothy

>
>> On Sep 21, 2015, at 7:26 PM, Timothy Brown <timothy.brow...@colorado.edu> wrote:
>>
>> Hi Chris,
>>
>>> On Sep 21, 2015, at 7:36 PM, Christopher Samuel <sam...@unimelb.edu.au> wrote:
>>>
>>> On 22/09/15 07:17, Timothy Brown wrote:
>>>
>>>> This is using mpiexec.hydra with slurm as the bootstrap.
>>>
>>> Have you tried Intel MPI's native PMI start up mode?
>>>
>>> You just need to set the environment variable I_MPI_PMI_LIBRARY to the
>>> path to the Slurm libpmi.so file and then you should be able to use srun
>>> to launch your job instead.
>>
>> Yeap, to the same effect. Here's what it gives:
>>
>> srun --mpi=pmi2 /lustre/janus_scratch/tibr1099/osu_impi/libexec/osu-micro-benchmarks//mpi/collective/osu_alltoall
>> srun: error: Task launch for 973564.0 failed on node node0453: Socket timed out on send/recv operation
>> srun: error: Application launch failed: Socket timed out on send/recv operation
>>
>>> More here:
>>>
>>> http://slurm.schedmd.com/mpi_guide.html#intel_srun
>>>
>>>> If I switch to OpenMPI the error is:
>>>
>>> Which version, and was it built with --with-slurm and (if your
>>> version is not too ancient) --with-pmi=/path/to/slurm/install ?
>>
>> Yeap. 1.8.5 (for 1.10 we're going to try and move everything to EasyBuild).
>> Yes, we included PMI and the Slurm option. Our configure statement was:
>>
>> module purge
>> module load slurm/slurm
>> module load gcc/5.1.0
>> ./configure \
>>   --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/gcc/5.1.0 \
>>   --with-threads=posix \
>>   --enable-mpi-thread-multiple \
>>   --with-slurm \
>>   --with-pmi=/curc/slurm/slurm/current/ \
>>   --enable-static \
>>   --enable-wrapper-rpath \
>>   --enable-sensors \
>>   --enable-mpi-ext=all \
>>   --with-verbs
>>
>> It's got me scratching my head, as I started off thinking it was an MPI
>> issue, and spent a while getting Intel's hydra and OpenMPI's oob to go over IB
>> instead of gig-E. This increased the success rate, but we were still failing.
>>
>> Tried out a pure PMI (version 1) code (init, rank, size, fini), which worked
>> a lot of the time. That made me think it was MPI again! However, it still fails
>> often enough to say it's not MPI. The PMI v2 code I wrote gives the wrong
>> results for rank and world size, so I'm sweeping that under the rug until I
>> understand it!
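
(For reference, a minimal PMI v1 check of the kind described above looks roughly like the sketch below. This is illustrative rather than the exact test code; it assumes Slurm's pmi.h header and libpmi from the --with-pmi prefix shown in the configure line.)

/* Minimal PMI v1 sanity check: init, rank, size, finalize. */
#include <stdio.h>
#include <stdlib.h>
#include <slurm/pmi.h>   /* Slurm installs the PMI v1 header as slurm/pmi.h */

int main(void)
{
    int spawned = 0, rank = -1, size = -1;

    if (PMI_Init(&spawned) != PMI_SUCCESS) {
        fprintf(stderr, "PMI_Init failed\n");
        return EXIT_FAILURE;
    }
    if (PMI_Get_rank(&rank) != PMI_SUCCESS ||
        PMI_Get_size(&size) != PMI_SUCCESS) {
        fprintf(stderr, "PMI_Get_rank/PMI_Get_size failed\n");
        PMI_Finalize();
        return EXIT_FAILURE;
    }
    printf("rank %d of %d (spawned=%d)\n", rank, size, spawned);
    PMI_Finalize();
    return EXIT_SUCCESS;
}

Built with something along the lines of "gcc pmi_check.c -I/curc/slurm/slurm/current/include -L/curc/slurm/slurm/current/lib -lpmi" and launched with "srun -n <N> ./pmi_check"; the file name and exact paths are placeholders.
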
>>
>> Just wondering if anybody has seen anything like this. Am happy to share our
>> conf file if that helps.
>>
>> The only other thing I could possibly point a finger at (but don't believe
>> it is) is that the Slurm masters (slurmctld) are only on gig-E.
>>
>> I'm half thinking of opening a TT, but was hoping to get more information
>> (and possibly not increase the logging of Slurm, which is my only next idea).
>>
>> Thanks for your thoughts, Chris.
>>
>> Timothy