Hi Ralph,

> On Sep 21, 2015, at 8:36 PM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> 
> This sounds like something in Slurm - I don’t know how srun would know to 
> emit a message if the app was failing to open a socket between its own procs.
> 
> Try starting the OMPI job with “mpirun” instead of srun and see if it has the 
> same issue. If not, then that’s pretty convincing that it’s slurm.

Yes, I did this earlier and got multiple errors along the lines of:

mpiexec -n 6144 --mca oob_tcp_if_include ib0 /lustre/janus_scratch/tibr1099/osu/libexec/osu-micro-benchmarks/mpi/collective/osu_alltoall
[node0229:01054] [[33451,0],0]->[[33451,0],32] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 179]
[node0229:01054] [[33451,0],0]-[[33451,0],32] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],2] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 199]
[node0229:01054] [[33451,0],0]-[[33451,0],2] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],128] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 289]
[node0229:01054] [[33451,0],0]-[[33451,0],128] mca_oob_tcp_peer_send_handler: unable to send header
[node0229:01054] [[33451,0],0]->[[33451,0],16] mca_oob_tcp_msg_send_bytes: write failed: Broken pipe (32) [sd = 389]
[node0229:01054] [[33451,0],0]-[[33451,0],16] mca_oob_tcp_peer_send_handler: unable to send header
slurmstepd: error: *** JOB 973754 CANCELLED AT 2015-09-19T22:47:58 DUE TO TIME LIMIT on node0229 ***
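
If it helps anyone reproduce this, the next thing I'd try is the same run with the OOB framework's verbosity turned up; the verbosity level below is just a guess at something useful, not a setting we've confirmed:

mpiexec -n 6144 --mca oob_tcp_if_include ib0 --mca oob_base_verbose 10 \
    /lustre/janus_scratch/tibr1099/osu/libexec/osu-micro-benchmarks/mpi/collective/osu_alltoall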


Regards,
Timothy
 
> 
> 
>> On Sep 21, 2015, at 7:26 PM, Timothy Brown <timothy.brow...@colorado.edu> wrote:
>> 
>> 
>> Hi Chris,
>> 
>> 
>>> On Sep 21, 2015, at 7:36 PM, Christopher Samuel <sam...@unimelb.edu.au> wrote:
>>> 
>>> 
>>> On 22/09/15 07:17, Timothy Brown wrote:
>>> 
>>>> This is using mpiexec.hydra with slurm as the bootstrap. 
>>> 
>>> Have you tried Intel MPI's native PMI start up mode?
>>> 
>>> You just need to set the environment variable I_MPI_PMI_LIBRARY to the
>>> path to the Slurm libpmi.so file and then you should be able to use srun
>>> to launch your job instead.
>>> 
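(For anyone following along, the setup Chris describes amounts to something like the two lines below; both paths are placeholders based on our tree, so adjust for your site:)

export I_MPI_PMI_LIBRARY=/curc/slurm/slurm/current/lib/libpmi.so   # assumed location of Slurm's PMI library
srun -n 6144 ./osu_alltoall                                        # Intel MPI binary launched directly by srun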
>> 
>> Yeap, to the same effect. Here's what it gives:
>> 
>> srun --mpi=pmi2 /lustre/janus_scratch/tibr1099/osu_impi/libexec/osu-micro-benchmarks//mpi/collective/osu_alltoall
>> srun: error: Task launch for 973564.0 failed on node node0453: Socket timed out on send/recv operation
>> srun: error: Application launch failed: Socket timed out on send/recv operation
>> 
>> 
>> 
>>> More here:
>>> 
>>> http://slurm.schedmd.com/mpi_guide.html#intel_srun
>>> 
>>>> If I switch to OpenMPI the error is:
>>> 
>>> Which version, and was it built with --with-slurm and (if your
>>> version is not too ancient) --with-pmi=/path/to/slurm/install?
>> 
>> Yeap. 1.8.5 (for 1.10 we're going to try to move everything to EasyBuild). 
>> Yes, we included PMI and the Slurm option. Our configure statement was:
>> 
>> module purge
>> module load slurm/slurm
>> module load gcc/5.1.0
>> ./configure  \
>> --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/gcc/5.1.0 \
>> --with-threads=posix \
>> --enable-mpi-thread-multiple \
>> --with-slurm \
>> --with-pmi=/curc/slurm/slurm/current/ \
>> --enable-static \
>> --enable-wrapper-rpath \
>> --enable-sensors \
>> --enable-mpi-ext=all \
>> --with-verbs
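
(A quick sanity check that PMI support actually made it into that build, and that this Slurm offers pmi2 to srun; this assumes ompi_info from that install is on the PATH and isn't output we've captured yet:)

ompi_info | grep -i pmi   # should list pmi-aware components if --with-pmi took effect
srun --mpi=list           # shows which MPI plugin types this Slurm was built with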
>> 
>> It's got me scratching my head. I started off thinking it was an MPI 
>> issue and spent a while getting Intel's hydra and OpenMPI's oob to go over 
>> IB instead of gig-E. This increased the success rate, but we were still failing.
>> 
>> I tried out a pure PMI (version 1) code (init, rank, size, fini), which worked 
>> most of the time. That made me think it was MPI again! However, it still fails 
>> often enough to say it's not MPI. The PMI v2 code I wrote gives the wrong 
>> results for rank and world size, so I'm sweeping that under the rug until I 
>> understand it!
>> 
>> Just wondering if anybody has seen anything like this. I'm happy to share our 
>> conf file if that helps.
>> 
>> The only other thing I could possibly point a finger at (though I don't 
>> believe it's the cause) is that the Slurm masters (slurmctld) are only on gig-E.
>> 
>> I'm half thinking of opening a TT, but was hoping to get more information 
>> first (and possibly avoid having to increase Slurm's logging, which is my only 
>> other idea).
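
(For reference, if it does come to raising the Slurm logging, a sketch of the on-the-fly route rather than editing slurm.conf and restarting; this assumes admin access to scontrol on the controller and hasn't been run here:)

scontrol setdebug debug2        # raise slurmctld's log level without a restart
scontrol setdebugflags +Steps   # extra detail around step launch, if this Slurm version supports DebugFlags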
>> 
>> Thanks for your thoughts Chris.
>> 
>> Timothy
