Hi Dennis, 

as a matter of fact, suspend does work when I run my app with srun.
There is another problem with it, however: under srun this code
runs an order of magnitude slower. I wanted to address the mpirun
and suspend issue first and then deal with the srun slowness.

Thanks,
Eugene.



> On Jul 11, 2017, at 3:52 AM, Dennis Tants <[email protected]> 
> wrote:
> 
> 
> Hello Eugene,
> 
> it is just a wild guess, but could you try "srun --mpi=pmi2" (you said
> you built OMPI with PMI support) instead of "mpirun"?
> srun is built-in and, I think, the preferred way of running parallel
> processes. Maybe scontrol is able to suspend it this way.
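A minimal sketch of the suggestion above, reusing the partition, node count, and xhpl binary from the batch script in the original post (the `./xhpl` path is an assumption):

```shell
#!/bin/bash
#SBATCH --partition=standard
#SBATCH -N 10
#SBATCH --ntasks-per-node=16

# srun launches the MPI ranks directly under slurmstepd, so slurm
# knows every task PID and "scontrol suspend" can signal all of them;
# --mpi=pmi2 lets the OMPI ranks wire up through slurm's PMI2 plugin.
srun --mpi=pmi2 ./xhpl | tee LOG
```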
> 
> Regards,
> Dennis
> 
> On 10.07.2017 at 22:20, Eugene Dedits wrote:
>> Hello SLURM-DEV
>> 
>> 
>> I have a problem with slurm, openmpi, and “scontrol suspend”. 
>> 
>> My setup is:
>> 96-node cluster with IB, running RHEL 6.8
>> slurm 17.02.1
>> openmpi 2.0.0 (built using Intel 2016 compiler)
>> 
>> 
>> I am running some application (hpl in this particular case) using batch 
>> script similar to:
>> -----------------------------
>> #!/bin/bash
>> #SBATCH --partition=standard
>> #SBATCH -N 10
>> #SBATCH --ntasks-per-node=16
>> 
>> mpirun -np 160 xhpl | tee LOG
>> -----------------------------
>> 
>> So I am running it on 160 cores across 10 nodes. 
>> 
>> Once the job is submitted to the queue and running, I suspend it with
>> ~# scontrol suspend JOBID
>> 
>> I see that my job has indeed stopped producing output. I then go to each of the 10
>> nodes assigned to my job and check whether the xhpl processes are still running
>> there with:
>> 
>> ~# for i in {10..19}; do ssh node$i "top -b -n 1 | head -n 50 | grep xhpl | wc -l"; done
>> 
>> I expect this little script to return 0 from every node (suspend sends
>> SIGSTOP, so the stopped processes should drop out of top's listing).
>> However, I see that the processes are reliably suspended only on node10. I get:
>> 0
>> 16
>> 16
>> …
>> 16
>> 
>> So 9 out of 10 nodes still have 16 MPI processes of my xhpl application 
>> running at 100%. 
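One caveat with the top-based check: a SIGSTOP'd process does not leave the process table, it just stops accruing CPU, so it only falls out of the first 50 lines of top's CPU-sorted listing. Reading the process state directly is less ambiguous. A minimal local demonstration, using sleep as a stand-in for xhpl:

```shell
# Start a throwaway process, stop it, and read its state from ps.
sleep 60 &
pid=$!
kill -STOP "$pid"
state=$(ps -o stat= -p "$pid")
echo "state after SIGSTOP: $state"   # begins with T (stopped)
kill -CONT "$pid"
kill "$pid" 2>/dev/null
```

On the cluster, the same idea could replace the top pipeline, e.g. counting stopped xhpl processes per node with something like `ssh node$i 'ps -C xhpl -o stat= | grep -c "^T"'` (a hypothetical variant, untested on the original setup).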
>> 
>> If I run “scontrol resume JOBID” and then suspend it again, I see that
>> (sometimes) more nodes have the “xhpl” processes properly suspended. Every
>> time I resume and suspend the job, different nodes return 0 in my
>> “ssh-run-top” script. 
>> 
>> Altogether, it looks like the suspend mechanism doesn’t work properly in
>> SLURM with OpenMPI. I’ve tried compiling OpenMPI with “--with-slurm
>> --with-pmi=/path/to/my/slurm” and observed the same behavior. 
>> 
>> I would appreciate any help.   
>> 
>> 
>> Thanks,
>> Eugene. 
>> 
>> 
>> 
>> 
> 
> -- 
> Dennis Tants
> Apprentice: IT Specialist for System Integration (Fachinformatiker für Systemintegration)
> 
> ZARM - Zentrum für angewandte Raumfahrttechnologie und Mikrogravitation
> ZARM - Center of Applied Space Technology and Microgravity
> 
> Universität Bremen
> Am Fallturm
> 28359 Bremen, Germany
> 
> Telefon: 0421 218 57940
> E-Mail: [email protected]
> 
> www.zarm.uni-bremen.de
