There is an MCA param, ess_base_forward_signals, that controls which signals to 
forward. However, I just looked at the source code and see that it wasn't 
backported. Sigh.
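
For reference, on a build that has the param (e.g. the 3.0.x branch) it would 
be set along these lines; if memory serves it takes a comma-separated list of 
signal names, and ompi_info will show the accepted format:

  # confirm the param exists in your build
  ompi_info --param ess base --level 9 | grep forward_signals

  # sketch only: ask mpirun to forward SIGTSTP/SIGCONT to the ranks
  mpirun --mca ess_base_forward_signals SIGTSTP,SIGCONT -np 160 xhpl | tee LOG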

You could try the 3.0.0 branch as it is at the release-candidate stage and should 
go out within a week. I'd suggest just cloning that branch of the OMPI repo to get 
the latest state. The fix is definitely there.
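
Roughly, grabbing and building it would look like the lines below (the branch 
name v3.0.x and the configure options are just a sketch; adjust the PMI path 
to your Slurm install):

  git clone -b v3.0.x https://github.com/open-mpi/ompi.git
  cd ompi
  ./autogen.pl
  ./configure --with-slurm --with-pmi=/path/to/your/slurm --prefix=$HOME/ompi-3.0
  make -j8 install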

Sent from my iPad

> On Jul 11, 2017, at 7:45 AM, Eugene Dedits <[email protected]> wrote:
> 
> 
> Hi Ralph, 
> 
> 
> thanks for the reply. I’ve just tried upgrading to OMPI 2.1.1. The same problem… 
> :-\
> Could you point me to some discussion of this? 
> 
> Thanks,
> Eugene. 
> 
>> On Jul 11, 2017, at 6:17 AM, [email protected] wrote:
>> 
>> 
>> There is an issue with how the signal is forwarded. This has been fixed in 
>> the latest OMPI release, so you might want to upgrade.
>> 
>> Ralph
>> 
>> Sent from my iPad
>> 
>>> On Jul 11, 2017, at 2:53 AM, Dennis Tants <[email protected]> 
>>> wrote:
>>> 
>>> 
>>> Hello Eugene,
>>> 
>>> it is just a wild guess, but could you try "srun --mpi=pmi2" (you said
>>> you built OMPI with PMI support) instead of "mpirun"?
>>> srun is built-in and, I think, the preferred way of running parallel
>>> processes. Maybe scontrol is able to suspend it this way.
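>>> 
>>> A rough, untested sketch of the batch script with srun instead of mpirun,
>>> keeping your node/task counts (whether your OMPI build picked up Slurm's
>>> PMI2 library is an assumption here):
>>> 
>>> #!/bin/bash
>>> #SBATCH --partition=standard
>>> #SBATCH -N 10
>>> #SBATCH --ntasks-per-node=16
>>> 
>>> srun --mpi=pmi2 -n 160 xhpl | tee LOG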
>>> 
>>> Regards,
>>> Dennis
>>> 
>>>> On 10.07.2017 at 22:20, Eugene Dedits wrote:
>>>> Hello SLURM-DEV
>>>> 
>>>> 
>>>> I have a problem with Slurm, OpenMPI, and "scontrol suspend". 
>>>> 
>>>> My setup is:
>>>> 96-node cluster with IB, running RHEL 6.8
>>>> Slurm 17.02.1
>>>> OpenMPI 2.0.0 (built with the Intel 2016 compiler)
>>>> 
>>>> 
>>>> I am running an application (HPL in this particular case) using a batch 
>>>> script similar to:
>>>> -----------------------------
>>>> #!/bin/bash
>>>> #SBATCH --partition=standard
>>>> #SBATCH -N 10
>>>> #SBATCH --ntasks-per-node=16
>>>> 
>>>> mpirun -np 160 xhpl | tee LOG
>>>> -----------------------------
>>>> 
>>>> So I am running it on 160 cores across 10 nodes. 
>>>> 
>>>> Once the job is submitted to the queue and is running, I suspend it with:
>>>> ~# scontrol suspend JOBID
>>>> 
>>>> I see that my job has indeed stopped producing output. I then go to each 
>>>> of the 10 nodes assigned to my job and check whether the xhpl processes 
>>>> are still running there with:
>>>> 
>>>> ~# for i in {10..19}; do ssh node$i "top -b -n 1 | head -n 50 | grep xhpl | 
>>>> wc -l"; done
>>>> 
>>>> I expect this little script to return 0 from every node (because suspend 
>>>> sent SIGSTOP, so the processes shouldn't show up in top). However, I see 
>>>> that the processes are reliably suspended only on node10. I get:
>>>> 0
>>>> 16
>>>> 16
>>>> …
>>>> 16
>>>> 
>>>> So 9 out of 10 nodes still have 16 MPI threads of my xhpl application 
>>>> running at 100%. 
>>>> 
>>>> If I run "scontrol resume JOBID" and then suspend the job again, I see 
>>>> that (sometimes) more nodes have the "xhpl" processes properly suspended. 
>>>> Every time I resume and suspend the job, I see different nodes returning 0 
>>>> in my "ssh-run-top" script. 
>>>> 
>>>> So altogether it looks like the suspend mechanism doesn't work properly 
>>>> in Slurm with OpenMPI. I've tried compiling OpenMPI with "--with-slurm 
>>>> --with-pmi=/path/to/my/slurm" and observed the same behavior. 
>>>> 
>>>> I would appreciate any help.   
>>>> 
>>>> 
>>>> Thanks,
>>>> Eugene. 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Dennis Tants
>>> Trainee: IT Specialist for Systems Integration
>>> 
>>> ZARM - Zentrum für angewandte Raumfahrttechnologie und Mikrogravitation
>>> ZARM - Center of Applied Space Technology and Microgravity
>>> 
>>> Universität Bremen
>>> Am Fallturm
>>> 28359 Bremen, Germany
>>> 
>>> Telefon: 0421 218 57940
>>> E-Mail: [email protected]
>>> 
>>> www.zarm.uni-bremen.de
