(This is on Slurm 2.4.3.) We are trying to get a piece of software (which wants to drive the queue system itself) to run on our cluster, and it needs a command that submits a job and blocks until the job has finished.
SGE's qsub command has an option "-sync y", which makes the qsub command wait for the job to complete before it exits.  PBS's qsub has "-Wblock=true", which does the same.  Is it possible to emulate this behaviour in SLURM, either with commands or the Perl API (or even the C API, if need be)?

The closest I've got with commands is

  salloc <job specific slurm options> /usr/bin/srun --ntasks=1 --nodes=1 --preserve-env <jobscript>

for instance:

  salloc --job-name=t1 --mem-per-cpu=500 --time=5 --account=staff --nodes=2 --ntasks-per-node=2 /usr/bin/srun --ntasks=1 --nodes=1 --preserve-env env.sm

This makes sure that the commands in the jobscript are run on the job allocation's batch node, and mpirun and multithreaded programs work well.  However, srun commands don't: unless you specify --ntasks smaller than the number of tasks for the whole job, you get

  srun: error: Unable to create job step: Requested node configuration is not available

There is a simpler version that avoids the srun problem:

  salloc <job specific slurm options> <jobscript>

However, this runs all the commands in the script on the submit node, and we cannot assume that every command in the job script uses srun or mpirun to do its work.

I'd rather not create a command that uses sbatch to submit the job and then polls the queue system until the job has finished.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
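P.S. In case it helps the discussion, this is roughly the sbatch-and-poll wrapper I'd rather not have to write (just a sketch: the function name and the SLURM_POLL_INTERVAL variable are made up, and I haven't checked squeue's exact behaviour for completed jobs on 2.4.3):

```shell
# Sketch of a blocking submit built on sbatch plus polling.  Assumes
# sbatch prints "Submitted batch job <id>" and that "squeue -h -j <id>"
# prints nothing (or only an error, discarded below) once the job has
# left the queue.  submit_and_wait and SLURM_POLL_INTERVAL are made-up
# names for this sketch.
submit_and_wait () {
    local jobid
    # The last word of sbatch's output is the job id.
    jobid=$(sbatch "$@" | awk '{print $NF}') || return 1
    # Poll until the job no longer shows up in the queue.
    while [ -n "$(squeue -h -j "$jobid" 2>/dev/null)" ]; do
        sleep "${SLURM_POLL_INTERVAL:-10}"
    done
}
```

It works, but it burns a polling interval's worth of latency at the end of every job, which is exactly what a real "-sync y" equivalent would avoid.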
