(This is on Slurm 2.4.3.) We are trying to get a piece of software (which wants to drive the queue system itself) to run on our cluster, and it needs a command that submits a job and blocks until the job has finished.
SGE's qsub command has an option "-sync y", which makes the qsub command wait for the job to complete before it exits.  PBS's qsub has "-Wblock=true", which does the same.  Is it possible to emulate this behaviour in SLURM, either with commands or the Perl API (or even the C API, if need be)?

The closest I've got with commands is

  salloc <job specific slurm options> /usr/bin/srun --ntasks=1 --nodes=1 --preserve-env <jobscript>

for instance:

  salloc --job-name=t1 --mem-per-cpu=500 --time=5 --account=staff --nodes=2 --ntasks-per-node=2 /usr/bin/srun --ntasks=1 --nodes=1 --preserve-env env.sm

This makes sure that the commands in the jobscript are run on the job allocation's batch node, and mpirun and multithreaded programs work well.  However, srun commands don't: unless you specify --ntasks smaller than the number of tasks for the whole job, you get

  srun: error: Unable to create job step: Requested node configuration is not available

There is a simpler version that avoids the srun problem:

  salloc <job specific slurm options> <jobscript>

However, this runs all the commands in the script on the submit node, and we cannot assume that every command in the job script uses srun or mpirun to do its work.

I'd rather not create a command that uses sbatch to submit the job and then polls the queue system until the job has finished.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Research Computing Services, University of Oslo
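P.S. In case it helps the discussion, this is roughly the sbatch-and-poll wrapper I'd rather not have to write (just a sketch: the function name and the SLURM_POLL_INTERVAL variable are made up, and I haven't checked squeue's exact behaviour for completed jobs on 2.4.3):

```shell
# Sketch of a blocking submit built on sbatch plus polling.  Assumes
# sbatch prints "Submitted batch job <id>" and that "squeue -h -j <id>"
# prints nothing (or only an error, discarded below) once the job has
# left the queue.  submit_and_wait and SLURM_POLL_INTERVAL are made-up
# names for this sketch.
submit_and_wait () {
    local jobid
    # The last word of sbatch's output is the job id.
    jobid=$(sbatch "$@" | awk '{print $NF}') || return 1
    # Poll until the job no longer shows up in the queue.
    while [ -n "$(squeue -h -j "$jobid" 2>/dev/null)" ]; do
        sleep "${SLURM_POLL_INTERVAL:-10}"
    done
}
```

It works, but it burns a polling interval's worth of latency at the end of every job, which is exactly what a real "-sync y" equivalent would avoid.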
