Another option would be to use the squeue command and poll until all
of the steps are complete. You could either use a script or add a
--wait option to the squeue command to do the polling.
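A minimal sketch of that polling approach, assuming squeue's real -h (no header), -s (show steps), -j (job id) and -o (format string, %i = step id) options; the wait_for_steps helper name is made up here, not part of SLURM:

```shell
# Poll squeue until it no longer reports any steps for the given job.
wait_for_steps() {
    jobid="$1"
    # -h: no header, -s: list steps, -o '%i': print only step ids.
    while [ -n "$(squeue -h -s -j "$jobid" -o '%i')" ]; do
        sleep 5   # polling interval; tune to taste
    done
}
```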
Quoting Yuri D'Elia <[email protected]>:
On Thu, 15 Sep 2011 20:16:02 +0200
"Yuri D'Elia" <[email protected]> wrote:
When "swait" is invoked inside the same job id as given on the
command line, it should simply wait for all steps to finish,
without counting the id of the allocation. Better yet, if
SLURM_JOB_ID is defined, it should use it directly. That way
"swait" would map *perfectly* to the "wait" built-in.
This way I could also implement super-easily my steps:
sbatch multi-stage.sh

# multi-stage.sh
for ...; do
    sbatch --jobid $SLURM_JOB_ID stage1-step.sh
done
swait

for ...; do
    sbatch --jobid $SLURM_JOB_ID stage2-step.sh
done
swait

echo "finished"
#####
There! The dependency problem is resolved without using
dependencies in the first place. Also, managing the queue becomes
*much* easier.
Actually, after thinking about it, this should be rather simple: I
think I can implement "swait" entirely in user space, so to speak.
I can simply list the steps for the given job and "sattach" to the
first one. Once that finishes, repeat until there are no more steps.
"sattach" probably has some overhead that I don't need (the I/O
redirections), and implementing this outside of slurmctld won't be as
efficient, but if the steps are long enough, wasting a couple of
seconds over 100k scheduled jobs is nothing.
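The loop described above could be sketched roughly like this, assuming squeue prints step ids as jobid.stepid and that sattach returns once the attached step finishes; errors from attaching to an already-finished step are simply ignored:

```shell
# User-space "swait" sketch: repeatedly attach to the first remaining
# step of the job until squeue lists no more steps.
swait() {
    # Use the job id given as an argument, or fall back to SLURM_JOB_ID.
    jobid="${1:-$SLURM_JOB_ID}"
    while step=$(squeue -h -s -j "$jobid" -o '%i' | head -n 1); [ -n "$step" ]; do
        # sattach blocks until the step completes; discard its I/O.
        sattach "$step" >/dev/null 2>&1 || true
    done
}
```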
I'll give it a try using the API first and then report back. I think
it would make a nice addition to the SLURM system.