We have discussed adding support for job arrays, which is a way to group jobs, but there are no current plans to do that.

--
Sent from my Android phone. Please excuse my brevity and typos.
Yuri D'Elia <[email protected]> wrote:

On Thu, 15 Sep 2011 14:34:38 -0600
Moe Jette <[email protected]> wrote:

> Another option would be to use the squeue command and poll until all
> of the steps are complete. You could either use a script or add a
> --wait option to the squeue command to do the polling.

To comment further on this, after some testing, it doesn't really work
the way I want it to:

> >> sbatch multi-stage.sh
> >>
> >> # multi-stage.sh
> >> for ...; do
> >>   sbatch --jobid $SLURM_JOB_ID stage1-step.sh
> >> done
> >> swait

Each job "step" here has the same allocation as the parent, meaning that
you have to request all of the cluster's resources in advance to have
the steps scheduled on all the available resources. Uncool.

It doesn't seem to be possible to create a "step" with an independent
allocation either. Also, steps don't seem to be managed/queued (i.e. I
can actually run as many steps as I want within a previous allocation).

Using dependencies with 100k jobs is prohibitive.

Right now I'm submitting job groups with a specific name. Then I'm
polling the queue by job name: the single script responsible for
polling is then used as a dependency for stage 2, and so forth.

I really wish I could set up job "groups" to handle dependencies, and
also to inspect the queue in a more sensible way. Similarly, I'm
interested in whether the whole job group has finished, not just a
single job.
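The name-based polling workaround described above could be wrapped in a small helper. This is a minimal sketch: `wait_for_group` is a hypothetical function name, and the 60-second default interval is an arbitrary choice; `--name` and `--noheader` are standard squeue options for filtering by job name and suppressing the header line.

```shell
# wait_for_group NAME [INTERVAL]
# Hypothetical helper: block until no jobs with the given name remain
# in the queue, by polling squeue every INTERVAL seconds (default 60).
wait_for_group() {
    name=$1
    interval=${2:-60}
    # squeue --name filters on the job name set at submission time
    # (sbatch --job-name); --noheader drops the header so empty output
    # means no matching jobs are left.
    while [ -n "$(squeue --noheader --name="$name" 2>/dev/null)" ]; do
        sleep "$interval"
    done
}
```

One way to use this for the stage chaining above would be to submit the stage-1 group with `sbatch --job-name=stage1`, run the helper in a single "poller" job, and make that poller job the dependency for stage 2 via `sbatch --dependency` (the `stage1` name and the poller-job arrangement are assumptions for illustration, not a prescribed workflow).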
