Hi Brian,
Thanks for the suggestion. I tried it out, but I got the same result.
$ sbatch --hold --dependency=singleton ./fakejob.sh
Submitted batch job 26122715
$ sbatch --hold --dependency=singleton ./fakejob.sh
Submitted batch job 26122716
$ sbatch --hold --dependency=singleton ./fakejob.sh
Submitted batch job 26122720
$ scontrol update jobid=26122716 Dependency=singleton,after:26122720
$ scontrol update jobid=26122720 Dependency=singleton,after:26122715
$ scontrol release 26122715 26122716 26122720
... waiting for job 26122715 to complete ...
$ squeue -u jarno
JOBID    USER  ACCOUNT      NAME    ST TIME_LEFT NODES CPUS GRES   MIN_MEM NODELIST (REASON)
26122716 jarno def-jarno_cp fakejob PD 2:00      1     1    (null) 250M    (Dependency)
26122720 jarno def-jarno_cp fakejob PD 2:00      1     1    (null) 250M    (Dependency)
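For reference, the ordering expected from these dependencies can be sketched with a toy shell model. It does not talk to Slurm at all. The function names (after_dep, expected_order) are made up for illustration, and the rules are simplifying assumptions: "singleton" is modelled as one job of this name running at a time, and "after:ID" as satisfied once job ID has started.

```sh
#!/bin/sh
# Toy model only: it does not query or control Slurm.
# Assumed rules (a simplification, not Slurm's actual implementation):
#   singleton -> only one job with this name/user runs at a time
#   after:ID  -> satisfied once job ID has started

# Print the job that the given job must wait for (empty if none).
after_dep() {
    case $1 in
        26122716) echo 26122720 ;;
        26122720) echo 26122715 ;;
        *) echo "" ;;
    esac
}

expected_order() {
    jobs="26122715 26122716 26122720"   # submission order
    started=" "                         # space-delimited jobs that have run
    order=""
    while :; do
        picked=""
        for j in $jobs; do
            # Skip jobs that have already run.
            case $started in *" $j "*) continue ;; esac
            dep=$(after_dep "$j")
            if [ -n "$dep" ]; then
                # after:ID is unmet until ID has started.
                case $started in
                    *" $dep "*) ;;
                    *) continue ;;
                esac
            fi
            picked=$j
            break
        done
        # Nothing eligible: all jobs ran, or the model is stuck.
        [ -n "$picked" ] || break
        # Singleton: the picked job runs to completion before the next pick.
        started="$started$picked "
        order="${order:+$order }$picked"
    done
    echo "$order"
}

expected_order
```

Under these assumptions the model prints 26122715 26122720 26122716, whereas the real queue stalls after 26122715 with both remaining jobs pending on (Dependency).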
Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing
Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa
From: slurm-users <[email protected]> on behalf of Brian
Andrus <[email protected]>
Sent: August 21, 2019 5:26 PM
To: [email protected] <[email protected]>
Subject: Re: [slurm-users] Dependencies with singleton and after
Have you tried adding the dependency at submit time?
sbatch --dependency=singleton fakejob.sh
Brian Andrus
On 8/21/2019 1:51 PM, Jarno van der Kolk wrote:
> Hi,
>
> I am helping a researcher who encountered an unexpected behaviour with
> dependencies. He uses both "singleton" and "after". The minimal working
> example is as follows:
>
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909274
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909275
> $ scontrol update jobid=25909273 Dependency=singleton
> $ scontrol update jobid=25909274 Dependency=singleton,after:25909275
> $ scontrol update jobid=25909275 Dependency=singleton,after:25909273
> $ scontrol release 25909273 25909274 25909275
>
> When releasing the jobs, the scheduler will start job 25909273 which is to be
> expected. The other jobs will be held due to the singleton and the jobs
> having the same job name, also expected.
>
> However, when the job finishes, we would have expected job 25909275 to start
> since the singleton is now free and job 25909274 cannot start due to its
> dependency of "after:25909275". That is, the expected order would be
> 25909273, 25909275, 25909274, one job at a time.
>
> Instead what happens is that job 25909273 starts and completes and then jobs
> 25909274 and 25909275 remain queued with unsatisfied dependencies.
>
> It is entirely possible that I am thinking about this the wrong way, of
> course, but I don't see it. Is this expected behaviour?
>
> The content of fakejob.sh is simply this by the way, nothing special:
> #!/bin/bash
> #SBATCH --account=def-jarno
> #SBATCH --time=0:1:30
> #SBATCH --mem=250M
> #SBATCH --ntasks=1
> #SBATCH --job-name=fakejob
>
> echo "Starting fake job"
> sleep 60
> echo "Finished fake job"
>
>
> By the way, I realize this could be done with "afterany" instead of
> "singleton,after", but since this is a minimal working example it leaves out
> a lot of details of course.
>
> Thanks,
> Jarno
>
> Jarno van der Kolk, PhD Phys.
> Analyste principal en informatique scientifique | Senior Scientific Computing
> Specialist
> Solutions TI | IT Solutions
> Université d’Ottawa | University of Ottawa