Hi Brian,

Thanks for the suggestion. I tried it out, but I got the same result.

$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122715
$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122716
$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122720
$ scontrol update jobid=26122716 Dependency=singleton,after:26122720
$ scontrol update jobid=26122720 Dependency=singleton,after:26122715
$ scontrol release 26122715 26122716 26122720

... waiting for job 26122715 to complete ...

$ squeue -u jarno
   JOBID   USER       ACCOUNT     NAME  ST  TIME_LEFT  NODES  CPUS    GRES  MIN_MEM  NODELIST (REASON)
26122716  jarno  def-jarno_cp  fakejob  PD       2:00      1     1  (null)     250M  (Dependency)
26122720  jarno  def-jarno_cp  fakejob  PD       2:00      1     1  (null)     250M  (Dependency)
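
For anyone reproducing this, it may help to list which dependencies Slurm still treats as unsatisfied for the pending jobs. The following is only a sketch: the helper names pending_deps and job_dependency are made up for illustration, and it assumes a Slurm version whose squeue supports the %E (remaining dependencies) and %r (reason) format fields.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative, not part of Slurm).

# List a user's pending jobs, one per line, with the columns:
# job ID, job name, state, remaining dependencies, pending reason.
pending_deps() {
    local user="${1:-$USER}"
    squeue --noheader --user="$user" --states=PENDING \
           --format='%i %j %T %E %r'
}

# Print the Dependency= field that scontrol reports for a single job ID.
job_dependency() {
    scontrol show job "$1" | grep -o 'Dependency=[^ ]*'
}
```

For example, "pending_deps jarno" would print one line per pending job, and "job_dependency 26122720" would print the Dependency= field as scontrol reports it for that job.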


Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing 
Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa



From: slurm-users <[email protected]> on behalf of Brian 
Andrus <[email protected]>
Sent: August 21, 2019 5:26 PM
To: [email protected] <[email protected]>
Subject: Re: [slurm-users] Dependencies with singleton and after

Have you tried adding the dependency at submit time?

sbatch --dependency=singleton fakejob.sh

Brian Andrus


On 8/21/2019 1:51 PM, Jarno van der Kolk wrote:
> Hi,
>
> I am helping a researcher who encountered an unexpected behaviour with 
> dependencies. He uses both "singleton" and "after". The minimal working 
> example is as follows:
>
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909274
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909275
> $ scontrol update jobid=25909273 Dependency=singleton
> $ scontrol update jobid=25909274 Dependency=singleton,after:25909275
> $ scontrol update jobid=25909275 Dependency=singleton,after:25909273
> $ scontrol release 25909273 25909274 25909275
>
> When releasing the jobs, the scheduler will start job 25909273 which is to be 
> expected. The other jobs will be held due to the singleton and the jobs 
> having the same job name, also expected.
>
> However, when that job finishes, we would have expected job 25909275 to start, 
> since the singleton is now free and job 25909274 cannot start due to its 
> dependency "after:25909275". That is, the expected order would be 25909273, 
> 25909275, 25909274, one at a time.
>
> Instead what happens is that job 25909273 starts and completes and then jobs 
> 25909274 and 25909275 remain queued with unsatisfied dependencies.
>
> It is entirely possible that I am thinking of this wrong of course, but I 
> don't see it. Is this expected behaviour?
>
> The content of fakejob.sh is simply this by the way, nothing special:
> #!/bin/bash
> #SBATCH --account=def-jarno
> #SBATCH --time=0:1:30
> #SBATCH --mem=250M
> #SBATCH --ntasks=1
> #SBATCH --job-name=fakejob
>
> echo "Starting fake job"
> sleep 60
> echo "Finished fake job"
>
>
> By the way, I realize this could be done with "afterany" instead of 
> "singleton,after", but since this is a minimal working example it leaves out 
> a lot of details of course.
>
> Thanks,
> Jarno
