Hi Guillaume,

On 15/06/2022 16:59, Guillaume De Nayer wrote:
Perhaps I misunderstand the Slurm documentation... I thought that the --exclusive option used in combination with sbatch would reserve the whole node (40 cores) for the job (submitted with sbatch). This part is working fine; I can check it with sacct. Then this job starts subtasks on the reserved 40 cores with srun, which is why I'm using "-n1 -c1" in combination with "srun". I thought that it was possible to use the reserved cores inside this job using srun.
You're correct. --exclusive will give you all cores on the nodes but only as much memory as requested.
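
To illustrate the memory point, here is a minimal sketch of a job header. It assumes you want all of the node's memory along with all of its cores; --mem=0 is the documented special case that requests all memory on each node. The echo line and its "unknown" fallbacks are only illustrative, so the script can also be run outside a job.

```shell
#!/bin/bash
#SBATCH --exclusive        # all cores on the allocated node
#SBATCH --mem=0            # special case: all memory on each node
#SBATCH --time=02:00:00

# Report what the job received. Slurm sets these variables inside a job;
# the fallbacks are an assumption so the sketch runs outside Slurm too.
cpus="${SLURM_CPUS_ON_NODE:-unknown}"
mem="${SLURM_MEM_PER_NODE:-unknown}"
echo "cpus on node: $cpus, memory per node (MB): $mem"
```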
The following slightly modified job without --exclusive and with --ntasks=2 leads to a similar problem: Only one srun is running at a time. The second starts directly after the first one finished.

#!/bin/bash
#SBATCH --job-name=test_multi_prog_srun
#SBATCH --ntasks=2
#SBATCH --partition=short
#SBATCH --time=02:00:00

srun -vvv --exact -n1 -c1 sleep 20 > srun1.log 2>&1 &
srun -vvv --exact -n1 -c1 sleep 30 > srun2.log 2>&1 &
wait
This should work... It works on our cluster. Are you sure they don't run in parallel? We usually recommend using GNU parallel or xargs, like:

xargs -P $SLURM_NTASKS srun -N 1 -n 1 -c 1 --exact sleep 30

Ward
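
A minimal sketch of the xargs pattern with explicit input, in case it helps to check whether the steps really overlap. The assumptions are mine: one sleep duration per input line, a fallback of 2 tasks when SLURM_NTASKS is unset, and running the command directly when srun is not found (a dry run outside Slurm). -P is an extension available in GNU and BSD xargs.

```shell
#!/bin/bash
# Concurrent steps: SLURM_NTASKS inside a job, 2 as a fallback for a dry run.
ntasks="${SLURM_NTASKS:-2}"

# Use srun when available; otherwise run the command directly
# (an assumption made only so the sketch can be tried anywhere).
if command -v srun >/dev/null 2>&1; then
    launcher="srun -N 1 -n 1 -c 1 --exact"
else
    launcher=""
fi

start=$(date +%s)
# One input line per step; -I{} substitutes that line into the command,
# and -P allows up to $ntasks commands to run at the same time.
# $launcher is left unquoted on purpose so it splits into separate words.
printf '%s\n' 3 3 | xargs -P "$ntasks" -I{} $launcher sleep {}
end=$(date +%s)

elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

If the two steps overlap, the elapsed time should be close to the longest single sleep rather than the sum of both.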
