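srun has many options for controlling where job steps run. The two submit.sh commands in your batch script are not launched through srun, so both run on the first node of the allocation, which is where the batch script itself executes. Perhaps you just want to start each one with "srun --exclusive ...".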
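A minimal sketch of what that could look like, reusing the directives from your script. It assumes that submit.sh starts its own 24 worker processes on whatever node it lands on. I have not seen the script, so treat the -n/-c values as a guess to adapt. The remaining #SBATCH lines (partition, output, and so on) stay as you have them.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --tasks-per-node=24
# (other #SBATCH lines unchanged from the original script)

# Each srun creates one job step, confined to a single node.
# --exclusive gives the step dedicated CPUs, so the second step cannot
# share the first node's 24 CPUs and is placed on the other node.
# Assumption: submit.sh is a single launcher that uses 24 CPUs itself,
# hence 1 task with 24 CPUs per task.
srun --exclusive --nodes=1 --ntasks=1 --cpus-per-task=24 ./submit.sh my_job_1 24 &
srun --exclusive --nodes=1 --ntasks=1 --cpus-per-task=24 ./submit.sh my_job_2 24 &

# Keep the batch script alive until both background steps finish.
wait
```

If each workload is really 24 independent tasks rather than one launcher, "--ntasks=24" without "--cpus-per-task" would be the closer fit. The exact behaviour of --exclusive for job steps depends on the Slurm version, so check the srun man page on your cluster.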
Quoting Kasra Hosseini:
> Dear SLURMers!
>
> I want to run a program with the following configuration:
> - One Batch script which runs two or more embarrassingly parallel workloads
> - Each of these embarrassingly parallel workloads needs one node (24 CPUs)
> - Therefore, I need two or more nodes, each node takes care of one
> embarrassingly parallel workload
>
> Problem:
> - All tasks will be distributed on just one node! (the first one)
>
> Here is my BATCH script: (Each node in our cluster has 24 CPUs. Since I
> want to run two jobs, I requested 2 nodes and for each node, 24 tasks)
>
> #!/bin/bash
> #SBATCH --partition=....
> #SBATCH --output=...
> #SBATCH --workdir=...
> #SBATCH --job-name="testing"
> #SBATCH --tasks-per-node=24
> #SBATCH --nodes=2
> #SBATCH --mail-type=all
> #SBATCH --time=71:59:59
>
> # run compute job
> source ..../.bashrc
> ./submit.sh my_job_1 24 &
> ./submit.sh my_job_2 24 &
>
> wait
>
> Result:
> - 48 tasks on the first node, nothing on the second one!
>
> Any idea?
>
> Thank you very much in advance for your time and kind attention to my
> request.
>
> Best,
> Kasra
>
