Dear SLURMers!

I want to run a program with the following configuration:

- One batch script which runs two or more embarrassingly parallel workloads
- Each of these embarrassingly parallel workloads needs one node (24 CPUs)
- Therefore, I need two or more nodes, and each node takes care of one embarrassingly parallel workload

Problem:

- All tasks will be distributed on just one node (the first one)!

Here is my batch script. (Each node in our cluster has 24 CPUs. Since I want to run two jobs, I requested 2 nodes and, for each node, 24 tasks.)

#!/bin/bash
#SBATCH --partition=....
#SBATCH --output=...
#SBATCH --workdir=...
#SBATCH --job-name="testing"
#SBATCH --tasks-per-node=24
#SBATCH --nodes=2
#SBATCH --mail-type=all
#SBATCH --time=71:59:59

# run compute job
source ..../.bashrc
./submit.sh my_job_1 24 &
./submit.sh my_job_2 24 &
wait

Result:

- 48 tasks on the first node, nothing on the second one!

Any idea?

Thank you very much in advance for your time and kind attention to my request.

Best,
Kasra
