Hello,

Consider the following batch script:

    #SBATCH --nodes=6             # number of nodes
    #SBATCH --ntasks-per-node=3   # processes per node
    srun --ntasks-per-node=2 -n 3 ./<executable>

The environment of one of the processes looks like this (only the relevant part is shown):

    SLURM_JOB_CPUS_PER_NODE='4(x6)'
    SLURM_JOB_NODELIST='cndev[1-4,8-9]'   <-- As expected, 6 nodes
    SLURM_JOB_NUM_NODES=3                 <-- Why 3 and not 6???
    SLURM_NODELIST='cndev[1-4,8-9]'       <-- As expected, 6 nodes
    SLURM_NPROCS=3                        <-- Why 3 and not 6???
    SLURM_NTASKS=3
    SLURM_STEP_NODELIST='cndev[1-3]'      <-- As expected, 3 nodes
    SLURM_STEP_NUM_NODES=3                <-- As expected, 3 nodes
    SLURM_STEP_NUM_TASKS=3
    SLURM_STEP_TASKS_PER_NODE='1(x3)'     <-- duplication?!
    SLURM_TASKS_PER_NODE='1(x3)'          <-- duplication?!

1. Is it correct that SLURM_JOB_NUM_NODES = SLURM_STEP_NUM_NODES? I thought that SLURM_JOB_NUM_NODES should keep its initial value for the whole job.

2. From srun's man page I can't tell for certain whether the last two variables duplicate each other:
   - SLURM_STEP_TASKS_PER_NODE - Number of processes per node within the step.
   - SLURM_TASKS_PER_NODE - Number of tasks to be initiated on each node...
   I understand that SLURM is flexible and that I may be missing some configurations where these two values would differ. If so, could you provide a use case? Was it done for backward compatibility reasons?

-- 
Best regards,
Artem Y. Polyakov
