I think you misunderstood the concept of array jobs. If you want one job to run on several nodes, do not define array jobs. Use array jobs to run the same job or similar jobs several times.
2017-06-15 7:36 GMT+02:00 Ariel Balter <[email protected]>: > > Here is if I with just `srun hostname`. Why is it creating four jobs on node > 2-8 and then running four times on each? I just want `hostname` to run once > on four different nodes. > > balter@exahead1:~/slurm_tutorial$ cat nodes.sub > #!/bin/bash > > > #SBATCH --job-name=nodes > #SBATCH --array=0-3 > #SBATCH --nodes=4 > #SBATCH --tasks-per-node=1 > ##SBATCH --ntasks=4 > #SBATCH --output="nodes_%N_%A_%a_%j.out" > #SBATCH --error="nodes_%N_%A_%a_%j.err" > > srun hostname > > $ for i in *.out; do echo "*** $i ***"; cat $i; done > *** nodes_exanode-2-8_512_0_513.out *** > exanode-2-8 > exanode-4-44 > exanode-4-12 > exanode-6-0 > *** nodes_exanode-2-8_512_1_514.out *** > exanode-2-8 > exanode-4-44 > exanode-4-12 > exanode-6-0 > *** nodes_exanode-2-8_512_2_515.out *** > exanode-2-8 > exanode-4-12 > exanode-4-44 > exanode-6-0 > *** nodes_exanode-2-8_512_3_512.out *** > exanode-2-8 > exanode-4-12 > exanode-4-44 > exanode-6-0 > > > On 6/14/17 2:17 PM, TO_Webmaster wrote: >> >> I guess $(hostname) is evaluated too early (on the batch host). Please >> check the output of "srun hostname". >> >> 2017-06-14 21:23 GMT+02:00 Ariel Balter <[email protected]>: >>> >>> I want to create an array of jobs, but have them execute on different >>> nodes. >>> This is what I'm trying: >>> >>> ``` >>> >>> balter@exahead1:~/slurm_tutorial$ cat nodes.sub >>> #!/bin/bash >>> >>> >>> #SBATCH --job-name=nodes >>> #SBATCH --array=0-3 >>> #SBATCH --nodes=4 >>> #SBATCH --tasks-per-node=1 >>> ##SBATCH --ntasks=4 >>> #SBATCH --output="hello_%N_%A_%a_%j.out" >>> #SBATCH --error="hello_%N_%A_%a_%j.err" >>> >>> >>> srun echo "SLURM_JOB_NAME: $SLURM_JOB_NAME SLURM_JOB_ID: $SLURM_JOB_ID >>> SLURM_ARRAY_TASK_ID: $SLURM_ARRAY_TASK_ID SLURM_ARRAY_JOB_ID: >>> $SLURM_ARRAY_JOB_ID SLURM_PROCID: $SLURM_PROCID hostname: $(hostname)" >>> >>> balter@exahead1:~/slurm_tutorial$ sbatch nodes.sub >>> balter@exahead1:~/slurm_tutorial$ for i in *.out; do echo "*** $i ***"; >>> cat >>> $i; done >>> >>> *** hello_exanode-6-3_408_0_409.out *** >>> >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 409 SLURM_ARRAY_TASK_ID: 0 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 409 SLURM_ARRAY_TASK_ID: 0 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 409 SLURM_ARRAY_TASK_ID: 0 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 409 SLURM_ARRAY_TASK_ID: 0 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> *** hello_exanode-6-3_408_1_410.out *** >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 410 SLURM_ARRAY_TASK_ID: 1 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 410 SLURM_ARRAY_TASK_ID: 1 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 410 SLURM_ARRAY_TASK_ID: 1 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 410 SLURM_ARRAY_TASK_ID: 1 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> *** hello_exanode-6-3_408_2_411.out *** >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 411 SLURM_ARRAY_TASK_ID: 2 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 411 SLURM_ARRAY_TASK_ID: 2 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 411 SLURM_ARRAY_TASK_ID: 2 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 411 SLURM_ARRAY_TASK_ID: 2 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> *** hello_exanode-6-3_408_3_408.out *** >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 408 SLURM_ARRAY_TASK_ID: 3 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 408 SLURM_ARRAY_TASK_ID: 3 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 408 SLURM_ARRAY_TASK_ID: 3 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> SLURM_JOB_NAME: nodes SLURM_JOB_ID: 408 SLURM_ARRAY_TASK_ID: 3 >>> SLURM_ARRAY_JOB_ID: 408 SLURM_PROCID: 0 hostname: exanode-6-3 >>> >>> ``` > > > -- > Ariel Balter > "Don't believe everything you think." > 509-713-0087 > ☮
