We have an application that uses MPI in a master-slave arrangement. The master (rank 0) does almost nothing, while the slaves (ranks 1-N) are directed by the master. The application uses GPUs (NVIDIA K80, 4 GPUs per node), a precious commodity on our cluster.
Ideally, the application should be run with an uneven distribution of tasks, such that the first node allocated gets one additional task to serve as the master. This works for 1 or 2 nodes running 4 or 8 slaves (5 or 9 tasks total), like so:

$ sbatch --nodes=1 --ntasks=5 batch.script

places 5 tasks on a single 4xGPU node, and

$ sbatch --ntasks=9 --ntasks-per-node=5 batch.script

places 5 tasks on the first 4xGPU node and 4 tasks on the second. However, for anything more than 2 nodes, Slurm does not allow this because of a conflict between --nodes, --ntasks, and --ntasks-per-node:

$ sbatch --nodes=3 --ntasks=13 --ntasks-per-node=4 batch.script
$ sbatch --nodes=4 --ntasks=17 --ntasks-per-node=4 batch.script

Is there a way of placing N+1 tasks on the first node and N tasks on each of the remaining nodes allocated? Ideally, something like this:

$ sbatch --nodes=4 --ntasks=5,4,4,4 batch.script

or

$ sbatch --nodes=4 --ntasks-per-node=5,4,4,4 batch.script

Wondering,

David Hoover
HPC @ NIH
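
P.S. For context, a stripped-down sketch of the kind of batch.script in play (details simplified; the module name, GRES type, and application name below are just placeholders, not our exact script):

#!/bin/bash
#SBATCH --job-name=master-slave
#SBATCH --gres=gpu:k80:4      # request the node's 4 K80 GPUs (GRES type name is site-specific)
# node/task counts are supplied on the sbatch command line, as in the examples above

module load openmpi           # placeholder; whatever MPI stack the site provides

# Launch one MPI task per allocated slot: rank 0 acts as the master,
# ranks 1-N are the slaves that do the GPU work.
srun ./mpi_app                # placeholder application name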