If I leave off exclusive and cyclic, then all 32 jobs get stacked on the 1st node.
I don't know about the selecttype option. How can I check which one I am using?

Thank you,
Lucas

-----Original Message-----
From: Michael Di Domenico [mailto:mdidomeni...@gmail.com]
Sent: Tuesday, January 03, 2017 3:17 PM
To: slurm-dev <slurm-dev@schedmd.com>
Subject: [slurm-dev] Re: Question about -m cyclic and --exclusive options to slurm

what behaviour do you get if you leave off the exclusive and cyclic options? which selecttype are you using?

On Tue, Jan 3, 2017 at 12:19 PM, Koziol, Lucas <lucas.koz...@exxonmobil.com> wrote:
> Dear Vendor,
>
> What I want to do is run a large number of single-CPU tasks, and have them
> distributed evenly over all allocated nodes, and to oversubscribe CPUs to
> tasks (each task is very light on CPU resources).
>
> Here is a small test script that allocates 2 nodes (16 CPUs per Node on our
> machines) and tries to distribute 32 tasks over these 32 CPUs:
>
> #SBATCH -n 32 -p short
>
> set Vec = ( 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 )
>
> foreach frame ($Vec)
>     cd $frame
>     srun -n 1 -m cyclic a.out > output.txt &
>     cd ..
> end
> wait
>
> The hope was that 16 tasks would run on Node 1 and 16 tasks would run on
> Node 2. Unfortunately, what happens is that all 32 jobs get assigned to
> Node 1. I thought -m cyclic was supposed to avoid this.
>
> A note from the vendor suggested using the --exclusive flag. In that case I
> modified my srun command to
>
> srun --exclusive -N 1 -n 1 a.out > output.txt &
>
> The problem with this is that it still assigns the tasks to Node 1, but
> waits until there is an available CPU before assigning the last 16. It still
> doesn't accomplish the task of distributing all 32 jobs to the 32 CPUs
> across 2 nodes. Also, in the next step I want to oversubscribe tasks to
> nodes, and --exclusive specifically waits for open CPUs before launching
> the remaining jobs. This wastes a lot of time.
>
> I have also played around with the --overcommit option; however, that has
> not produced any difference. Note that MAX_TASKS_PER_NODE set in slurm.h is
> adequate.
>
> The -m cyclic option only applies to multiple tasks launched within a single
> step. Is there a mechanism for submitting 32 tasks using 1 srun command? At
> that point -m cyclic should hopefully fix everything.
>
> Thank you for your time and any help or suggestions.
>
> Best regards,
> Lucas Koziol
>
> Corporate Strategic Research
> ExxonMobil Research and Engineering Co.
> 1545 US Route 22 East
> Annandale, NJ, 08801
> Tel: (908) 335-3411
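
On the last question in the quoted message (launching all 32 tasks from one srun so that -m cyclic applies): one possible approach, sketched here under assumptions and not tested on this cluster, is a small per-task wrapper script. It is written in bash rather than the csh of the original script. The file name run_task.sh and the function task_dir are hypothetical; SLURM_PROCID is the task rank that srun sets for each task in a step. The batch script would then call srun once, for example "srun -n 32 -m cyclic ./run_task.sh", in place of the foreach loop.

```shell
#!/bin/bash
# Hypothetical per-task wrapper (run_task.sh); not part of the original thread.
# Intended to be launched once per task by a single srun from the batch script:
#   srun -n 32 -m cyclic ./run_task.sh
# srun sets SLURM_PROCID to the task rank (0 .. ntasks-1) for each task.

# Map a 0-based task rank to the 1-based directory names (1 .. 32)
# used by the original script.
task_dir() {
    echo $(( $1 + 1 ))
}

# Do the work only when running inside a job step, i.e. when srun
# has set SLURM_PROCID.
if [ -n "${SLURM_PROCID:-}" ]; then
    cd "$(task_dir "$SLURM_PROCID")" || exit 1
    # Assumption: a.out is present in each task directory; adjust the path
    # if the executable lives elsewhere.
    ./a.out > output.txt
fi
```

Because all 32 tasks belong to one step in this sketch, the -m cyclic distribution should apply to them. Whether more tasks than CPUs can be placed this way depends on the site configuration, so the oversubscription part would still need testing.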