If I leave off exclusive and cyclic, then all 32 jobs get stacked on the 1st 
node.

I don't know about the selecttype option. How can I check which one I am using?
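(For anyone else following along: the configured select plugin can usually be read back from the running controller with scontrol, a standard Slurm command; something like:)

```shell
# Print the select plugin the cluster is configured with,
# e.g. "SelectType = select/linear" or "SelectType = select/cons_res"
scontrol show config | grep -i selecttype
```

The answer matters here because select/linear allocates whole nodes, while select/cons_res allocates individual CPUs.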

Thank you,
Lucas



-----Original Message-----
From: Michael Di Domenico [mailto:mdidomeni...@gmail.com] 
Sent: Tuesday, January 03, 2017 3:17 PM
To: slurm-dev <slurm-dev@schedmd.com>
Subject: [slurm-dev] Re: Question about -m cyclic and --exclusive options to 
slurm


what behaviour do you get if you leave off the exclusive and cyclic
options?  which selecttype are you using?


On Tue, Jan 3, 2017 at 12:19 PM, Koziol, Lucas
<lucas.koz...@exxonmobil.com> wrote:
> Dear Vendor,
>
> What I want to do is run a large number of single-CPU tasks, have them
> distributed evenly over all allocated nodes, and oversubscribe CPUs to
> tasks (each task is very light on CPU resources).
>
> Here is a small test script that allocates 2 nodes (16 CPUs per Node on our
> machines) and tries to distribute 32 tasks over these 32 CPUs:
>
> #SBATCH -n 32 -p short
>
> set Vec = ( 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 \
>             17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 )
>
> foreach frame ($Vec)
>     cd $frame
>     srun -n 1 -m cyclic a.out > output.txt &
>     cd ..
> end
>
> wait
>
> The hope was that 16 tasks would run on Node 1 and 16 tasks would run
> on Node 2. Unfortunately, what happens is that all 32 jobs get assigned to
> Node 1. I thought -m cyclic was supposed to avoid this.
>
> A note from the vendor suggested using the --exclusive flag. In that case I
> modified my srun command to
>
> srun --exclusive -N 1 -n 1 a.out > output.txt &
>
> The problem with this is that it still assigns the tasks to Node 1, but
> waits until a CPU is free before launching the last 16. It still
> doesn't accomplish the goal of distributing all 32 jobs over the 32 CPUs
> across 2 nodes. And, in the next step I want to oversubscribe tasks to
> nodes, and --exclusive specifically waits for open CPUs before launching
> all the jobs. This costs a whole lot of time.
>
> I have also played around with the --overcommit option, however that has not
> made any difference. Note that MAX_TASKS_PER_NODE as set in slurm.h is
> adequate.
>
> The -m cyclic option only applies to multiple tasks launched within a single
> step. Is there a mechanism for submitting all 32 tasks with one srun command,
> at which point -m cyclic should hopefully fix everything?
>
> Thank you for your time and any help or suggestions.
>
> Best regards,
>
> Lucas Koziol
>
> Corporate Strategic Research
> ExxonMobil Research and Engineering Co.
> 1545 US Route 22 East
> Annandale, NJ, 08801
> Tel: (908) 335-3411
>
>
