Don't forget the communication implications of the task distribution
either.  In general, if you spread the same total number of processes
across more nodes, more of your communication has to cross the
interconnect between nodes, which will be slower than communicating
within a node (e.g. via shared memory).
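
For example (a hypothetical 16-task MPI job; the node and task counts,
and the job script name, are just for illustration):

    # All 16 tasks packed onto one node: communication stays on-node
    sbatch -N 1 -n 16 job.sh

    # Same 16 tasks spread across 16 nodes: every message crosses
    # the interconnect
    sbatch -N 16 -n 16 job.sh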

Also, if the communication pattern is heavily nearest-neighbor (lots of
communication with adjacent ranks, but not much with distant ones),
using a cyclic task distribution may also hurt you, even for the same
node/processes-per-node allocation, since a process's neighboring ranks
are more likely to land on different nodes.
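
As a sketch, assuming a 2-node allocation with 4 tasks per node and an
application (here a hypothetical ./app) whose ranks mostly talk to
rank+/-1:

    # Block distribution: consecutive ranks share a node, so most
    # neighbor traffic stays on-node
    #   node0: ranks 0 1 2 3    node1: ranks 4 5 6 7
    srun --distribution=block ./app

    # Cyclic distribution: consecutive ranks alternate nodes, so most
    # neighbor traffic crosses the interconnect
    #   node0: ranks 0 2 4 6    node1: ranks 1 3 5 7
    srun --distribution=cyclic ./app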

How severe this is will depend on the specific algorithm, software,
communication pattern, etc.  But as you work with your users, it's worth
considering.
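
If you want to verify where tasks actually land under a given
distribution, something like this inside an allocation prints each
task's rank and node (SLURM_PROCID is set by srun per task):

    srun bash -c 'echo "task $SLURM_PROCID on $(hostname)"' | sort -n -k2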

Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu

On 05/08/2014 12:01 PM, Ryan Cox wrote:
> Rather than maximize fragmentation, you probably want to do it on a
> per-job basis.  If you want one core per node:  sbatch:  -N $numnodes -n
> $numnodes.  Anything else would require the -m flag.  I haven't played
> with it recently but I think you would want -m cyclic.
> 
> Ryan
> 
> On 05/08/2014 11:49 AM, Atom Powers wrote:
>> How to spread jobs among nodes?
>>
>> It appears that my Slurm cluster is scheduling jobs to load up nodes
>> as much as possible before putting jobs on other nodes. I understand
>> the reasons for doing this, however I foresee my users wanting to
>> spread jobs out among as many nodes as possible for various reasons,
>> some of which are even valid.
>>
>> How would I configure the scheduler to distribute jobs in something
>> like a round-robin fashion to many nodes instead of loading jobs onto
>> just a few nodes?
>>
>> I currently have:
>>     'SchedulerType'         => 'sched/builtin',
>>     'SelectTypeParameters'  => 'CR_Core_Memory',
>>     'SelectType'            => 'select/cons_res',
>>
>> -- 
>> Perfection is just a word I use occasionally with mustard.
>> --Atom Powers--
> 
> -- 
> Ryan Cox
> Operations Director
> Fulton Supercomputing Lab
> Brigham Young University
> 
