Atom Powers <[email protected]> writes:

> It appears that my Slurm cluster is scheduling jobs to load up nodes as
> much as possible before putting jobs on other nodes.

I believe it sorts the available nodes into a fixed order (by the
Weight parameter, then lexically by name within each weight) and tries
to fit the job on the first nodes with enough free resources.  That tends
to pack jobs onto the first nodes, but it doesn't actively try to pack
jobs onto already loaded nodes.  For example, if node0 is idle and node1
is running a job, and both have the same weight, Slurm would put a new
job on node0 even if it could fit on node1.
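
To make the difference concrete, here is a toy model in Python.  It is
only a sketch of the behaviour as I understand it, not Slurm's actual
code: the Node class and the pick_first_fit, pick_least_loaded and place
helpers are names I made up, and "load" is simply the number of CPUs in
use.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a compute node (not a Slurm structure)."""
    name: str
    weight: int
    total_cpus: int
    used_cpus: int = 0

    @property
    def idle_cpus(self):
        return self.total_cpus - self.used_cpus


def pick_first_fit(nodes, cpus):
    """Fixed order: lowest weight first, then name; take the first node that fits."""
    for node in sorted(nodes, key=lambda n: (n.weight, n.name)):
        if node.idle_cpus >= cpus:
            return node
    return None


def pick_least_loaded(nodes, cpus):
    """LLN-like: among nodes that fit, prefer the one with the most idle CPUs."""
    candidates = [n for n in nodes if n.idle_cpus >= cpus]
    if not candidates:
        return None
    # Ties are broken by weight and name so the result is deterministic.
    return min(candidates, key=lambda n: (-n.idle_cpus, n.weight, n.name))


def place(nodes, job_sizes, pick):
    """Place jobs one by one; return the chosen node name (or None) per job."""
    placements = []
    for cpus in job_sizes:
        node = pick(nodes, cpus)
        if node is not None:
            node.used_cpus += cpus
        placements.append(node.name if node is not None else None)
    return placements


if __name__ == "__main__":
    def fresh():
        return [Node("node%d" % i, weight=1, total_cpus=4) for i in range(3)]

    print(place(fresh(), [1, 1, 1, 1], pick_first_fit))
    print(place(fresh(), [1, 1, 1, 1], pick_least_loaded))
```

With three identical 4-CPU nodes and four single-CPU jobs, the fixed
order puts all four jobs on node0, while the least-loaded choice spreads
them over node0, node1 and node2 before returning to node0.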

> How would I configure the scheduler to distribute jobs in something like a
> round-robin fashion to many nodes instead of loading jobs onto just a few
> nodes?
>
> I currently have:
>     'SchedulerType'         => 'sched/builtin',
>     'SelectTypeParameters'  => 'CR_Core_Memory',
>     'SelectType'            => 'select/cons_res',

With the latest Slurm versions, you can add CR_LLN to
SelectTypeParameters, or LLN to a PartitionName line, to have jobs
placed on the least loaded nodes instead.
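
In plain slurm.conf syntax that would look roughly like the fragment
below.  The partition name and node list are placeholders, not taken
from your setup, and you would use one of the two variants, not
necessarily both.  Your configuration seems to be generated from a hash,
so there you would append CR_LLN to the existing SelectTypeParameters
value instead.

```
# Variant 1: least loaded nodes in every partition
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory,CR_LLN

# Variant 2: least loaded nodes only in one partition
# ("serial" and "node[0-9]" are placeholder names)
PartitionName=serial Nodes=node[0-9] LLN=YES
```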

From slurm.conf(5) in 14.03.3:

                      CR_LLN Schedule resources to jobs  on  the
                             least  loaded nodes (based upon the
                             number of idle CPUs). This is  gen-
                             erally   only  recommended  for  an
                             environment  with  serial  jobs  as
                             idle  resources  will  tend  to  be
                             highly  fragmented,  resulting   in
                             parallel   jobs  being  distributed
                             across many nodes.   Also  see  the
                             partition  configuration  parameter
                             LLN use the least loaded  nodes  in
                             selected partitions.

              LLN    Schedule  resources  to  jobs  on the least
                     loaded nodes (based upon the number of idle
                     CPUs).  This  is generally only recommended
                     for an environment with serial jobs as idle
                     resources  will  tend  to  be  highly frag-
                     mented, resulting in  parallel  jobs  being
                     distributed  across  many  nodes.  Also see
                     the SelectParameters configuration  parame-
                     ter CR_LLN to use the least loaded nodes in
                     every partition.

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
