Works for me as documented
Quoting "Wiegand, Paul" <[email protected]>:
Good evening,
We have configured our cluster so that the "Weight" field for each
node is proportional to the number of cores in the blade. The
thinking was that we would like Slurm to allocate smaller nodes
first, unless people explicitly ask for larger resources; however,
Slurm seems to be scheduling things in precisely the opposite way:
Nodes with larger weights are being scheduled before nodes with
smaller weights. Even if I ask for 1 core on 1 node with limited
memory, I get assigned to the biggest node with the most memory ...
even though its weight is larger.
I guess it is possible that my understanding is reversed from what
is actually so; however, I've included the relevant quote from the
slurm.conf documentation below. I wonder: Does the actual *value*
matter? That is, should the weights be orders of magnitude
different from one another, or is it just ranked? (I've assumed
that the order matters but the value is not important). Also, does
the topo plugin change things?
From slurm.conf docs:
"Weight
The priority of the node for scheduling purposes. All things being
equal, jobs will be allocated the nodes with the lowest weight which
satisfies their requirements. For example, a heterogeneous
collection of nodes might be placed into a single partition for
greater system utilization, responsiveness and capability. It would
be preferable to allocate smaller memory nodes rather than larger
memory nodes if either will satisfy a job's requirements. The units
of weight are arbitrary, but larger weights should be assigned to
nodes with more processors, memory, disk space, higher processor
speed, etc. Note that if a job allocation request can not be
satisfied using the nodes with the lowest weight, the set of nodes
with the next lowest weight is added to the set of nodes under
consideration for use (repeat as needed for higher weight values).
If you absolutely want to minimize the number of higher weight nodes
allocated to a job (at a cost of higher scheduling overhead), give
each node a distinct Weight value and they will be added to the pool
of nodes being considered for scheduling individually. The default
value is 1."
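To check my reading of that paragraph, here is a rough Python sketch of
the behavior as I understand it from the quoted text. It is only a toy
model, not Slurm's actual selection code: the function name, node names,
and dictionary fields are all made up for illustration, and it ignores
memory, the topology plugin, and everything else the real scheduler
considers.

```python
from itertools import groupby


def select_nodes(nodes, nodes_needed, cpus_per_node):
    """Toy model of the documented Weight behavior (not Slurm's real code).

    Nodes are grouped into tiers by weight. Tiers are added to the
    candidate pool from lowest weight to highest until the request fits.
    """
    ordered = sorted(nodes, key=lambda n: n["weight"])
    pool = []
    for _, tier in groupby(ordered, key=lambda n: n["weight"]):
        # Only nodes that can satisfy the per-node CPU request are usable.
        pool.extend(n for n in tier if n["cpus"] >= cpus_per_node)
        if len(pool) >= nodes_needed:
            return [n["name"] for n in pool[:nodes_needed]]
    return None  # request cannot be satisfied by any set of nodes


# Hypothetical cluster with Weight proportional to core count.
cluster = [
    {"name": "big01", "cpus": 64, "weight": 64},
    {"name": "small01", "cpus": 16, "weight": 16},
    {"name": "small02", "cpus": 16, "weight": 16},
]

print(select_nodes(cluster, 1, 1))   # ['small01']
print(select_nodes(cluster, 1, 32))  # ['big01']
```

If this model matches the documentation, a 1-core request should land
on a small node, and only the value ordering of the weights matters,
not their magnitude. That is the opposite of what I am seeing.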
Thanks,
Paul.