Thank you. This is good and helpful: I've understood the documentation correctly. Is there anything I could/should check to help diagnose why things are behaving counter to my expectations?
Thanks,
Paul

------ Original message ------
From: [email protected]
Date: Mon, Dec 21, 2015 18:05
To: slurm-dev;
Subject: [slurm-dev] Re: Node weight working opposite from expected

Works for me as documented

Quoting "Wiegand, Paul" <[email protected]>:

> Good evening,
>
> We have configured our cluster so that the "Weight" field for each
> node is proportional to the number of cores in the blade. The
> thinking was that we would like Slurm to allocate smaller nodes
> first, unless people explicitly ask for larger resources; however,
> Slurm seems to be scheduling things in precisely the opposite way:
> Nodes with larger weights are being scheduled before nodes with
> smaller weights. Even if I ask for 1 core on 1 node with limited
> memory, I get assigned to the biggest node with the most memory ...
> even though its weight is larger.
>
> I guess it is possible that my understanding is reversed from what
> is actually so; however, I've included the relevant quote from the
> slurm.conf documentation below. I wonder: Does the actual *value*
> matter? That is, should the weights be orders of magnitude
> different from one another, or is it just ranked? (I've assumed
> that the order matters but the value is not important). Also, does
> the topo plugin change things?
>
>
> From slurm.conf docs:
>
> "Weight
> The priority of the node for scheduling purposes. All things being
> equal, jobs will be allocated the nodes with the lowest weight which
> satisfies their requirements. For example, a heterogeneous
> collection of nodes might be placed into a single partition for
> greater system utilization, responsiveness and capability. It would
> be preferable to allocate smaller memory nodes rather than larger
> memory nodes if either will satisfy a job's requirements. The units
> of weight are arbitrary, but larger weights should be assigned to
> nodes with more processors, memory, disk space, higher processor
> speed, etc.
> Note that if a job allocation request can not be
> satisfied using the nodes with the lowest weight, the set of nodes
> with the next lowest weight is added to the set of nodes under
> consideration for use (repeat as needed for higher weight values).
> If you absolutely want to minimize the number of higher weight nodes
> allocated to a job (at a cost of higher scheduling overhead), give
> each node a distinct Weight value and they will be added to the pool
> of nodes being considered for scheduling individually. The default
> value is 1."
>
> Thanks,
> Paul.
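
For reference, the selection rule described in the quoted slurm.conf text can be sketched in Python. This is only a toy model of the documented behaviour, not Slurm's scheduler code: the function name candidate_pool, the (name, weight, cpus) tuples, and the node names are all hypothetical, and it ignores topology, memory, and everything else the real scheduler considers.

```python
from itertools import groupby

def candidate_pool(nodes, nodes_needed, min_cpus=1):
    """Return the node names under consideration for a job.

    nodes is a list of hypothetical (name, weight, cpus) tuples.
    Nodes that cannot satisfy the request are dropped first; the rest
    are added one weight class at a time, lowest weight first, until
    the pool holds at least nodes_needed nodes.
    """
    eligible = sorted(
        (n for n in nodes if n[2] >= min_cpus),
        key=lambda n: n[1],  # sort by weight, lowest first
    )
    pool = []
    for _, same_weight in groupby(eligible, key=lambda n: n[1]):
        pool.extend(name for name, _, _ in same_weight)
        if len(pool) >= nodes_needed:
            break  # no need to consider higher-weight nodes
    return pool

# Example cluster with weight set equal to the core count.
nodes = [
    ("big1", 32, 32),
    ("small1", 8, 8),
    ("small2", 8, 8),
    ("mid1", 16, 16),
]

print(candidate_pool(nodes, 1))               # ['small1', 'small2']
print(candidate_pool(nodes, 3))               # ['small1', 'small2', 'mid1']
print(candidate_pool(nodes, 1, min_cpus=16))  # ['mid1']
```

Under this model only the ordering of the weights matters, not their magnitude, and nodes sharing a weight enter the pool together, which is why the documentation suggests distinct Weight values when the number of higher-weight nodes must be minimized.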
