Hi there,
I've recently set up SLURM at our office and have been struggling to get
node weights to work as expected.
Our SLURM configuration is as follows:
- Tree topology is set up for our clusters (we have many separate
clusters, so each individual cluster has its own switch defined, and
all users' jobs request a maximum of 1 switch)
- Weights have been assigned to the systems in the main partition
(in increments of 100)
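To make the setup concrete, here is a stripped-down sketch of the kind of
configuration described above. The hostnames, switch names, node counts
and exact weight values are placeholders for illustration only, not our
real configuration:

```
# slurm.conf (excerpt): hostnames and weights are made-up placeholders
TopologyPlugin=topology/tree
NodeName=clusterx[01-02] Weight=100
NodeName=clusterx[03-04] Weight=200
NodeName=clustery[01-02] Weight=300
NodeName=clustery[03-04] Weight=400
PartitionName=main Nodes=clusterx[01-04],clustery[01-04] Default=YES State=UP

# topology.conf: one leaf switch per cluster, with no common parent
# switch, so that a single job cannot span two clusters
SwitchName=clusterx Nodes=clusterx[01-04]
SwitchName=clustery Nodes=clustery[01-04]
```

Jobs are then submitted with a switch limit, along the lines of
"sbatch --switches=1 job.sh" (job.sh being a placeholder script name).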
As I understand it, the lower-weighted nodes should be used first.
However, I saw a discussion on the slurm-devel Google group
(https://groups.google.com/forum/#!topic/slurm-devel/_hppceF2cEw)
which states that topology overrides weight. This matches the
following note I have seen in the topology documentation:
"NOTE: Slurm first identifies the network switches which provide the
best fit for pending jobs and then selects the nodes with the lowest
"weight" within those switches. If optimizing resource selection by node
weight is more important than optimizing network topology then do NOT
use the topology/tree plugin"
The problem here is that we require both the topology (as nodes in
different clusters cannot communicate with one another) and the weights.
Is there a way to utilise both weight & topology? Assigning weights
to switches would be a suitable workaround.
Alternatively, is there a different way of getting the restriction that
the topology plugin provides, i.e. confining each job to the nodes of a
single cluster (if it is running on clusterX nodes, then it must not use
clusterY nodes)?
Thanks in advance,
Stu