Certain combinations of topology configuration and the srun -N option cause
select/cons_res to spuriously reject jobs with "Requested node
configuration is not available". The following example illustrates the
problem.

[sulu] (slurm) etc> cat slurm.conf
...
TopologyPlugin=topology/tree 
SelectType=select/cons_res
SelectTypeParameters=CR_Core
...

[sulu] (slurm) etc> cat topology.conf
SwitchName=s1 Nodes=xna[13-26]
SwitchName=s2 Nodes=xna[41-45]
SwitchName=s3 Switches=s[1-2]

[sulu] (slurm) etc> sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
...
jkob         up   infinite      4   idle xna[14,19-20,41]
...

[sulu] (slurm) etc> srun -N 2-4 -n 4 -p jkob hostname
srun: Force Terminated job 79
srun: error: Unable to allocate resources: Requested node configuration is 
not available

The problem does not occur with select/linear or topology/none, or if -N
is omitted, or for certain other values of -N (for example, -N 4-4 and
-N 2-3 work fine). The problem appears to be in function _eval_nodes_topo in
src/plugins/select/cons_res/job_test.c. The srun man page states that when 
-N is used, "the job will be allocated as many nodes as possible within 
the range specified and without delaying the initiation of the job." 
Consistent with this description, the requested number of nodes in the
above example is 4 (req_nodes=4). However, the code that selects the
best-fit topology switches appears to make the selection based on the
minimum required number of nodes (min_nodes=2). It therefore selects
switch s1, which has only 3 nodes in partition jkob. Since this is fewer
than req_nodes, the job is rejected with the "node configuration" error.
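
To make the suspected behaviour concrete, here is a simplified,
self-contained sketch of the selection logic as I understand it. This is
not the actual Slurm code: the struct, the loop, and the node counts are
all illustrative (the counts match the jkob example above).

#include <stdio.h>

struct sw { const char *name; int avail_nodes; };

int main(void)
{
    struct sw switches[] = {
        { "s1", 3 },  /* xna[14,19-20] in partition jkob */
        { "s2", 1 },  /* xna[41] */
        { "s3", 4 },  /* top-level switch spanning s1 and s2 */
    };
    int min_nodes = 2, req_nodes = 4;
    struct sw *best = NULL;

    /* Best fit: the smallest switch with at least min_nodes available
     * nodes. Using min_nodes here picks s1 (3 >= 2), even though s3 is
     * the only switch that can supply req_nodes nodes. */
    for (int i = 0; i < 3; i++) {
        if (switches[i].avail_nodes >= min_nodes &&
            (!best || switches[i].avail_nodes < best->avail_nodes))
            best = &switches[i];
    }
    printf("selected %s with %d available nodes\n",
           best->name, best->avail_nodes);

    /* The allocation then still wants req_nodes nodes, which the
     * selected switch cannot supply, so the job is rejected. */
    if (best->avail_nodes < req_nodes)
        printf("error: Requested node configuration is not available\n");
    return 0;
}

Selecting on req_nodes instead (4 >= 4) would pick s3 and the job would
fit, which is consistent with -N 4-4 working.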

I'm not sure exactly where the code goes wrong. It could be in the
calculation of the number of needed nodes in function _enough_nodes, or
in the code that initializes and updates req_nodes or rem_nodes (see the
sketch below). I don't feel confident that I understand the logic well
enough to propose a fix without introducing a regression.
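
For reference, this is my reading of the needed-nodes calculation,
written as a standalone sketch. The helper below is a paraphrase of what
I believe _enough_nodes computes, not the actual source, and it assumes
rem_nodes starts out equal to req_nodes.

#include <stdbool.h>
#include <stdio.h>

/* Paraphrase (not the actual source): a switch has "enough" nodes if
 * its available count covers the remaining requirement, relaxed by the
 * slack between req_nodes and min_nodes. */
static bool enough_nodes(int avail_nodes, int rem_nodes,
                         int min_nodes, int req_nodes)
{
    int needed = rem_nodes;
    if (req_nodes > min_nodes)
        needed = rem_nodes + min_nodes - req_nodes;
    return avail_nodes >= needed;
}

int main(void)
{
    /* With min_nodes=2, req_nodes=4, rem_nodes=4, switch s1's three
     * available nodes pass the check (needed = 4 + 2 - 4 = 2), which
     * would explain why s1 is accepted as the best fit. */
    printf("%d\n", enough_nodes(3, 4, 2, 4));
    return 0;
}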

Regards,
Martin
