I have 17 idle nodes in one partition:

gpu          up   infinite      3  down* gpu-1-[12-13,15]
gpu          up   infinite      9    mix gpu-1-[4-11,17]
gpu          up   infinite     17   idle gpu-1-[14,16],gpu-2-[4-17],gpu-3-9

but jobs stay pending with reason (Resources). At least 8 more jobs
should be running, given the features they request. It seems Slurm only
takes nodes from rack 1 and not from rack 2. I changed the node weight
as well as the partition priority, but it made no difference.
I also requeued the jobs, but that didn't help either.
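
To make the rack split explicit, here is a minimal Python sketch that
expands the idle hostlist from the sinfo output above and counts nodes per
rack. It assumes the node names follow the pattern gpu-<rack>-<index>; the
function names are made up for illustration, and zero-padded ranges are not
handled. On the cluster itself, "scontrol show hostnames" does the
expansion.

```python
import re
from collections import Counter

# Idle hostlist copied from the sinfo output above.
IDLE = "gpu-1-[14,16],gpu-2-[4-17],gpu-3-9"


def expand_hostlist(expr):
    """Expand a simple Slurm-style hostlist into individual host names."""
    hosts = []
    # Split on commas that are not inside square brackets.
    for part in re.split(r",(?![^\[]*\])", expr):
        m = re.fullmatch(r"(.*)\[(.+)\]", part)
        if not m:
            hosts.append(part)  # plain host name, e.g. gpu-3-9
            continue
        prefix, ranges = m.groups()
        for r in ranges.split(","):
            lo, _, hi = r.partition("-")
            for n in range(int(lo), int(hi or lo) + 1):
                hosts.append(f"{prefix}{n}")
    return hosts


def idle_per_rack(expr):
    """Count hosts per rack, assuming names look like gpu-<rack>-<index>."""
    return Counter(h.split("-")[1] for h in expand_hostlist(expr))


print(idle_per_rack(IDLE))
```

By this count, 14 of the 17 idle nodes are in rack 2, 2 are in rack 1 and
1 is in rack 3.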

How can I get Slurm to start jobs on ALL available nodes?


Thanks
Eva
