Slurm is designed to dedicate resources (CPUs, memory, GPUs, etc.) and bind those resources to specific user jobs. Slurm does record system load, and with minor code changes that could be used as a basis for scheduling decisions, but doing so would probably have a severe impact on parallel job performance, and parallel jobs are what Slurm was really designed for.
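
To make the policy requested in the quoted message below concrete, here is a rough sketch of it in Python. It is an illustration only, not Slurm or Torque code: the `Node` class and the `max_load` and `pick_node` functions are hypothetical names, and the threshold formula (110% of the core count, rounded up) is an assumption inferred from the 27 and 36 figures in the question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cores: int
    running_jobs: int = 0  # single-core jobs already placed on this node
    load: float = 0.0      # observed system load, e.g. the 1-minute load average


def max_load(cores: int) -> int:
    """Assumed load ceiling: 110% of the core count, rounded up."""
    return (cores * 110 + 99) // 100


def pick_node(nodes: List[Node]) -> Optional[Node]:
    """Return the eligible node with the fewest running jobs, or None.

    A node is eligible if it still has a free core and its observed
    load has not risen above the ceiling.
    """
    eligible = [
        n for n in nodes
        if n.running_jobs < n.cores and n.load <= max_load(n.cores)
    ]
    if not eligible:
        return None
    # Fewest running jobs first, so jobs spread across nodes before stacking.
    return min(eligible, key=lambda n: n.running_jobs)


if __name__ == "__main__":
    nodes = [Node("n1", cores=24), Node("n2", cores=32)]
    for _ in range(3):
        node = pick_node(nodes)
        if node is not None:
            node.running_jobs += 1
    print([(n.name, n.running_jobs) for n in nodes])
```

Because the choice depends on observed load rather than on dedicated resources, a node can be refused even when it has free cores, which is the behavior described in the question and the trade-off noted above.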
Quoting Mario Kadastik:

> Hi,
>
> is it possible to configure SLURM to use system load as a basis on
> where to schedule as well as when to stop sending new jobs to shared
> resources?
>
> The reason I ask is that we have 24 core and 32 core nodes and
> 99.99% of our jobs are single core jobs. So we'd configure SLURM
> with consumable resources and allow therefore 24 or 32 jobs to share
> the nodes. We'd however like it to be done so that if say we had 170
> work servers the first 170 jobs would each get one of the servers,
> the next 170 would become the second job in each server etc.
>
> The second part is that we'd like during transition period run slurm
> and torque in parallel. This means that we've already set up torque
> to schedule based on system load and pbs_mom has an attribute of
> max_load that is adhered to therefore if the system load rises above
> 10% of the core count (27 for 24 and 36 for 32) there are no more
> jobs scheduled as it's assumed the node is busy. It can lead to even
> blocking the scheduling of further jobs even though count wise there
> may be free slots still. Can the same be done for slurm? This way
> the two could schedule jobs both and only if the cluster really gets
> full would both stop scheduling until some jobs end and the load
> drops below some margin.
>
> If that can be achieved, then can you let me know what I'd need in
> the slurm.conf to make it work like this :) Or is that the default
> behavior.
>
> Thanks,
>
> Mario Kadastik, PhD
> Researcher
>
> ---
> "Physics is like sex, sure it may have practical reasons, but
> that's not why we do it"
> -- Richard P. Feynman
