Moe, thank you for your prompt reply. We already use various Slurm resource limits, and we also use the Slurm QOS facilities in our scheduling model. We do not use preemption in our current model. My question was specifically about Slurm support for configuring soft and hard resource limits. Please see the scenario below:
The cluster has 200 nodes.
User A and User B use the same QOS and have the same resource limits.
User A submits 1000 single-node jobs and has a maxjob limit of 100.
User B submits 200 single-node jobs and has a maxjob limit of 100.
When User B's 200 jobs complete, User A can still have only 100 running jobs, so the cluster is 50% underutilized.

A soft/hard limit feature would help keep the cluster fully utilized as long as there are jobs in the queue. If User A and User B in the above scenario had limits maxjob1=100[200], where 100 is the soft limit and 200 is the hard limit, then User A could run 200 jobs whenever there is no demand from other users, and the cluster would stay fully utilized.

Can this soft/hard limit functionality currently be achieved in Slurm? If it is not currently possible, I would like to know how many people would consider this a useful feature and whether it is worth putting some development effort into it.

Best regards,
Wojciech

On 17 July 2013 16:40, Moe Jette <[email protected]> wrote:

> Slurm has an assortment of hard limits available:
> http://www.schedmd.com/slurmdocs/resource_limits.html
>
> Slurm also supports various Qualities Of Service (QOS):
> http://www.schedmd.com/slurmdocs/qos.html
>
> Plus job preemption:
> http://www.schedmd.com/slurmdocs/preempt.html
>
> In a typical scenario, there would be a low priority QOS, say "standby",
> whose jobs can be preempted as needed for higher priority work. Another
> option is a low priority job queue (partition), again with preemption.
>
> Quoting Wojciech Turek <[email protected]>:
>
>> We are migrating our scheduling system from torque/maui/moab to slurm and
>> there is a particularly important moab/maui feature [hard and soft limits]
>> which does not seem to be implemented yet in slurm. Please see below a link
>> to a description of that feature:
>> http://docs.adaptivecomputing.com/mwm/archive/6-0/6.2throttlingpolicies.php#limits
>>
>> My questions are:
>> a) Am I missing something, and the soft/hard limits feature actually is
>> implemented in slurm?
>> b) No, this feature does not exist, but is there an alternative way of
>> doing this in slurm?
>> c) No, this feature does not exist, but would implementing it in slurm be
>> easy or difficult?
>>
>> Caveat:
>> I would like to avoid cronjob-like solutions that would change limits in
>> flight according to cluster state.
>>
>> Many thanks for all your help
>>
>> --
>> Wojciech Turek
>>
>> Senior System Architect
>>
>> High Performance Computing Service
>> University of Cambridge
>

--
Wojciech Turek
Senior System Architect
High Performance Computing Service
University of Cambridge
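
For illustration, the soft/hard limit scenario described at the top of this message can be sketched as a small Python model. This is only a sketch of the requested semantics, not of anything Slurm implements: the function name running_jobs and its parameters are hypothetical, and it assumes single-node jobs, one job per node, no preemption, and a hard limit that is at least the soft limit.

```python
def running_jobs(nodes, queued, soft, hard):
    """Model how many jobs each user runs under a hypothetical soft/hard limit.

    nodes  -- number of nodes in the cluster (one single-node job per node)
    queued -- dict mapping user name to number of jobs waiting or running
    soft   -- per-user limit that applies while other users have demand
    hard   -- per-user limit that is never exceeded
    """
    running = {}
    free = nodes

    # Pass 1: every user may run up to the soft limit.
    for user, jobs in queued.items():
        running[user] = min(jobs, soft, free)
        free -= running[user]

    # Pass 2: nodes that are still idle are handed out up to the hard limit.
    for user, jobs in queued.items():
        extra = max(0, min(jobs - running[user], hard - running[user], free))
        running[user] += extra
        free -= extra

    return running


# Both users have demand: each runs 100 jobs and the 200 nodes are full.
print(running_jobs(200, {"A": 1000, "B": 200}, soft=100, hard=100))

# User B has finished; with only a hard limit of 100, half the nodes sit idle.
print(running_jobs(200, {"A": 800, "B": 0}, soft=100, hard=100))

# Same situation with maxjob 100[200]: User A expands to fill the cluster.
print(running_jobs(200, {"A": 800, "B": 0}, soft=100, hard=200))
```

In this model the soft limit only matters while the cluster is contended; once User B's queue is empty, the second pass lets User A grow to the hard limit instead of leaving 100 nodes idle.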
