Hi Mike, I don't have a solution for you, but here are some ideas.
The SLURM preemption logic was designed to preempt jobs based upon QOS (Quality Of Service) or job queue/partition. It was not designed to achieve fairness, so there isn't a configuration parameter that you are missing to do what you describe. It's going to require new code.
Perhaps the simplest thing to do would be to use a daemon to periodically check the status of the jobs in the system and change the QOS of running jobs. Users getting more than their fair share of resources would have their younger jobs put into a QOS that is preemptable. Under-served users could have their running jobs' QOS changed (if appropriate) from a preemptable to a non-preemptable QOS. Then you could rely upon existing SLURM code to handle the preemption. You would need to run with the SlurmDBD and configure QOS in the database, but if you already use SlurmDBD, that will require almost no effort. The existing squeue command could be modified to do what you want relatively easily, and it should be easily re-used across all versions of SLURM. Making a plugin of your own would work and should be lighter-weight, but would be more complex and likely require more maintenance.
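
To make the daemon idea a bit more concrete, here is a rough sketch of the decision step only, not a tested implementation. It assumes an equal per-user share (total cores divided by the number of users with running or pending jobs) and protects each user's oldest jobs first. The QOS names "normal" and "scavenger", the Job tuple, and the function names are all made up for illustration; you would substitute whatever QOS you define in the database, and feed it job data gathered however you like (e.g. parsed from squeue output).

```python
from collections import defaultdict
from typing import NamedTuple


class Job(NamedTuple):
    job_id: int
    user: str
    cores: int
    start_time: float  # epoch seconds; smaller means older


def plan_qos(running, pending_users, total_cores,
             protected_qos="normal", preemptable_qos="scavenger"):
    """Return {job_id: qos} for running jobs (QOS names are hypothetical)."""
    # A user is "active" if they have running or pending work.
    active = {job.user for job in running} | set(pending_users)
    if not active:
        return {}
    share = total_cores / len(active)  # assumed policy: equal cores per user

    by_user = defaultdict(list)
    for job in running:
        by_user[job.user].append(job)

    plan = {}
    for jobs in by_user.values():
        used = 0
        # Walk oldest first; a job stays protected while it fits in the share.
        for job in sorted(jobs, key=lambda j: (j.start_time, j.job_id)):
            if used + job.cores <= share:
                used += job.cores
                plan[job.job_id] = protected_qos
            else:
                plan[job.job_id] = preemptable_qos
    return plan


def scontrol_command(job_id, qos):
    """Build (but do not run) the scontrol call that would apply one change."""
    return ["scontrol", "update", f"JobId={job_id}", f"QOS={qos}"]
```

With your example (User 1 running 50 1-core jobs on 50 cores, User 2 pending), this would leave User 1's 25 oldest jobs in the protected QOS and move the 25 youngest into the preemptable one. The daemon would then apply each change with scontrol and leave the actual preemption to SLURM.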
Moe Jette
SchedMD LLC

Quoting Mike Schachter <[email protected]>:
Hello there,

I want slurm to work in the following way and need help on how to configure it to do so. Here's an example of the usage:

Cluster has several nodes, 50 cores across all nodes.

Action 1: 5:00pm, User 1 submits 50 1-core jobs which take several hours
Result 1: all jobs run
Action 2: 5:20pm, User 2 submits 30 1-core jobs
Result 2: 25 jobs of User 1 are REQUEUED, 25 jobs of User 2 are run

We wrote a preemption plugin to accomplish this, and it sorta works, but we run into problems preempting running jobs that have fewer cores than the pending job. For example, if User 1 is running 50 1-core jobs, and User 2's jobs require 2 cores, we're having trouble making the preemption plugin requeue 2 of User 1's jobs.

Basically, before we work on the plugin any more, we'd like to know if there is an easier way to do this that doesn't require writing a plugin. In general, we want to guarantee that each user gets an equal number of cores, and that running jobs are requeued to provide this functionality.

Thanks in advance!
mike
