Hello there, I want Slurm to work in the following way and need help configuring it to do so. Here's an example of the usage:
Cluster has several nodes, with 50 cores across all nodes.

Action 1: 5:00pm, User 1 submits 50 1-core jobs, which take several hours
Result 1: all jobs run
Action 2: 5:20pm, User 2 submits 30 1-core jobs
Result 2: 25 jobs of User 1 are REQUEUED, 25 jobs of User 2 are run

We wrote a preemption plugin to accomplish this, and it sorta works, but we run into problems preempting running jobs that have fewer cores than the pending job. For example, if User 1 is running 50 1-core jobs and User 2's jobs require 2 cores each, we're having trouble making the preemption plugin requeue 2 of User 1's jobs to make room for one of User 2's jobs.

Basically, before we work on the plugin any more, we'd like to know if there is an easier way to do this that doesn't require writing a plugin. In general, we want to guarantee that each user gets an equal number of cores, with running jobs requeued as needed to provide this.

Thanks in advance!
mike
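
P.S. In case it helps to pin down the behavior we're after, here is a rough Python sketch of the policy. This is not our plugin code and it does not use any Slurm API; the function name `plan_requeues` and its inputs (dicts mapping a user to a list of per-job core counts) are made up purely for illustration. It assumes every user with running or pending jobs is entitled to an equal integer share of the cores, and that the most recently listed running jobs are requeued first.

```python
def plan_requeues(total_cores, running, pending):
    """Sketch of the desired equal-share policy (hypothetical, not Slurm code).

    running, pending: dict of user -> list of core counts, one entry per job.
    Returns (share, start, requeue), where start and requeue map each user
    to the core counts of the jobs to start or requeue.
    """
    users = sorted(set(running) | set(pending))
    share = total_cores // len(users)  # equal share per active user
    used = {u: sum(running.get(u, [])) for u in users}

    # Start each user's pending jobs as long as they stay within their share.
    start = {u: [] for u in users}
    for u in users:
        for cores in pending.get(u, []):
            if used[u] + sum(start[u]) + cores <= share:
                start[u].append(cores)

    # Cores that must be freed so that the newly started jobs fit.
    to_free = sum(used.values()) + sum(sum(s) for s in start.values()) - total_cores

    # Requeue jobs of users above their share until enough cores are free.
    # A multi-core victim may leave its owner slightly below the share.
    requeue = {u: [] for u in users}
    for u in users:
        jobs = list(running.get(u, []))
        while to_free > 0 and jobs and used[u] > share:
            cores = jobs.pop()
            used[u] -= cores
            to_free -= cores
            requeue[u].append(cores)

    return share, start, requeue


# The 1-core example from above.
share, start, requeue = plan_requeues(50, {"user1": [1] * 50}, {"user2": [1] * 30})
print(share, len(start["user2"]), len(requeue["user1"]))

# The case we are stuck on: User 2's jobs need 2 cores each.
share2, start2, requeue2 = plan_requeues(50, {"user1": [1] * 50}, {"user2": [2] * 30})
print(share2, len(start2["user2"]), len(requeue2["user1"]))
```

In the second call each 2-core job that starts needs two of User 1's 1-core jobs requeued, which is exactly the step our plugin struggles with.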
