Hello there,

I'd like Slurm to behave in the following way and need help
configuring it to do so. Here's an example of the intended
usage:

The cluster has several nodes, with 50 cores in total across all nodes.

Action 1: 5:00pm: User 1 submits 50 1-core jobs, each taking several hours
Result 1: all 50 jobs run

Action 2: 5:20pm: User 2 submits 30 1-core jobs
Result 2: 25 jobs of User 1 are REQUEUED, 25 jobs of User 2 are run
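
To make the arithmetic behind Result 2 concrete, here is a rough sketch
of the split we have in mind. It is plain Python, not Slurm code, and
the function name `equal_share` and the user names are made up purely
for illustration:

```python
def equal_share(total_cores, demand):
    """Split total_cores among users as evenly as their demand allows.

    demand maps user name -> number of cores requested.
    Returns user name -> number of cores granted.
    """
    grant = {user: 0 for user in demand}
    remaining = total_cores
    # Users who still want more cores than they have been granted.
    active = {u for u, d in demand.items() if d > 0}
    while remaining > 0 and active:
        share = remaining // len(active)
        if share == 0:
            break  # fewer leftover cores than users; left idle in this sketch
        for user in sorted(active):
            give = min(share, demand[user] - grant[user])
            grant[user] += give
            remaining -= give
        # Cores not needed by satisfied users are redistributed next round.
        active = {u for u in active if grant[u] < demand[u]}
    return grant


print(equal_share(50, {"user1": 50, "user2": 30}))
```

With the numbers from the example, each user ends up with 25 cores, so
25 of User 1's running jobs would have to be requeued.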

We wrote a preemption plugin to accomplish this, and it sorta
works, but we run into problems preempting running jobs that have
fewer cores than the pending job. For example, if User 1 is running
50 1-core jobs, and User 2's jobs require 2 cores each, we're having
trouble making the preemption plugin requeue 2 of User 1's jobs.
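
The selection we are after looks roughly like the sketch below. This is
plain Python for illustration only, not our plugin and not the Slurm
plugin API. The name `pick_victims` and the job IDs are hypothetical,
and the sketch ignores node placement (it assumes any freed cores can
serve the pending job):

```python
def pick_victims(running, cores_needed):
    """Choose running jobs to requeue so that cores_needed cores are freed.

    running is a list of (job_id, cores) tuples for the over-share user,
    already sorted in the order we would prefer to preempt them.
    Returns the job IDs to requeue, or an empty list if not enough
    cores can be freed.
    """
    victims = []
    freed = 0
    for job_id, cores in running:
        if freed >= cores_needed:
            break
        victims.append(job_id)
        freed += cores
    return victims if freed >= cores_needed else []


# A pending 2-core job should cause two 1-core jobs to be requeued.
print(pick_victims([(101, 1), (102, 1), (103, 1)], 2))
```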

Basically, before we work on the plugin any more, we'd like to know if
there is an easier way to do this that doesn't require writing a plugin.
In general, we want to guarantee that each user gets an equal share
of the cores, with running jobs requeued as needed to enforce it.

Thanks in advance!

  mike
