Moe,

That's quite an obfuscated answer! I was looking for a "yes, this is the expected behavior" or "no, something is amiss."

In the case presented, I'll say again, it is clear that the waiting job, number 300, can run now. It has free cores, and job 201 will still have plenty of cores available when the job it is waiting on finishes. Yet job 300 does not start, simply because the time it requires would overlap the expected start time of the higher-priority waiting job, #201.

But the assertion that job 201 would be held up by starting job 300 is completely incorrect in this case.

Now if this is the way the scheduler works, by being simple-minded about time constraints, then it is what it is. I'm only asking whether this is the expected behavior. I think you are trying to say that it is.

Sincerely,
Bill


On 2/20/2014 1:21 PM, Moe Jette wrote:

Slurm uses what is known as a conservative backfill scheduling algorithm. No job will be started that adversely impacts the expected start time of _any_ higher priority job. The scheduling can also be affected by a job's requirements for memory, generic resources, licenses, and resource limits.

Moe Jette
SchedMD LLC


Quoting Bill Wichser <[email protected]>:


Just a question on expected behavior of the backfill scheduler. This is an SMP machine if that matters. Scheduler is backfill with no preemption.

I have a number of jobs queued. There are three which matter, ordered by priority. In the current state I have 60 free cores.

job 201 needs 200 cores and will start in 1 hour requiring 24 hours of runtime
job 202 needs 250 cores and will start in 5 hours requiring 24 hours of runtime
...
job 300 needs 30 cores and will start in 300 hours requiring 2 hours of runtime

The job completing in 1 hour will free 252 cores.

Clearly, starting job 300 will not impact job 201's start time in any way. Yet it will not start since the time overlaps the expected 1 hour start time of job 201. Is this the expected behavior? I haven't yet checked the source code to verify that this just looks at the trivial impact on the next job but I'd expect the scheduler to be able to look a little deeper than this.
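
The arithmetic behind that claim can be checked with a minimal Python sketch. It counts cores only and ignores memory, generic resources, licenses, node layout, and limits, so it is not how Slurm itself decides. The function names and the event list are hypothetical, made up for illustration. Only the core counts and times come from the numbers above.

```python
# Sketch only: counts cores and nothing else. Names are hypothetical,
# not Slurm internals.

def free_cores_at(t, free_now, events):
    """Free cores at hour t, given (hour, core_delta) events."""
    return free_now + sum(delta for when, delta in events if when <= t)


def can_backfill(cores, runtime, free_now, events):
    """True if `cores` remain free throughout [0, runtime) after all
    reservations in `events` have been honored."""
    # Free cores only change at event times, so checking those is enough.
    checkpoints = [0.0] + [when for when, _ in events if 0 < when < runtime]
    return all(free_cores_at(t, free_now, events) >= cores for t in checkpoints)


# Numbers from the post: 60 cores free now; at hour 1 the running job
# releases 252 cores and job 201 claims 200 of them.
# Job 202's reservation at hour 5 falls outside job 300's 2-hour window.
events = [(1.0, +252), (1.0, -200)]

print(can_backfill(30, 2.0, 60, events))  # job 300: 30 cores for 2 hours
```

On a cores-only view the check passes: 60 cores are free at hour 0, and 112 remain free at hour 1 after job 201 takes its 200. If the scheduler still holds job 300, the cause may lie in something this sketch leaves out.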

Bill

