Moe,
That's quite an obfuscated answer! I was looking for a "yes, this is
the expected behavior" or "no, something is amiss."
In the case presented, I'll say again, it is clearly evident that the
waiting job, number 300, can run. There are free cores for it now, and
the higher-priority waiting job will still have plenty of cores
available when the job it is waiting on finishes. Yet job 300 does not
start, simply because the time it requires would overlap the expected
start time of the currently waiting job, #201.
But the assertion that job 201 would be held up by starting job 300 is
completely incorrect in this case.
Now if this is the way the scheduler works, by being simple-minded about
time constraints, then it is what it is. I'm asking only if this
behavior is the expected behavior. I think you are trying to say that
indeed this is the case.
Sincerely,
Bill
On 2/20/2014 1:21 PM, Moe Jette wrote:
Slurm uses what is known as a conservative backfill scheduling
algorithm. No job will be started that adversely impacts the expected
start time of _any_ higher priority job. The scheduling can also be
affected by a job's requirements for memory, generic resources,
licenses, and resource limits.
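
As a rough illustration only, and not Slurm's actual implementation, the
rule can be sketched as a check over every resource a reserved job
needs. The function name `can_backfill` and the memory figures are
hypothetical; the core counts are the ones from the question quoted
below. The sketch assumes the backfilled job is still running when the
higher-priority job is due to start.

```python
def can_backfill(free_now, candidate, reserved, freed_by_start):
    """Return True if starting `candidate` now cannot delay `reserved`.

    All arguments map a resource name to an amount. `freed_by_start` is
    what running jobs release before the reserved job's planned start.
    """
    # The candidate must fit in what is free right now.
    if any(candidate.get(r, 0) > free_now.get(r, 0) for r in candidate):
        return False
    # Assume the candidate is still running when the reserved job is due,
    # so its usage is subtracted from what will be available then.
    for resource, needed in reserved.items():
        available = (free_now.get(resource, 0)
                     + freed_by_start.get(resource, 0)
                     - candidate.get(resource, 0))
        if available < needed:
            return False
    return True


# Cores only (figures from the thread): the small job does no harm.
cores_only = can_backfill({"cores": 60}, {"cores": 30},
                          {"cores": 200}, {"cores": 252})

# Same cores plus HYPOTHETICAL memory figures: memory becomes the blocker.
with_memory = can_backfill({"cores": 60, "mem_gb": 100},
                           {"cores": 30, "mem_gb": 90},
                           {"cores": 200, "mem_gb": 400},
                           {"cores": 252, "mem_gb": 300})

print(cores_only, with_memory)
```

Under these assumptions a job that looks harmless when only cores are
counted can still push back a reservation through another resource.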
Moe Jette
SchedMD LLC
Quoting Bill Wichser:
Just a question on expected behavior of the backfill scheduler. This
is an SMP machine if that matters. Scheduler is backfill with no
preemption.
I have a number of jobs queued. There are three which matter,
ordered by priority. In the current state I have 60 free cores.
  job 201 needs 200 cores, will start in 1 hour, and requires
          24 hours of runtime
  job 202 needs 250 cores, will start in 5 hours, and requires
          24 hours of runtime
  ...
  job 300 needs 30 cores, will start in 300 hours, and requires
          2 hours of runtime
The job completing in 1 hour will free 252 cores.
Clearly, starting job 300 will not impact job 201's start time in any
way. Yet it will not start since the time overlaps the expected 1
hour start time of job 201. Is this the expected behavior? I
haven't yet checked the source code to verify that this just looks at
the trivial impact on the next job but I'd expect the scheduler to be
able to look a little deeper than this.
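
To make the arithmetic explicit, here is a minimal sketch that counts
cores only, using the numbers above. It assumes cores are the sole
constraint (no memory, license, or limit effects) and that job 300
would still be running when job 201 is due, since it needs 2 hours and
job 201 starts in 1. The names are made up for illustration and have
nothing to do with Slurm's code.

```python
FREE_NOW = 60          # cores idle right now
FREED_AT_1H = 252      # cores released by the job finishing in 1 hour
JOB_201 = {"cores": 200, "start_h": 1}
JOB_300 = {"cores": 30, "runtime_h": 2}


def free_cores_at_201_start(start_300_now):
    """Cores available at job 201's planned start time."""
    free = FREE_NOW + FREED_AT_1H
    # Job 300 still holds its cores if it outlasts job 201's start time.
    if start_300_now and JOB_300["runtime_h"] > JOB_201["start_h"]:
        free -= JOB_300["cores"]
    return free


without_300 = free_cores_at_201_start(False)   # 60 + 252
with_300 = free_cores_at_201_start(True)       # 60 + 252 - 30
print(without_300, with_300, with_300 >= JOB_201["cores"])
```

Either way job 201's 200 cores are covered at the 1 hour mark, which is
why I say its start time is not affected as far as cores go.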
Bill