Hi everyone. I'm still testing Slurm 2.4 (taken from git today), but this is a
pattern I've noticed for at least 4-5 months.
I'm scheduling 900 jobs on a partition, using multifactor priorities and
accounting.
Sometimes (usually after a few days), I notice that freshly submitted jobs all
have priority "1":
JOBID   NAME        ST  TIME  PRIOR
459128  MERLQ_2129  PD  0:00  1
459127  MERLQ_2129  PD  0:00  1
459126  MERLQ_2129  PD  0:00  1
459125  MERLQ_2129  PD  0:00  1
Let's pick the first one:
# sprio -j 459128
Unable to find jobs matching user/id(s) specified
But how is that even possible?
# sacct -j 459128
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
459128       MERLQ_212+      batch    default          0    PENDING      0:0
Any clue about what might be causing this?
Older jobs do have a priority and can be queried using sprio.
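
In case it helps anyone reproduce this, here is a minimal sketch for listing the affected job IDs. It assumes squeue's %i (job ID) and %Q (integer priority) format fields are available in your version; stuck_jobs is just a helper name I made up, not anything that ships with Slurm.

```shell
#!/bin/sh
# Print the IDs of jobs whose integer priority is exactly 1.
# Expects "JOBID PRIORITY" pairs on stdin, one job per line.
stuck_jobs() {
    awk '$2 == 1 { print $1 }'
}

# Assumed usage on a live cluster (pending jobs only, no header line):
#   squeue -h -t PD -o "%i %Q" | stuck_jobs
```

Each ID it prints can then be passed to sprio -j to check whether the priority plugin knows about that job.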