Yuri, turn on the Priority debug flag (DebugFlags=Priority) in slurm.conf and
see what is happening. Perhaps that will shed some light on the subject. You
can do it from sview, or edit slurm.conf and run scontrol reconfig, without
having to restart slurmctld.
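
A minimal sketch of the slurm.conf route, assuming the stock option and
command names; check both against the man pages for your version. The extra
priority messages should then show up in the slurmctld log.

    (in slurm.conf)
    DebugFlags=Priority

    # scontrol reconfigure
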
Danny
On 02/07/12 10:13, Yuri D'Elia wrote:
Hi everyone. I'm still testing slurm 2.4 (taken from git today), but this is a
pattern I've noticed for at least 4-5 months.
I'm scheduling 900 jobs on a partition, using multifactor priorities and
accounting.
Sometimes (usually after a few days), I notice that freshly submitted jobs all
have priority "1":
 JOBID  NAME        ST  TIME  PRIOR
459128  MERLQ_2129  PD  0:00      1
459127  MERLQ_2129  PD  0:00      1
459126  MERLQ_2129  PD  0:00      1
459125  MERLQ_2129  PD  0:00      1
Let's pick the first one:
# sprio -j 459128
Unable to find jobs matching user/id(s) specified
But how is that even possible?
# sacct -j 459128
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
459128       MERLQ_212+      batch    default          0    PENDING      0:0
Any clue about what might be causing this?
Older jobs do have a priority and can be queried using sprio.