Hi,
The problem we encounter is that when a job is suspended its raw usage
continues to grow.
Our cluster uses slurm 2.3.0. Here are some of the parameters of slurm.conf:
PriorityType=priority/multifactor
PriorityDecayHalfLife=0
SelectType=select/cons_res
SelectTypeParameters=CR_CPU
PreemptType=preempt/partition_prio
PreemptMode=SUSPEND,GANG
PartitionName=def Nodes=... Shared=FORCE:1 Priority=80 PreemptMode=off
PartitionName=low Nodes=... Shared=FORCE:1 Priority=20 PreemptMode=suspend
Each account is given a quota of cpu hours that is set using the command
'sacctmgr modify account=... set GrpCPUMins=...'.
The usage of an account can be read from the 'Raw Usage' value reported
by the sshare command. It is this value that continues to grow even
when all the jobs of this account are in the SUSPENDED state.
The value 'Used' obtained using the command 'sreport cluster
AccountUtilizationByUser' seems to be affected by this problem too.
By contrast, summing the CPUTimeRAW values given by the sacct command
for each job of a given account seems to give sensible results.
This behaviour seems related to the following test in the function
_decay_thread() in
src/plugins/priority/multifactor/priority_multifactor.c in slurm 2.3.4:
    [...]
    while ((job_ptr = list_next(itr))) {
        /* apply new usage */
        if (!IS_JOB_PENDING(job_ptr) &&
            job_ptr->start_time && job_ptr->assoc_ptr) {
    [...]
This test is also present in the git repository, in the same file, in
the function decay_apply_new_usage().
I wonder if replacing the test with:
    if (!IS_JOB_PENDING(job_ptr) && !IS_JOB_SUSPENDED(job_ptr) && ...
would solve the problem?
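
To make the question concrete, here is a minimal standalone sketch of the two tests side by side. It does not use the Slurm headers: struct sketch_job, the SKETCH_* states, and the functions accrues_current(), accrues_proposed() and usage_for_interval() are all hypothetical stand-ins invented for illustration, and the usage formula (CPUs times interval length) is an assumption that leaves out decay and everything else the real plugin does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Slurm's job states; the real definitions
 * live in the Slurm headers and differ in detail. */
enum sketch_job_state { SKETCH_PENDING, SKETCH_RUNNING, SKETCH_SUSPENDED };

/* Hypothetical, heavily reduced job record. */
struct sketch_job {
    enum sketch_job_state state;
    long start_time;   /* 0 until the job has started */
    bool has_assoc;    /* stands in for job_ptr->assoc_ptr != NULL */
    uint32_t cpus;     /* CPUs allocated to the job */
};

/* Mirrors the test quoted above: only PENDING jobs are excluded,
 * so a SUSPENDED job still passes. */
static bool accrues_current(const struct sketch_job *job)
{
    return job->state != SKETCH_PENDING &&
           job->start_time && job->has_assoc;
}

/* Mirrors the proposed test: SUSPENDED jobs are excluded as well. */
static bool accrues_proposed(const struct sketch_job *job)
{
    return job->state != SKETCH_PENDING &&
           job->state != SKETCH_SUSPENDED &&
           job->start_time && job->has_assoc;
}

/* CPU-seconds charged for one pass covering interval_secs seconds
 * (assumed formula: cpus * interval, no decay). */
static uint64_t usage_for_interval(const struct sketch_job *job,
                                   uint32_t interval_secs,
                                   bool (*accrues)(const struct sketch_job *))
{
    return accrues(job) ? (uint64_t)job->cpus * interval_secs : 0;
}
```

With a suspended 4-CPU job and a 300-second interval, the current test charges 1200 CPU-seconds in this sketch while the proposed test charges nothing. One thing the sketch also shows: with the proposed test, a job that ran for part of an interval and is suspended when the pass happens is charged nothing for that interval, so the time it actually ran there would be lost unless it is accounted for in some other way.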
Regards,
Stéphane VAILLANT