Hi,
I've noticed that our Slurm 14.11.4 installation has developed a disparity
between what slurmctld believes is total usage (via sshare) and what
slurmdbd believes (via sreport), as exhibited below.

Rectifying this is desirable, as I suspect it is the reason some projects
have been able to acquire negative balances (we use sbank). Currently I
trust the slurmdbd view of things. Is there a simple procedure to realign
slurmctld - e.g. would stopping slurmctld, deleting the assoc_usage
checkpoint file, and restarting slurmctld do the desired thing? I
discovered an earlier thread in which editing this file corrected a
similar problem, which seems more dangerous.
Thanks for any advice -
Stuart
sreport -t hours cluster AccountUtilizationByUser account=a_project start=2014-02-01T00:00:00 end=2015-05-28T00:00:00 | head -7
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2014-02-01T00:00:00 - 2015-05-27T23:59:59 (41554800 secs)
Time reported in CPU Hours
--------------------------------------------------------------------------------
  Cluster         Account     Login     Proper Name       Used     Energy
--------- --------------- --------- --------------- ---------- ----------
     hpcs       a_project                              1443228          0
sshare --long -A a_project
Account                    User Raw Shares Norm Shares   Raw Usage  Norm Usage Effectv Usage  FairShare  GrpCPUMins      CPURunMins
-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ----------- ---------------
a_project                              121    0.012100  5026671760    0.007770      0.016086   0.397927   100096860          277306
i.e. Raw Usage = 5026671760 CPU seconds / 3600 = 1396297.7 core hours,
compared with the 1443228 CPU hours reported by sreport.
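
As a quick cross-check of the two figures, here is a minimal Python sketch
of the arithmetic. It assumes, as the division by 3600 does, that the sshare
Raw Usage value is in CPU seconds and that it is directly comparable with
the sreport total; the function name compare_usage is purely illustrative.

```python
# Values copied from the sreport and sshare output in this message.
SREPORT_CPU_HOURS = 1443228        # slurmdbd view ("Used" column of sreport)
SSHARE_RAW_USAGE = 5026671760      # slurmctld view ("Raw Usage" column of sshare)


def compare_usage(dbd_hours, ctld_raw_seconds):
    """Return (ctld_hours, diff_hours, diff_percent).

    Assumption: ctld_raw_seconds is in CPU seconds and covers the same
    usage as dbd_hours, so the two can be compared directly.
    """
    ctld_hours = ctld_raw_seconds / 3600
    diff_hours = dbd_hours - ctld_hours
    diff_percent = 100 * diff_hours / dbd_hours
    return ctld_hours, diff_hours, diff_percent


ctld_hours, diff_hours, diff_percent = compare_usage(
    SREPORT_CPU_HOURS, SSHARE_RAW_USAGE
)
print(f"slurmctld (sshare):  {ctld_hours:.1f} core hours")
print(f"slurmdbd (sreport):  {SREPORT_CPU_HOURS} core hours")
print(f"difference:          {diff_hours:.1f} core hours ({diff_percent:.2f}%)")
```

On these numbers slurmctld is roughly 47000 core hours (a little over 3%)
below what slurmdbd reports for the account.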
--
Dr. Stuart Rankin
Senior System Administrator
High Performance Computing Service
University of Cambridge
Email: [email protected]
Tel: (+)44 1223 763517