OK, so I have accounting configured and it should give me all the data
that is available about jobs. So why can't I see any reports of usage
by users or groups?
# sreport user top
--------------------------------------------------------------------------------
Top 10 Users 2016-02-17T00:00:00 - 2016-02-17T23:59:59 (86400 secs)
Use reported in TRES Minutes
--------------------------------------------------------------------------------
  Cluster     Login     Proper Name         Account       Used     Energy
--------- --------- --------------- --------------- ---------- ----------
SLURM seems to know that jobs ran, so why doesn't sreport show any?
# sacct
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
223          glob_30_60        cas                     1    RUNNING      0:0
265            par_glob        cas                    20  COMPLETED      0:0
265.batch         batch                               20  COMPLETED      0:0
266           par_glob2        cas                    20  COMPLETED      0:0
266.batch         batch                               20  COMPLETED      0:0
270             idv1765        cas                    20 CANCELLED+      0:0
270.batch         batch                               20  CANCELLED     0:15
271           par_glob3        cas                    20    RUNNING      0:0
Jeff White
HPC Systems Engineer
Information Technology Services - WSU
On 02/16/2016 11:35 PM, Loris Bennett wrote:
Hi Jeff,
Jeff White <[email protected]> writes:
I'm working on getting accounting set up on a new SLURM instance. The
cluster is working, slurmdbd is running, database is configured, sacct
spits out some job info, all appears to be working. Good, I built a
thing and it seems to work. Now the hard part: what do I do with it?
* What exactly is an "account" in SLURM speak? We have well-defined
groups already and I don't want my users to need to specify an account
or anything of the sort with their jobs. What do I need to do (if
anything) to have accounting use purely users and groups and no
manually-defined "accounts"?
My understanding is that an account is a collection of resource
restrictions. If you have well-defined groups, then an account will
correspond to a group. The account model is, however, more general,
because, say, one person could run jobs in various projects which all
have different CPU-time budgets and/or priorities.
However, I also just have research groups and they correspond 1-to-1
with my accounts. The accounts are arranged in a hierarchy (via the
parent organisation property) which corresponds to the organigram of the
university institutes and departments.
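
To make the group-to-account mapping concrete, here is a minimal sketch
that only builds the sacctmgr command lines for a small hierarchy and
prints them (nothing is executed). The account and user names are
hypothetical, and it assumes that giving each user a default account
equal to their group means jobs submitted without an explicit account
are charged to that account. Treat it as an illustration, not as the
setup described in this thread.

```python
def sacctmgr_setup_commands(accounts, members):
    """Build sacctmgr command lines for an account hierarchy.

    accounts: list of (name, parent) tuples, parents listed before their
              children; parent is None for a top-level account.
    members:  dict mapping an account name to a list of user names.
    Returns a list of argument lists; nothing is run here.
    """
    commands = []
    for name, parent in accounts:
        command = ["sacctmgr", "-i", "add", "account", f"name={name}"]
        if parent is not None:
            # Hang the account under its parent in the hierarchy.
            command.append(f"parent={parent}")
        commands.append(command)
    for account, users in members.items():
        for user in users:
            # Assumption: the default account is what a job is charged
            # to when the user does not name an account at submit time.
            commands.append([
                "sacctmgr", "-i", "add", "user", f"name={user}",
                f"account={account}", f"defaultaccount={account}",
            ])
    return commands


if __name__ == "__main__":
    # Hypothetical department -> institute -> research group chain.
    example_accounts = [
        ("dept_physics", None),
        ("inst_optics", "dept_physics"),
        ("grp_lasers", "inst_optics"),
    ]
    example_members = {"grp_lasers": ["alice", "bob"]}
    for cmd in sacctmgr_setup_commands(example_accounts, example_members):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running
anything against the accounting database.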
If you are using fairshare, you then need to set the shares per entity
in the organigram. As all our users are created equal, this means
adding a user to a group, incrementing the shares of the group,
incrementing the shares of the institute, and incrementing the shares of
the department. When a user leaves the group, this obviously all has to
be done in reverse. Because this is a bit of a chore and quite
error-prone, we automate it with a wrapper around sacctmgr that is
integrated into our user-lifecycle-management mechanism.
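
The share bookkeeping described above can be sketched roughly as
follows. This is not the wrapper mentioned here, just an illustration
under stated assumptions: the hierarchy and share counts are kept in
memory (a real tool would read the current values from sacctmgr), every
user counts for exactly one share, and the class and account names are
hypothetical. The commands are only built, not run.

```python
class ShareTree:
    """In-memory model of an account hierarchy with one share per user."""

    def __init__(self, parents):
        # parents maps each account to its parent (None at the top).
        self.parents = dict(parents)
        # Assumption: every account starts with zero shares.
        self.shares = {name: 0 for name in self.parents}

    def _chain(self, group):
        """Return the group followed by all of its ancestors."""
        chain = []
        node = group
        while node is not None:
            chain.append(node)
            node = self.parents[node]
        return chain

    def _adjust(self, group, delta):
        """Change shares along the chain and build the matching commands."""
        chain = self._chain(group)
        if any(self.shares[account] + delta < 0 for account in chain):
            raise ValueError("shares would become negative")
        commands = []
        for account in chain:
            self.shares[account] += delta
            commands.append([
                "sacctmgr", "-i", "modify", "account", "where",
                f"name={account}", "set",
                f"fairshare={self.shares[account]}",
            ])
        return commands

    def add_user(self, user, group):
        """Add a user, then bump group, institute and department."""
        add = ["sacctmgr", "-i", "add", "user", f"name={user}",
               f"account={group}", "fairshare=1"]
        return [add] + self._adjust(group, +1)

    def remove_user(self, user, group):
        """Do the same in reverse when a user leaves the group."""
        updates = self._adjust(group, -1)
        delete = ["sacctmgr", "-i", "delete", "user", f"name={user}",
                  f"account={group}"]
        return [delete] + updates


if __name__ == "__main__":
    tree = ShareTree({"dept": None, "inst": "dept", "grp": "inst"})
    for cmd in tree.add_user("alice", "grp"):
        print(" ".join(cmd))
```

Keeping the increments and decrements in one place is what removes the
error-prone part: the group, institute and department counts cannot
drift apart.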
* The whole JobComp explanation in the documentation isn't clear to
me. What does accounting to slurmdbd /not/ provide that setting
JobComp to log elsewhere would? Why can't slurmdbd be used for
everything?
It can.
Here are some parts of the config; let me know if you want more:
# grep AccountingStorage /etc/slurm/slurm.conf
#AccountingStorageEnforce=0
AccountingStorageHost=slurm-p1n01.mgmt.kamiak.example.edu
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStoragePort=
AccountingStorageType=accounting_storage/slurmdbd
#AccountingStorageUser=
# grep JobCompType /etc/slurm/slurm.conf
#JobCompType=jobcomp/slurmdbd
If you are using
AccountingStorageType=accounting_storage/slurmdbd
my understanding is that you don't need to set JobComp, as JobComp
provides only a subset of the data you get from accounting storage.
HTH
Loris