On 01/22/14 13:40, William B Hurst wrote:
Greetings,
I could not find an answer to this question in any of the documentation
resources, nor from any that you have listed, so I just thought I would
ask.
We are running numerous large clusters, so this is an important
question. While each user belongs to a group, they may be submitting
jobs on multiple projects. We would like to be able to track resource
consumption by project name.
Is there a way for me to require a "project" parameter on job
submissions to Slurm?
Michael is correct,
This is exactly how accounting is set up in the DBD (slurmdbd). Your
environment sounds like a very common one.
In Slurm, projects == accounts.
Each user has a default account per cluster, and can pass the --account
option at submission time to charge a job to a different one.
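
As a rough illustration of per-project reporting once jobs carry an account, the sketch below sums CPU seconds per account. It assumes two-column, pipe-separated text of the kind that
sacct --allusers --allocations --noheader --parsable2 --format=Account,CPUTimeRAW
should produce. The function name and the sample account names are made up for the example; they are not from any real cluster.

```python
from collections import defaultdict


def cpu_seconds_by_account(sacct_output):
    """Sum CPU seconds per account from 'Account|CPUTimeRAW' lines."""
    totals = defaultdict(int)
    for line in sacct_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first '|' only: account name, then raw CPU seconds.
        account, _, cpu_raw = line.partition("|")
        cpu_raw = cpu_raw.strip()
        if not cpu_raw.isdigit():
            continue  # skip headers or malformed rows
        totals[account] += int(cpu_raw)
    return dict(totals)


# Hypothetical sample data standing in for real sacct output.
SAMPLE = """\
climate|7200
genomics|3600
climate|1800
"""

if __name__ == "__main__":
    for account, seconds in sorted(cpu_seconds_by_account(SAMPLE).items()):
        print(f"{account}: {seconds / 3600:.2f} CPU-hours")
```

On a real system the sample string would be replaced by the captured output of the sacct command, restricted to whatever time window is of interest.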
If that doesn't get you what you want, the --wckey option might. To
enforce it, you could check for it in a job_submit plugin and deny any
job that doesn't set it. The nice thing about wckey is that the recorded
value is marked with a * when the user didn't specify one. The job still
gets the default for the association, but anyone who cares that the user
didn't specify one can tell.
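
The check described above is small. A real job_submit plugin is written against Slurm's plugin interface, not in Python, so the sketch below only illustrates the decision logic. Both function names are hypothetical, and the exact position of the * in the recorded wckey is an assumption, so the helper accepts it at either end.

```python
def check_submission(wckey):
    """Return (accepted, message) for a job's requested wckey.

    Mirrors the rule 'deny a job if no wckey is set'. Illustration only;
    this is not Slurm's plugin API.
    """
    if wckey is None or not wckey.strip():
        return False, "job rejected: please submit with --wckey=<project>"
    return True, "ok"


def used_default_wckey(recorded_wckey):
    """True if an accounting record's wckey carries the '*' marker.

    Assumption: the marker sits at the start or the end of the value.
    """
    return recorded_wckey.startswith("*") or recorded_wckey.endswith("*")


if __name__ == "__main__":
    print(check_submission(None))
    print(check_submission("climate"))
    print(used_default_wckey("*climate"))
```

The first function corresponds to strict enforcement at submit time; the second corresponds to the softer approach of letting the default apply and auditing the marked records afterwards.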
A couple of SUG (Slurm User Group) meetings ago there was a presentation
on this:
http://slurm.schedmd.com/slurm_ug_2012/SUG_Oct2012_DBD.pdf
Danny
If so, could you describe how to implement it?
Best Regards,
Brad Hurst
ps: I have included one of the system configurations below
Configuration data as of 2014-01-22T14:41:34
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = associations,limits,qos
AccountingStorageHost = head.tusker.hcc.unl.edu
AccountingStorageLoc = N/A
AccountingStoragePort = 6819
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreJobComment = YES
AcctGatherEnergyType = acct_gather_energy/none
AcctGatherFilesystemType = acct_gather_filesystem/none
AcctGatherInfinibandType = acct_gather_infiniband/none
AcctGatherNodeFreq = 0 sec
AcctGatherProfileType = acct_gather_profile/none
AuthType = auth/munge
BackupAddr = (null)
BackupController = (null)
BatchStartTimeout = 10 sec
BOOT_TIME = 2014-01-14T11:02:03
CacheGroups = 0
CheckpointType = checkpoint/none
ClusterName = tusker
CompleteWait = 0 sec
ControlAddr = head
ControlMachine = head
CryptoType = crypto/munge
DebugFlags = (null)
DefMemPerCPU = 1024
DisableRootJobs = NO
DynAllocPort = 0
EnforcePartLimits = YES
Epilog = /bin/true
EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
ExtSensorsType = ext_sensors/none
ExtSensorsFreq = 0 sec
FastSchedule = 1
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = gpu
GroupUpdateForce = 0
GroupUpdateTime = 600 sec