Basically, you create an allocation account for each organization and
each project, with each project's account having as its parent the
allocation account of the organization it belongs to.  Then set the
appropriate limits on the allocation accounts, and create associations
between users and the allocation accounts for each project (not
organization) they belong to.

Typically, to limit CPU-hours one would limit GrpCPUMins (older Slurm
versions) or GrpTRESMins (newer versions; set the cpu TRES in this
parameter).  All of the above is set up with the sacctmgr command.
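As a rough sketch of what that looks like (all account and user names
here are made up for illustration; adjust limits, descriptions, and
cluster names to your site):

```shell
# Create the organization account, then a project account under it
sacctmgr -i add account orgA Description="Organization A"
sacctmgr -i add account projA parent=orgA Description="Project A"

# Limit the project to 1000 CPU-hours = 60000 CPU-minutes (GrpTRESMins
# on newer Slurm; use GrpCPUMins=60000 instead on older versions)
sacctmgr -i modify account projA set GrpTRESMins=cpu=60000

# Associate a user with the project's allocation account
sacctmgr -i add user alice account=projA
```

The -i flag skips the interactive confirmation prompts, which is
convenient when scripting.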

Slurm will then allow users to submit jobs charging against the
allocation accounts for projects they belong to (i.e. have an
association with in Slurm's DB), and Slurm will track usage and,
provided the rest of the Slurm configuration is set up properly,
prevent a job from starting if it would cause usage to exceed the
limit.  I have not tested this thoroughly, but I believe that if you
oversubscribe an organization's SUs (i.e. the sum of the SU limits on
the projects belonging to the organization is greater than the SU
limit on the organization itself), Slurm will ensure both limits are
enforced.  This could be confusing to users, as I believe most
utilities for examining allocation account "balances" do not handle
multiple layers (it is on the todo list for my utility, but not enough
manpower SUs:)
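To see what Slurm itself thinks about usage and limits at both layers,
you can query the hierarchy directly (again, orgA/projA are
hypothetical names from the sketch above):

```shell
# Long-format fair-share/usage report for the org and project accounts
sshare -l -A orgA,projA

# Show the associations, including the parent relationship and limits
sacctmgr show assoc account=orgA,projA \
    format=Account,ParentName,User,GrpTRESMins
```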

To my knowledge, Slurm makes no attempt to integrate with external
user databases of any kind, so if you want something a bit more
automated you would need to write your own cron job (or similar) to
check for users added to or removed from any of the projects you have
allocations for, and issue the sacctmgr commands to update the Slurm
DB.  The complexity depends on how complex your environment is (how
many Slurm partitions, etc.), but it should not be too bad.
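The core of such a sync job is just a set difference per project.  A
minimal sketch for one project (user and project names are invented; in
a real job "want" would come from your portal/LDAP query and "have"
from something like "sacctmgr -nP show assoc account=projA
format=User"; here the script only echoes the sacctmgr commands it
would run):

```shell
# Desired membership (from the portal) vs. current Slurm associations
tmpw=$(mktemp); tmph=$(mktemp)
printf '%s\n' alice bob carol | sort > "$tmpw"   # portal says these belong
printf '%s\n' bob dave        | sort > "$tmph"   # Slurm currently has these

# Users in the portal but not in Slurm -> need an association added
to_add=$(comm -23 "$tmpw" "$tmph")
# Users in Slurm but no longer in the portal -> association removed
to_rm=$(comm -13 "$tmpw" "$tmph")
rm -f "$tmpw" "$tmph"

for u in $to_add; do
  echo sacctmgr -i add user "$u" account=projA
done
for u in $to_rm; do
  echo sacctmgr -i remove user "$u" account=projA
done
```

Drop the "echo" once you trust the output, and loop over projects.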

I have Slurm-Sacctmgr and Slurm-Sshare Perl modules on CPAN which
provide Perl-friendly wrappers around the sacctmgr and sshare
commands; these might be helpful here.  (I also have similar wrappers
for a few other Slurm utilities which I have not had time to polish
enough for submission to CPAN.)



On Tue, 18 Jul 2017, Ilja Livenson wrote:
Hello,
newbie here.

We are working on integrating SLURM with our self-service portal. Our model is 
as follows:

- a user can belong to one or more projects.
- projects belong to a particular organization.
- limits on usage can be set on project and organization level (e.g. 1000 cpu/h 
per month in project A and 2000 cpu/h in project B).
- information about user membership and limits is available over several 
protocols, including REST/LDAP/FreeIPA.
- a user logs into a SLURM submission node and submits a job.

Now, we are not quite clear what is the correct approach for enforcing those 
limits in SLURM. Perhaps someone wiser could suggest an approach or point to a 
project where similar has been achieved? Googling didn't help much
with the latter, unfortunately.

thanks,
Ilja



Tom Payerle
DIT-ATI-Research Computing              [email protected]
4254 Stadium Dr                         (301) 405-6135
University of Maryland
College Park, MD 20742-4111
