Hi Danny,

Apologies for top posting on this message, but it might be easier to do so. 

We had a think about the previous responses to this thread and thought that 
we could cook up some scripts that wrap the existing functionality into a 
simple banking system: just enough to give us a credit/debit system for 
groups/projects that relies on slurm and nothing else.

Please take a look at 

    https://github.com/jcftang/slurm-bank

SLURM Bank is a collection of wrapper scripts that give slurm GOLD-like 
capabilities for managing resources.

With the scripts we are able to provide a simple banking system where we can 
deposit hours into an account. Users are associated with these accounts and 
draw on them to run jobs. If users do not have an account, or if their account 
has no hours left, they cannot run jobs.
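
As a rough illustration of that credit/debit model, here is a toy Python 
sketch. It is not the slurm-bank scripts themselves (those are bash/perl 
wrappers around slurm); the Account class and the deposit, can_run and charge 
helpers are hypothetical names made up for this example. It assumes a deposit 
is turned into a new absolute limit (old limit plus the deposited hours), 
since sacctmgr currently only accepts absolute GrpCPUMins values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Account:
    """Hypothetical in-memory stand-in for a slurm account/association."""
    name: str
    limit_minutes: int = 0   # plays the role of GrpCPUMins (an absolute cap)
    used_minutes: int = 0    # plays the role of the accumulated usage
    users: set = field(default_factory=set)

    def balance_minutes(self) -> int:
        """CPU minutes still available to the account."""
        return self.limit_minutes - self.used_minutes


def deposit(account: Account, hours: int) -> int:
    """Credit hours by computing a new absolute limit (old limit + deposit)."""
    account.limit_minutes += hours * 60
    return account.limit_minutes


def can_run(account: Optional[Account], user: str,
            cpus: int, walltime_minutes: int) -> bool:
    """A job may run only if the user belongs to an account with enough time."""
    if account is None or user not in account.users:
        return False
    return cpus * walltime_minutes <= account.balance_minutes()


def charge(account: Account, cpus: int, elapsed_minutes: int) -> None:
    """Debit the CPU minutes a finished job actually consumed."""
    account.used_minutes += cpus * elapsed_minutes
```

For example, depositing 10 hours gives a limit of 600 CPU minutes; a 4-CPU, 
60-minute job then costs 240 minutes and leaves a balance of 360, so a 
following 8-CPU, 60-minute job would be refused. How slurm itself enforces 
the limit is more involved than this simplified check.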

Requirements (tested with)

    • SLURM 2.2.0
    • Scientific Linux 5.4 (bash, rsync, perl)

Here are some notes from when we were cooking up ideas:

    
    http://thammuz.tchpc.tcd.ie/mirrors/slurm-bank/slurm-bank-1.1.1.2/html/design.html
    http://thammuz.tchpc.tcd.ie/mirrors/slurm-bank/slurm-bank-1.1.1.2/html/walkthrough.html
    (this was just me thinking things through)

The system is pretty simple and dumb, but it does appear to work (at least at 
our site). Many of the complicated problems, such as refunding hours for 
failed jobs or system downtime, are left out; we think these should probably 
be a user/people issue rather than a technical one. We're planning on rolling 
these scripts out into full production use in a few weeks' time, and we hope 
they will be of use to others.

The full documentation for the scripts is at

    http://thammuz.tchpc.tcd.ie/mirrors/slurm-bank/slurm-bank-1.1.1.2/html/index.html

and a tarball for *usage* can be downloaded from

    http://thammuz.tchpc.tcd.ie/mirrors/slurm-bank/slurm-bank-1.1.1.2/

I think slurm, together with slurm-bank, provides enough reporting and 
transaction logging functionality for us to migrate completely away from a 
setup which requires GOLD and maui.

Regards,
Jimmy Tang

--
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
http://www.tchpc.tcd.ie/



On 11 May 2011, at 15:34, Danny Auble wrote:

> Hey Paddy,
> 
>> Hi Danny, Mark,
>> 
>> Just to follow up on this question a little..
>> 
>>>> At the moment we are using SLURM with Moab (for scheduling) and Gold
>>>> (for accounting), but I'm having a look at whether we can move to an
>>>> all-SLURM setup to do the same thing.
>> 
>> We have the same setup (but with maui instead of moab), and would also like
>> to move to an all-SLURM setup if possible.
>> 
>> Banking/reporting is a requirement for our setup, due to the nature of our
>> funding.
> 
> That makes sense.
> 
>> 
>>>> Using this fairly simple setup I can enforce users' jobs to be run
>>>> against only accounts that they're associated with. The next thing I
>>>> need to do (that I'm currently stuck on) is work out how to assign
>>>> quotas to these accounts (a number of Core-Hours if you like) that
>>>> decrease by an appropriate amount every time a job is run. The
>>>> documentation often refers to accounts as "bank accounts" which makes
>>>> me think that this can be done and I just haven't worked out how to yet.
>>> 
>>> What documentation are you looking at is one question.  You can  
>>> probably look at  
>>> https://computing.llnl.gov/linux/slurm/accounting.html to get a good  
>>> idea how to do what you want.  Each association and QOS has a
>>> GrpCPUMins limit.  If you decide you don't want to do fairshare (the
>>> preferred way of doing things) you can do the hard limit stuff using
>>> this limit along with the priority/multifactor plugin explained
>>> here https://computing.llnl.gov/linux/slurm/priority_multifactor.html.
>>> 
>>> Look primarily at the PriorityDecayHalfLife and  
>>> PriorityUsageResetPeriod options for the slurm.conf.
>> 
>> The functionality that I think SLURM will do easily is this:
>> 
>> * have individual associations/accounts, with one or more users in each
>> * potentially hierarchical accounts (not a biggie at present)
> 
> Yes, SLURM does both of these really well.
> 
>> 
>> 
>> The (Gold) functionality that I'm not sure about:
>> 
>> * being able to easily `deposit' resources (e.g. CPU hours) into accounts --
>>   although maybe 'sacctmgr modify ... set GrpCPUMins=XXX' will do that (it
>>   would be nice if the value can be additive/subtractive, rather than
>>   absolute)
> 
> Yes, currently there is only the absolute, but adding or subtracting could be 
> added.  Right now you can alter the amount of accumulated time for a 
> particular association tree starting at either a user or account using 
> sacctmgr (see sacctmgr modify user/account set RawUsage=).  Currently this 
> will only set the usage to zero, but it could also be altered to support 
> various alterations.
> 
>> 
>> * being able to see the current `balance' for a given account, for both the
>>   user and the site admin
> 
> sshare will currently give you usage stats with respect to fairshare usage.  
> This could also be altered if the cluster is set up to do this kind of 
> accounting to print out the limit values and such instead of the fairshare 
> information.
> 
>> 
>> * lifetimes/deadlines for deposited CPU hours (e.g. a project might only
>>   have a lifetime of 6 months)
> 
> sacctmgr will display these by listing the associations.  But this could also 
> be part of the sshare changes since it would be nice if the user only had to 
> learn one tool.
> 
>> 
>> * (optional) different charge rates based on node features (e.g. if most
>>   nodes are uniform, but you have a subset of large memory or more cores; or
>>   different rates per partitions)
> 
> You can do something like this with QOS using the UsageFactor option.  This 
> idea currently doesn't exist at the node level, but could probably be added 
> as well.
> 
>> 
>> * (optional) having an extra charge for reservations, and/or a charge for
>>   un-used hours in a reservation
> 
> Same idea here.  No usage factor but it could probably be added.
> 
>> 
>> * (optional) the equivalent of Gold's audit trail, of being able to see the
>>   history of when the account/association was updated
> 
> sacctmgr list transactions
> 
>> 
>> 
>> I'm thinking that setting PriorityDecayHalfLife=0 so that the resources (e.g.
>> GrpCPUMins) are absolute will get us most of the way there, but I'm not sure
>> about these other requirements.
>> 
>> Any ideas about these features? Are they already there in slurm?
> 
> I think you are close to what you need, there are a few things missing, but 
> you may be able to get around them by just using something different than you 
> have in the past.
> 
> Let me know if you have any other questions,
> Danny
> 
>> 
>> Thanks,
>> Paddy
>> 
>> -- 
>> Paddy Doyle
>> Trinity Centre for High Performance Computing,
>> Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
>> http://www.tchpc.tcd.ie/
>> 
>> 
>> ----- End forwarded message -----
>> 
>> 
>> 
> 


