Re: [slurm-users] Accounting - running with 'wrong' account on cluster

2018-11-06 Thread Brian Andrus
Hmm. OK, so using an unmatched account fails:
(on cluster1)
$ srun -n16 -A Prod --pty bash
*srun: error: Unable to allocate resources: Invalid account or
account/partition combination specified*

But using a valid account also fails:
$ srun -n16 -A projectA --pty bash
*srun: error: Unable to allocate resources: Invalid account or
account/partition combination specified*

So now I don't seem to be able to run anything...

On Tue, Nov 6, 2018 at 7:53 PM Christopher Samuel wrote:

> On 7/11/18 2:44 pm, Brian Andrus wrote:
>
> > Ah just scontrol reconfigure doesn't actually make it take effect.
> > Restarting slurmctld did it.
>
> Phew!  Glad to hear that's sorted out.. :-)
>
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>


Re: [slurm-users] Accounting - running with 'wrong' account on cluster

2018-11-06 Thread Christopher Samuel

On 7/11/18 2:44 pm, Brian Andrus wrote:

> Ah just scontrol reconfigure doesn't actually make it take effect.
> Restarting slurmctld did it.


Phew!  Glad to hear that's sorted out.. :-)

--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



Re: [slurm-users] Accounting - running with 'wrong' account on cluster

2018-11-06 Thread Brian Andrus
Ah just scontrol reconfigure doesn't actually make it take effect.
Restarting slurmctld did it.
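
For anyone following along, here is a minimal sketch of the check described above. It assumes slurmctld runs as a systemd unit named `slurmctld`; the helper name `enforces_associations` is hypothetical and not part of Slurm. It only inspects the text that `scontrol show config` prints.

```shell
#!/bin/sh
# Hypothetical helper: reads `scontrol show config` output on stdin and
# succeeds only if AccountingStorageEnforce includes "associations".
enforces_associations() {
    grep -E '^AccountingStorageEnforce[[:space:]]*=' |
        grep -Eq '(=|,)[[:space:]]*associations(,|[[:space:]]*$)'
}

# On a real cluster (assumption: systemd unit is called slurmctld):
#   sudo systemctl restart slurmctld
#   scontrol show config | enforces_associations && echo "associations enforced"
```

If the check still fails after the restart, the running controller has not picked up the slurm.conf change.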

On Tue, Nov 6, 2018 at 7:07 PM Christopher Samuel wrote:

> On 7/11/18 1:57 pm, Brian Andrus wrote:
>
> > Ah. I thought I had set that.
> > So I did and now it is:
> > AccountingStorageEnforce = associations,limits
> >
> > But I am still able to request and get resources on cluster3 using
> > projectA as my account..
> > Heck, I just tried using a fake account (account=asdas) and it worked...
> >
> > "That ain't right..." - Guy Fleegman (GalaxyQuest)
>
> That's very odd, we have:
>
> AccountingStorageEnforce = associations,limits,qos,safe
>
> and for an account I'm not part of I get:
>
> [csamuel@farnarkle1 ~]$ salloc -A oz015
> salloc: error: Job submit/allocate failed: Invalid account or
> account/partition combination specified
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>


Re: [slurm-users] Accounting - running with 'wrong' account on cluster

2018-11-06 Thread Brian Andrus
Ah. I thought I had set that.
So I did and now it is:
AccountingStorageEnforce = associations,limits

But I am still able to request and get resources on cluster3 using projectA
as my account..
Heck, I just tried using a fake account (account=asdas) and it worked...

"That ain't right..." - Guy Fleegman (GalaxyQuest)

Brian Andrus

On Tue, Nov 6, 2018 at 4:39 PM Christopher Samuel wrote:

> On 7/11/18 7:35 am, Brian Andrus wrote:
>
> > I am able to submit using account=projectB on cluster3. ???
> > Since 'projectB' is a child of account 'DevOps', which is only
> > associated with cluster1 and cluster2, shouldn't I be denied the ability
> > to run using that account on cluster3?
>
> What does this say for you?
>
> scontrol show config | fgrep AccountingStorageEnforce
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
>
>


Re: [slurm-users] Accounting: set default account with no access

2018-11-06 Thread Sam Hawarden
Hi Yair,


You can set maxsubmitjob=0 on an account.


The error message isn't helpful beyond the obvious though:


] salloc
salloc: error: AssocMaxSubmitJobLimit
salloc: error: Job submit/allocate failed: Job violates accounting/QOS policy 
(job submit limit, user's size and/or time limits)

So the lua script is preferable.


Kind regards,

  Sam


From: slurm-users on behalf of Yair Yarom
Sent: Wednesday, 7 November 2018 00:58
To: Slurm User Community List
Subject: Re: [slurm-users] Accounting: set default account with no access

Hi,

You can set the maxsubmitjob=0 on that default account. That should prevent 
anyone from using it, but it won't have a specific message like with the lua 
plugin. E.g.
sacctmgr update account default set maxsubmitjob=0

Regards,
Yair.
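
To confirm the limit took effect, a minimal sketch along these lines may help. It assumes `sacctmgr -nP show assoc format=Account,User,MaxSubmitJobs` emits pipe-delimited `Account|User|MaxSubmitJobs` rows on your Slurm version (worth verifying locally); the helper name `blocked_accounts` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "Account|User|MaxSubmitJobs" rows on stdin and
# prints each account whose account-level association (empty User field)
# has MaxSubmitJobs set to 0, i.e. no job can be submitted under it.
blocked_accounts() {
    awk -F'|' '$2 == "" && $3 == "0" { print $1 }'
}

# On a real cluster (assumed invocation, check against your sacctmgr):
#   sacctmgr -nP show assoc format=Account,User,MaxSubmitJobs | blocked_accounts
```

Rows with a user name are per-user associations and are skipped here, so only limits set on the account itself are reported.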


On Tue, Nov 6, 2018 at 12:58 AM Renfro, Michael wrote:
From https://stackoverflow.com/a/46176694:

>> I had the same requirement to force users to specify accounts and, after 
>> finding several ways to fulfill it with slurm, I decided to revive this post 
>> with the shortest/easiest solution.
>>
>> The slurm lua submit plugin sees the job description before the default 
>> account is applied. Hence, you can install the slurm-lua package, add 
>> "JobSubmitPlugins=lua" to the slurm.conf, restart the slurmctld, and 
>> directly test against whether the account was defined via the job_submit.lua 
>> script (create the script wherever you keep your slurm.conf; typically in 
>> /etc/slurm/):
>>
>> -- /etc/slurm/job_submit.lua to reject jobs with no account specified
>>
>> function slurm_job_submit(job_desc, part_list, submit_uid)
>>     if job_desc.account == nil then
>>         slurm.log_error("User %s did not specify an account.", job_desc.user_id)
>>         slurm.log_user("You must specify an account!")
>>         return slurm.ERROR
>>     end
>>     return slurm.SUCCESS
>> end
>>
>> function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
>>     return slurm.SUCCESS
>> end
>>
>> return slurm.SUCCESS

> On Nov 5, 2018, at 4:09 PM, Brian Andrus wrote:
>
> All,
>
> I am trying to figure the best way to require users to explicitly specify an 
> account when submitting jobs (--account= )
>
> What I was thinking was to create a default account for the users that has no 
> ability to submit any jobs, so if they don't specify, any submission would 
> fail.
>
> What I'm not seeing is how to set such an option on an account. I was hoping 
> to do something like cluster=none for its access, but that is not allowed.
>
>
> Is there a way to set an account to not have access to submit jobs?
> Alternatively is there an easier way to require the --account= option for 
> jobs?
>
>
> Brian Andrus
>
>




Re: [slurm-users] Seff error with Slurm-18.08.1

2018-11-06 Thread Chris Samuel

On 6/11/18 7:49 pm, Baker D.J. wrote:

> The good news is that I am assured by SchedMD that the bug has been fixed
> in v18.08.3.


Looks like it's fixed in this commit.

commit 3d85c8f9240542d9e6dfb727244e75e449430aac
Author: Danny Auble 
Date:   Wed Oct 24 14:10:12 2018 -0600

Handle symbol resolution errors in the 18.08 slurmdbd.

Caused by b1ff43429f6426c when moving the slurmdbd agent internals.

Bug 5882.


> Having said that we will probably live with this issue
> rather than disrupt users with another upgrade so soon.


An upgrade to 18.08.3 from 18.08.1 shouldn't be disruptive though, 
should it?  We just flip a symlink and the users see the new binaries, 
libraries, etc immediately, we can then restart daemons as and when we 
need to (in the right order of course, slurmdbd, slurmctld and then 
slurmd's).
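
As a rough sketch of that restart order, assuming the daemons run as systemd units named `slurmdbd`, `slurmctld` and `slurmd` (the unit names and the helper `restart_slurm_daemons` are assumptions, not from this thread). It is a dry run by default and only prints the commands; in practice each daemon usually lives on a different host, so the commands would be run where each one is installed.

```shell
#!/bin/sh
# Hypothetical helper: walks the daemons in the order given above
# (slurmdbd, then slurmctld, then slurmd). With no argument it only
# echoes the commands; pass a runner such as "sudo" to execute them.
restart_slurm_daemons() {
    runner=${1:-echo}
    for unit in slurmdbd slurmctld slurmd; do
        "$runner" systemctl restart "$unit"
    done
}

# Dry run: prints the three commands in order.
restart_slurm_daemons
```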


All the best,
Chris
--
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC