Jagga,

It's possible you need the following in your Slurm configuration. To be
honest, I am guessing, but it is the only thing in my Slurm conf about
GRES. Also note I am on 16.05.6.

AccountingStorageTRES=gres/gpu
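
For context, here is roughly how that fits into slurm.conf on my side (a
sketch of my own setup, not yours; GresTypes and a matching gres.conf entry
for the GPU nodes are assumed to already be in place):

```
# slurm.conf (fragment)
GresTypes=gpu                      # GPUs defined as a generic resource
AccountingStorageTRES=gres/gpu     # track gres/gpu as a TRES in slurmdbd
```

After changing AccountingStorageTRES you need to restart slurmctld so the
new TRES gets registered in the database; until then, GrpTRES=gres/gpu
limits are not recognized.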

- Barry
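
P.S. If it helps, a quick way to check whether gres/gpu is currently being
tracked at all (commands I believe exist in both 16.05 and 17.02, but I have
not tested them against your version):

```
# List the TRES the accounting database currently knows about;
# gres/gpu must appear here for GrpTRES=gres/gpu limits to apply.
sacctmgr show tres

# Confirm a QOS limit was actually stored:
sacctmgr show qos gpu format=name,grptres
```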

On Sun, Jun 18, 2017 at 9:13 PM, Jagga Soorma <[email protected]> wrote:

>
> Hi Guys,
>
> I can't get this to work:
>
> # sacctmgr modify account account=cig set GrpTRES=gres/gpu=8
>  Unknown option: GrpTRES=gres/gpu=8
>  Use keyword 'where' to modify condition
>
> The GrpTRES=cpu=x option works but for some reason GrpTRES=gres/gpu=x
> does not.  Is there some config option I am missing to enable this
> capability?
>
> Thanks.
>
> On Sun, Jun 18, 2017 at 5:57 PM, Jagga Soorma <[email protected]> wrote:
> >
> > Hey Barry,
> >
> > The problem is that the following command does not work for me:
> >
> > --
> > # sacctmgr modify account where Partition=gred set grptres=gres/gpu=8
> >  Unknown option: grptres=gres/gpu=8
> >  Use keyword 'where' to modify condition
> > --
> >
> > It keeps saying that grptres=gres/gpu=8 is an unknown option.  Am I
> > not doing this the right way?
> >
> > Thanks.
> >
> > On Fri, Jun 16, 2017 at 11:54 AM, Barry Moore <[email protected]> wrote:
> >> You don't appear to have any associations to the partitions. Seems like
> >> your associations aren't set up correctly.
> >>
> >> On Fri, Jun 16, 2017 at 1:24 PM, Jagga Soorma <[email protected]> wrote:
> >>>
> >>>
> >>> I don't see anything here:
> >>>
> >>> # sacctmgr show assoc partition=gred_sa
> >>>    Cluster    Account       User  Partition     Share GrpJobs
> >>> GrpTRES GrpSubmit     GrpWall   GrpTRESMins MaxJobs       MaxTRES
> >>> MaxTRESPerNode MaxSubmit     MaxWall   MaxTRESMins
> >>> QOS   Def QOS GrpTRESRunMin
> >>> ---------- ---------- ---------- ---------- --------- -------
> >>> ------------- --------- ----------- ------------- -------
> >>> ------------- -------------- --------- ----------- -------------
> >>> -------------------- --------- -------------
> >>>
> >>> # sacctmgr list user | grep -i restst
> >>>    user1    SA      None
> >>>    user2    SA      None
> >>>    user3   CIG      None
> >>>    user4   CIG      None
> >>>
> >>> Here is what I have also tried without success:
> >>>
> >>> sacctmgr add qos gpu GrpTRES=GRES/gpu=8 Flags=OverPartQos
> >>>
> >>> PartitionName=gred Nodes=node[313-314] Default=NO MaxTime=INFINITE
> >>> State=UP DefMemPerCPU=2048 QOS=gpu
> >>>
> >>> sacctmgr modify account SA set qos=gpu
> >>> sacctmgr modify account CIG set qos=gpu
> >>>
> >>> Now when I submit a second job by user1 asking for 8 GPUs, I was
> >>> hoping it would go into the queue, but it still runs:
> >>>
> >>> srun --gres=gpu:8 -p gred --pty bash
> >>>
> >>> Thanks again for your help.
> >>>
> >>> On Fri, Jun 16, 2017 at 10:08 AM, Barry Moore <[email protected]> wrote:
> >>> > Are there accounts with associations to that partition?
> >>> >
> >>> > sacctmgr show assoc partition=SA
> >>> >
> >>> > Sorry, I am not doing partition-level management like this. I
> >>> > presumed it would be similar to how I set up association limits per
> >>> > account per cluster.
> >>> >
> >>> > On Fri, Jun 16, 2017 at 12:57 PM, Jagga Soorma <[email protected]>
> >>> > wrote:
> >>> >>
> >>> >>
> >>> >> So, it looks like I am unable to set grptres on the account.
> >>> >> Shouldn't this be set with a QOS instead?
> >>> >>
> >>> >> # sacctmgr modify account where partition=SA set GrpTRES=GRES/gpu=8
> >>> >>  Unknown option: GrpTRES=GRES/gpu=8
> >>> >>  Use keyword 'where' to modify condition
> >>> >>
> >>> >> Thanks.
> >>> >>
> >>> >> On Fri, Jun 16, 2017 at 9:52 AM, Jagga Soorma <[email protected]>
> >>> >> wrote:
> >>> >> > Ahh, my apologies Barry.  It looks like I accidentally did not choose
> >>> >> > the correct partition, which was causing this error.  It seems to do
> >>> >> > the right thing when I pick the correct partition during sacctmgr
> >>> >> > add user.  Will try the rest.
> >>> >> >
> >>> >> > Thanks again for your help with this!  Much appreciated!
> >>> >> >
> >>> >> > On Fri, Jun 16, 2017 at 9:49 AM, Jagga Soorma <[email protected]>
> >>> >> > wrote:
> >>> >> >> Thanks for your response Barry.  But as soon as I add this user
> >>> >> >> using:
> >>> >> >>
> >>> >> >> sacctmgr add user user1 account=SA partition=partition2
> >>> >> >>
> >>> >> >> They are also able to submit jobs to partition1, which we don't
> >>> >> >> want.  Does that maybe have something to do with QOS?  I want these
> >>> >> >> users to only be able to submit jobs to partition2 and not
> >>> >> >> partition1.
> >>> >> >>
> >>> >> >> # sacctmgr list qos
> >>> >> >>       Name   Priority  GraceTime    Preempt PreemptMode
> >>> >> >>                     Flags UsageThres UsageFactor       GrpTRES
> >>> >> >> GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall
> >>> >> >> MaxTRES
> >>> >> >> MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU MaxJobsPU
> >>> >> >> MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA       MinTRES
> >>> >> >> ---------- ---------- ---------- ---------- -----------
> >>> >> >> ---------------------------------------- ---------- -----------
> >>> >> >> ------------- ------------- ------------- ------- ---------
> >>> >> >> ----------- ------------- -------------- -------------
> -----------
> >>> >> >> ------------- --------- ----------- ------------- ---------
> >>> >> >> ----------- -------------
> >>> >> >>     normal          0   00:00:00                cluster
> >>> >> >>                                         1.000000
> >>> >> >>
> >>> >> >> # sacctmgr list user  | grep -i user1
> >>> >> >>    user1   research      None
> >>> >> >>
> >>> >> >> Thanks.
> >>> >> >>
> >>> >> >> On Fri, Jun 16, 2017 at 9:33 AM, Barry Moore <[email protected]> wrote:
> >>> >> >>> Jagga,
> >>> >> >>>
> >>> >> >>> You got it. Something along the lines of:
> >>> >> >>>
> >>> >> >>> # Add users to accounts, partition
> >>> >> >>> sacctmgr add user <users> account=<account> partition=<whatever>
> >>> >> >>>
> >>> >> >>> # Set the association limits; I think you can do something similar
> >>> >> >>> to gres/gpu=-1 for the super users (assuming they are in the same
> >>> >> >>> account?).
> >>> >> >>> sacctmgr modify account where partition=<partition> set grptres=gres/gpu=8
> >>> >> >>> sacctmgr modify account where partition=<partition> account=root set grptres=gres/gpu=-1
> >>> >> >>>
> >>> >> >>> This will work as long as your super users aren't under an account
> >>> >> >>> you want to enforce. Maybe someone has a nicer solution for that.
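
A fuller end-to-end sketch of the setup Jagga describes further down (my
guess at one way to do it; the AllowAccounts approach, the <gpu-nodes>
placeholder, and the exact account spellings are assumptions, untested
here):

```
# slurm.conf: only the SA and CIG accounts may submit to partition2;
# partition1 is left unchanged, so it stays open as before.
PartitionName=partition2 Nodes=<gpu-nodes> AllowAccounts=sa,cig State=UP
# Track GPUs as a TRES, otherwise GrpTRES=gres/gpu limits are ignored:
AccountingStorageTRES=gres/gpu

# Then, with slurmctld restarted, the accounts and users:
sacctmgr add account SA
sacctmgr add account CIG
sacctmgr add user user1,user2 account=SA partition=partition2
sacctmgr add user user3,user4,user5 account=CIG partition=partition2

# 8-GPU limit per account (group level, not per user):
sacctmgr modify account where account=SA set GrpTRES=gres/gpu=8
sacctmgr modify account where account=CIG set GrpTRES=gres/gpu=8
```

The super users could then go under a separate account that is also listed
in AllowAccounts but has no GrpTRES limit set.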
> >>> >> >>>
> >>> >> >>> Hope that helps,
> >>> >> >>>
> >>> >> >>> Barry
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On Fri, Jun 16, 2017 at 12:18 PM, Jagga Soorma <[email protected]> wrote:
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>> Hello,
> >>> >> >>>>
> >>> >> >>>> I have a new 17.02.2 Slurm environment with only one partition and
> >>> >> >>>> now have the following requirement that I need to implement:
> >>> >> >>>>
> >>> >> >>>> --
> >>> >> >>>> Create a new partition (partition2) with 2 nodes.  This new
> >>> >> >>>> partition should be restricted, i.e. users cannot submit to it
> >>> >> >>>> unless they are authorized.
> >>> >> >>>>
> >>> >> >>>> The two groups of users, each of whom should be authorized, are:
> >>> >> >>>>
> >>> >> >>>> group1: SA
> >>> >> >>>> users: user1, user2
> >>> >> >>>>
> >>> >> >>>> group2: CIG
> >>> >> >>>> users: user3, user4, user5
> >>> >> >>>>
> >>> >> >>>> Each group should be allowed no more than 8 gres/gpus at a time.
> >>> >> >>>> The limit should be at the group level, not the individual user.
> >>> >> >>>>
> >>> >> >>>> These users should not be able to submit jobs to the first
> >>> >> >>>> partition (partition1), our original partition on this cluster.
> >>> >> >>>>
> >>> >> >>>> Also, please allow super users (userX, userY, userZ) to submit
> >>> >> >>>> jobs to this partition without limits, for
> >>> >> >>>> engineering/troubleshooting purposes.
> >>> >> >>>> --
> >>> >> >>>>
> >>> >> >>>> I am sure this is doable via associations in slurmdbd but was
> >>> >> >>>> wondering if that is the correct place to start.  We have only
> >>> >> >>>> implemented simple accounts with slurmdbd, so I wanted to make
> >>> >> >>>> sure I do this the right way and would appreciate any help with this.
> >>> >> >>>>
> >>> >> >>>> Thanks!
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> --
> >>> >> >>> Barry E Moore II, PhD
> >>> >> >>> E-mail: [email protected]
> >>> >> >>>
> >>> >> >>> Assistant Research Professor
> >>> >> >>> Center for Simulation and Modeling
> >>> >> >>> University of Pittsburgh
> >>> >> >>> Pittsburgh, PA 15260
> >>> >
> >>> >
> >>> >
> >>> >
> >>
> >>
> >>
> >>
>



