Jagga,

Don't forget to restart the controller.
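A minimal restart sketch for the config change quoted below, assuming systemd-managed daemons (the unit names are an assumption; adjust to your install):

```shell
# Sketch only: an AccountingStorageTRES change needs both the accounting
# database daemon and the controller restarted to take effect.
systemctl restart slurmdbd    # accounting database daemon first
systemctl restart slurmctld   # then the controller
```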

- Barry

On Mon, Jun 19, 2017 at 9:47 AM, Barry Moore <[email protected]> wrote:

> Jagga,
>
> It's possible you need the following in your Slurm configuration. To be
> honest, I am guessing, but it is the only thing in my slurm.conf about GRES.
> Also note I am on 16.05.6.
>
> AccountingStorageTRES=gres/gpu
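For context, the related slurm.conf pieces might look like the fragment below. This is illustrative only: the node names are taken from later in the thread, and the GPU count per node is an assumption.

```shell
# Illustrative slurm.conf fragment, not a verified config:
GresTypes=gpu                       # enable the gpu GRES type
AccountingStorageTRES=gres/gpu      # track gres/gpu as an accounted TRES
NodeName=node[313-314] Gres=gpu:4 State=UNKNOWN   # gpu:4 is a guess
```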
>
> - Barry
>
> On Sun, Jun 18, 2017 at 9:13 PM, Jagga Soorma <[email protected]> wrote:
>
>>
>> Hi Guys,
>>
>> I can't get this to work:
>>
>> # sacctmgr modify account account=cig set GrpTRES=gres/gpu=8
>>  Unknown option: GrpTRES=gres/gpu=8
>>  Use keyword 'where' to modify condition
>>
>> The GrpTRES=cpu=x option works but for some reason GrpTRES=gres/gpu=x
>> does not.  Is there some config option I am missing to enable this
>> capability?
>>
>> Thanks.
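If the missing AccountingStorageTRES=gres/gpu line turns out to be the cause, a hedged sketch of the expected working form once the daemons are restarted (note the "where" keyword is how sacctmgr takes the condition):

```shell
# Sketch: set and then verify a per-account GPU cap, assuming gres/gpu
# is now an accounted TRES.
sacctmgr modify account where account=cig set GrpTRES=gres/gpu=8
sacctmgr show assoc account=cig format=Account,Partition,GrpTRES
```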
>>
>> On Sun, Jun 18, 2017 at 5:57 PM, Jagga Soorma <[email protected]> wrote:
>> >
>> > Hey Barry,
>> >
>> > The problem is that the following command does not work for me:
>> >
>> > --
>> > # sacctmgr modify account where Partition=gred set grptres=gres/gpu=8
>> >  Unknown option: grptres=gres/gpu=8
>> >  Use keyword 'where' to modify condition
>> > --
>> >
>> > It keeps saying that grptres=gres/gpu=8 is an unknown option.  Am I
>> > not doing this the right way?
>> >
>> > Thanks.
>> >
>> > On Fri, Jun 16, 2017 at 11:54 AM, Barry Moore <[email protected]> wrote:
>> >> You don't appear to have any associations to the partitions. It seems
>> >> your associations aren't set up correctly.
>> >>
>> >> On Fri, Jun 16, 2017 at 1:24 PM, Jagga Soorma <[email protected]> wrote:
>> >>>
>> >>>
>> >>> I don't see anything here:
>> >>>
>> >>> # sacctmgr show assoc partition=gred_sa
>> >>>    Cluster    Account       User  Partition     Share GrpJobs
>> >>> GrpTRES GrpSubmit     GrpWall   GrpTRESMins MaxJobs       MaxTRES
>> >>> MaxTRESPerNode MaxSubmit     MaxWall   MaxTRESMins
>> >>> QOS   Def QOS GrpTRESRunMin
>> >>> ---------- ---------- ---------- ---------- --------- -------
>> >>> ------------- --------- ----------- ------------- -------
>> >>> ------------- -------------- --------- ----------- -------------
>> >>> -------------------- --------- -------------
>> >>>
>> >>> # sacctmgr list user | grep -i restst
>> >>>    user1    SA      None
>> >>>    user2    SA      None
>> >>>    user3   CIG      None
>> >>>    user4   CIG      None
>> >>>
>> >>> Here is what I have also tried without success:
>> >>>
>> >>> sacctmgr add qos gpu GrpTRES=GRES/gpu=8 Flags=OverPartQos
>> >>>
>> >>> PartitionName=gred Nodes=node[313-314] Default=NO MaxTime=INFINITE
>> >>> State=UP DefMemPerCPU=2048 QOS=gpu
>> >>>
>> >>> sacctmgr modify account SA set qos=gpu
>> >>> sacctmgr modify account CIG set qos=gpu
>> >>>
>> >>> Now when user1 submits a second job asking for 8 GPUs, I was hoping
>> >>> it would be held in the queue, but it still runs:
>> >>>
>> >>> srun --gres=gpu:8 -p gred --pty bash
>> >>>
>> >>> Thanks again for your help.
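One hedged way to check whether the QOS limit is actually attached, and why the job might still run anyway:

```shell
# Sketch: confirm the cap is visible to the scheduler.  If gres/gpu is
# not listed in AccountingStorageTRES, the GrpTRES column here will be
# empty and the cap is silently ignored (an inference from this thread).
sacctmgr show qos gpu format=Name,Flags,GrpTRES
# For jobs that do get held, the reason column shows why they pend:
squeue -o "%.10i %.9P %.8u %.20R"
```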
>> >>>
>> >>> On Fri, Jun 16, 2017 at 10:08 AM, Barry Moore <[email protected]> wrote:
>> >>> > Are there accounts with associations to that partition?
>> >>> >
>> >>> > sacctmgr show assoc partition=SA
>> >>> >
>> >>> > Sorry, I am not doing partition-level management like this. I presumed
>> >>> > it would be similar to how I set up association limits per account per
>> >>> > cluster.
>> >>> >
>> >>> > On Fri, Jun 16, 2017 at 12:57 PM, Jagga Soorma <[email protected]> wrote:
>> >>> >>
>> >>> >>
>> >>> >> So it looks like I am unable to set GrpTRES on the account.
>> >>> >> Shouldn't this be set with a QOS instead?
>> >>> >>
>> >>> >> # sacctmgr modify account where partition=SA set GrpTRES=GRES/gpu=8
>> >>> >>  Unknown option: GrpTRES=GRES/gpu=8
>> >>> >>  Use keyword 'where' to modify condition
>> >>> >>
>> >>> >> Thanks.
>> >>> >>
>> >>> >> On Fri, Jun 16, 2017 at 9:52 AM, Jagga Soorma <[email protected]> wrote:
>> >>> >> > Ah, my apologies Barry.  It looks like I accidentally did not choose
>> >>> >> > the correct partition, which was causing this error.  It does the
>> >>> >> > right thing once I have picked the correct partition during sacctmgr
>> >>> >> > add user.  Will try the rest.
>> >>> >> >
>> >>> >> > Thanks again for your help with this!  Much appreciated!
>> >>> >> >
>> >>> >> > On Fri, Jun 16, 2017 at 9:49 AM, Jagga Soorma <[email protected]> wrote:
>> >>> >> >> Thanks for your response Barry.  But as soon as I add this user
>> >>> >> >> using:
>> >>> >> >>
>> >>> >> >> sacctmgr add user user1 account=SA partition=partition2
>> >>> >> >>
>> >>> >> >> they are also able to submit jobs to partition1, which we don't
>> >>> >> >> want.  Does that maybe have something to do with QOS?  I want these
>> >>> >> >> users to only be able to submit jobs to partition2, not partition1.
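Keeping accounts out of a partition is often handled in slurm.conf itself rather than (or in addition to) associations. An illustrative fragment, with partition and account names from this thread and a hypothetical node list for partition1:

```shell
# Sketch only: AllowAccounts restricts who may submit to partition2;
# DenyAccounts keeps the two groups off partition1.
PartitionName=partition2 Nodes=node[313-314] AllowAccounts=sa,cig State=UP
PartitionName=partition1 Nodes=<partition1_nodes> DenyAccounts=sa,cig State=UP
```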
>> >>> >> >>
>> >>> >> >> # sacctmgr list qos
>> >>> >> >>       Name   Priority  GraceTime    Preempt PreemptMode
>> >>> >> >>                     Flags UsageThres UsageFactor       GrpTRES
>> >>> >> >> GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall
>> >>> >> >> MaxTRES
>> >>> >> >> MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU MaxJobsPU
>> >>> >> >> MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA       MinTRES
>> >>> >> >> ---------- ---------- ---------- ---------- -----------
>> >>> >> >> ---------------------------------------- ---------- -----------
>> >>> >> >> ------------- ------------- ------------- ------- ---------
>> >>> >> >> ----------- ------------- -------------- -------------
>> -----------
>> >>> >> >> ------------- --------- ----------- ------------- ---------
>> >>> >> >> ----------- -------------
>> >>> >> >>     normal          0   00:00:00                cluster
>> >>> >> >>                                         1.000000
>> >>> >> >>
>> >>> >> >> # sacctmgr list user  | grep -i user1
>> >>> >> >>    user1   research      None
>> >>> >> >>
>> >>> >> >> Thanks.
>> >>> >> >>
>> >>> >> >> On Fri, Jun 16, 2017 at 9:33 AM, Barry Moore <[email protected]> wrote:
>> >>> >> >>> Jagga,
>> >>> >> >>>
>> >>> >> >>> You got it. Something along the lines of:
>> >>> >> >>>
>> >>> >> >>> # Add users to accounts, partition
>> >>> >> >>> sacctmgr add user <users> account=<account> partition=<whatever>
>> >>> >> >>>
>> >>> >> >>> # Set the association limits.  I think you can do something similar
>> >>> >> >>> # to gres/gpu=-1 for the super users (assuming they are in the same
>> >>> >> >>> # account?).
>> >>> >> >>> sacctmgr modify account where partition=<partition> set grptres=gres/gpu=8
>> >>> >> >>> sacctmgr modify account where partition=<partition> account=root set grptres=gres/gpu=-1
>> >>> >> >>>
>> >>> >> >>> This will work as long as your super users aren't under an account
>> >>> >> >>> you want to enforce.  Maybe someone has a nicer solution for that.
>> >>> >> >>>
>> >>> >> >>> Hope that helps,
>> >>> >> >>>
>> >>> >> >>> Barry
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> On Fri, Jun 16, 2017 at 12:18 PM, Jagga Soorma <[email protected]> wrote:
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>> Hello,
>> >>> >> >>>>
>> >>> >> >>>> I have a new 17.02.2 Slurm environment with only one partition and
>> >>> >> >>>> now have the following requirement that I need to implement:
>> >>> >> >>>>
>> >>> >> >>>> --
>> >>> >> >>>> Create a new partition (partition2) with 2 nodes.  This new
>> >>> >> >>>> partition should be restricted, i.e. users cannot submit to it
>> >>> >> >>>> unless they are authorized.
>> >>> >> >>>>
>> >>> >> >>>> The two groups of users, each of whom should be authorized, are:
>> >>> >> >>>>
>> >>> >> >>>> group1: SA
>> >>> >> >>>> users: user1, user2
>> >>> >> >>>>
>> >>> >> >>>> group2: CIG
>> >>> >> >>>> users: user3, user4, user5
>> >>> >> >>>>
>> >>> >> >>>> Each group should be allowed no more than 8 gres/gpus at a time.
>> >>> >> >>>> The limit should be at the group level, not the individual user.
>> >>> >> >>>>
>> >>> >> >>>> These users should not be able to submit jobs to the first
>> >>> >> >>>> partition (partition1), our original partition on this cluster.
>> >>> >> >>>>
>> >>> >> >>>> Also, please allow the super users (userX, userY, userZ) to submit
>> >>> >> >>>> jobs to this partition without limits, for
>> >>> >> >>>> engineering/troubleshooting purposes.
>> >>> >> >>>> --
>> >>> >> >>>>
>> >>> >> >>>> I am sure this is doable via associations in slurmdbd but was
>> >>> >> >>>> wondering if that is the correct place to start.  We have only
>> >>> >> >>>> implemented simple accounts with slurmdbd, so I wanted to make sure
>> >>> >> >>>> I do this the right way and would appreciate any help with this.
>> >>> >> >>>>
>> >>> >> >>>> Thanks!
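For what it's worth, the requirement above might be sketched end to end roughly as follows. This assumes AccountingStorageTRES=gres/gpu is configured and the daemons restarted; account, user, and partition names come from the requirement, and the root-account trick for super users is an untested assumption.

```shell
# Accounts and users (names from the stated requirement)
sacctmgr add account SA CIG
sacctmgr add user user1 user2 account=SA partition=partition2
sacctmgr add user user3 user4 user5 account=CIG partition=partition2
# Group-level (per-account, not per-user) cap of 8 GPUs each
sacctmgr modify account where account=SA set GrpTRES=gres/gpu=8
sacctmgr modify account where account=CIG set GrpTRES=gres/gpu=8
# Super users: a separate, unlimited association, e.g. under root
sacctmgr add user userX userY userZ account=root partition=partition2
```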
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> --
>> >>> >> >>> Barry E Moore II, PhD
>> >>> >> >>> E-mail: [email protected]
>> >>> >> >>>
>> >>> >> >>> Assistant Research Professor
>> >>> >> >>> Center for Simulation and Modeling
>> >>> >> >>> University of Pittsburgh
>> >>> >> >>> Pittsburgh, PA 15260
>> >>> >
>> >>
>>
>


