Jagga,

It's possible you need the following in your Slurm configuration. To be honest, I am guessing, but it is the only GRES-related setting in my slurm.conf. Also note I am on 16.05.6.
AccountingStorageTRES=gres/gpu

- Barry

On Sun, Jun 18, 2017 at 9:13 PM, Jagga Soorma <[email protected]> wrote:
> Hi Guys,
>
> I can't get this to work:
>
> # sacctmgr modify account account=cig set GrpTRES=gres/gpu=8
> Unknown option: GrpTRES=gres/gpu=8
> Use keyword 'where' to modify condition
>
> The GrpTRES=cpu=x option works, but for some reason GrpTRES=gres/gpu=x
> does not. Is there some config option I am missing to enable this
> capability?
>
> Thanks.
>
> On Sun, Jun 18, 2017 at 5:57 PM, Jagga Soorma <[email protected]> wrote:
>> Hey Barry,
>>
>> The problem is that the following command does not work for me:
>>
>> --
>> # sacctmgr modify account where Partition=gred set grptres=gres/gpu=8
>> Unknown option: grptres=gres/gpu=8
>> Use keyword 'where' to modify condition
>> --
>>
>> It keeps saying that grptres=gres/gpu=8 is an unknown option. Am I not
>> doing this the right way?
>>
>> Thanks.
>>
>> On Fri, Jun 16, 2017 at 11:54 AM, Barry Moore <[email protected]> wrote:
>>> You don't appear to have any associations to the partitions. It seems
>>> like your associations aren't in place correctly.
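[Editor's note: for reference, a minimal sketch of the slurm.conf pieces that need to be in place before gres/gpu can be used as a TRES in accounting limits. The GresTypes and NodeName lines are assumptions about the cluster layout, not taken from this thread, and slurmctld needs a restart after changing AccountingStorageTRES so the new TRES is registered with slurmdbd:

--
# slurm.conf (fragment)
GresTypes=gpu
AccountingStorageTRES=gres/gpu
NodeName=node[313-314] Gres=gpu:8 State=UNKNOWN
--
]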
>>>
>>> On Fri, Jun 16, 2017 at 1:24 PM, Jagga Soorma <[email protected]> wrote:
>>>>
>>>> I don't see anything here:
>>>>
>>>> # sacctmgr show assoc partition=gred_sa
>>>>    Cluster    Account       User  Partition     Share   GrpJobs
>>>> GrpTRES  GrpSubmit     GrpWall   GrpTRESMins   MaxJobs   MaxTRES
>>>> MaxTRESPerNode  MaxSubmit     MaxWall   MaxTRESMins
>>>> QOS   Def QOS   GrpTRESRunMin
>>>> ---------- ---------- ---------- ---------- --------- -------
>>>> ------------- --------- ----------- ------------- -------
>>>> ------------- -------------- --------- ----------- -------------
>>>> -------------------- --------- -------------
>>>>
>>>> # sacctmgr list user | grep -i restst
>>>>      user1         SA      None
>>>>      user2         SA      None
>>>>      user3        CIG      None
>>>>      user4        CIG      None
>>>>
>>>> Here is what I have also tried, without success:
>>>>
>>>> sacctmgr add qos gpu GrpTRES=GRES/gpu=8 Flags=OverPartQos
>>>>
>>>> PartitionName=gred Nodes=node[313-314] Default=NO MaxTime=INFINITE
>>>> State=UP DefMemPerCPU=2048 QOS=gpu
>>>>
>>>> sacctmgr modify account SA set qos=gpu
>>>> sacctmgr modify account CIG set qos=gpu
>>>>
>>>> Now, when I submit a second job by user1 asking for 8 GPUs, I was
>>>> hoping it would go into the queue, but it still runs:
>>>>
>>>> srun --gres=gpu:8 -p gred --pty bash
>>>>
>>>> Thanks again for your help.
>>>>
>>>> On Fri, Jun 16, 2017 at 10:08 AM, Barry Moore <[email protected]> wrote:
>>>>> Are there accounts with associations to that partition?
>>>>>
>>>>> sacctmgr show assoc partition=SA
>>>>>
>>>>> Sorry, I am not doing partition-level management like this. I
>>>>> presumed it would be similar to how I set up association limits per
>>>>> account per cluster.
>>>>>
>>>>> On Fri, Jun 16, 2017 at 12:57 PM, Jagga Soorma <[email protected]> wrote:
>>>>>>
>>>>>> So, it looks like I am unable to set the GrpTRES on the account.
>>>>>> Shouldn't this be set with a QOS instead?
>>>>>>
>>>>>> # sacctmgr modify account where partition=SA set GrpTRES=GRES/gpu=8
>>>>>> Unknown option: GrpTRES=GRES/gpu=8
>>>>>> Use keyword 'where' to modify condition
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Fri, Jun 16, 2017 at 9:52 AM, Jagga Soorma <[email protected]> wrote:
>>>>>>> Ahh, my apologies Barry. It looks like I accidentally did not
>>>>>>> choose the correct partition, which was causing this error. It
>>>>>>> seems to do the right thing when I pick the correct partition
>>>>>>> during sacctmgr add user. Will try the rest.
>>>>>>>
>>>>>>> Thanks again for your help with this! Much appreciated!
>>>>>>>
>>>>>>> On Fri, Jun 16, 2017 at 9:49 AM, Jagga Soorma <[email protected]> wrote:
>>>>>>>> Thanks for your response Barry. But as soon as I add this user
>>>>>>>> using:
>>>>>>>>
>>>>>>>> sacctmgr add user user1 account=SA partition=partition2
>>>>>>>>
>>>>>>>> they are also able to submit jobs to partition1, which we don't
>>>>>>>> want. Does that maybe have something to do with QOS? I want these
>>>>>>>> users to only be able to submit jobs to partition2 and not
>>>>>>>> partition1.
>>>>>>>>
>>>>>>>> # sacctmgr list qos
>>>>>>>>       Name   Priority  GraceTime    Preempt PreemptMode
>>>>>>>> Flags UsageThres UsageFactor       GrpTRES
>>>>>>>> GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit     GrpWall
>>>>>>>> MaxTRES MaxTRESPerNode   MaxTRESMins     MaxWall     MaxTRESPU
>>>>>>>> MaxJobsPU MaxSubmitPU     MaxTRESPA MaxJobsPA MaxSubmitPA
>>>>>>>> MinTRES
>>>>>>>> ---------- ---------- ---------- ---------- -----------
>>>>>>>> ---------------------------------------- ---------- -----------
>>>>>>>> ------------- ------------- ------------- ------- ---------
>>>>>>>> ----------- ------------- -------------- ------------- -----------
>>>>>>>> ------------- --------- ----------- ------------- ---------
>>>>>>>> ----------- -------------
>>>>>>>>     normal          0   00:00:00               cluster
>>>>>>>> 1.000000
>>>>>>>>
>>>>>>>> # sacctmgr list user | grep -i user1
>>>>>>>>      user1   research      None
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Fri, Jun 16, 2017 at 9:33 AM, Barry Moore <[email protected]> wrote:
>>>>>>>>> Jagga,
>>>>>>>>>
>>>>>>>>> You got it. Something along the lines of:
>>>>>>>>>
>>>>>>>>> # Add users to accounts, partition
>>>>>>>>> sacctmgr add user <users> account=<account> partition=<whatever>
>>>>>>>>>
>>>>>>>>> # Set the association limits. I think you can do something
>>>>>>>>> # similar to gres/gpu=-1 for the super users (assuming they are
>>>>>>>>> # in the same account?).
>>>>>>>>> sacctmgr modify account where partition=<partition> set
>>>>>>>>> grptres=gres/gpu=8
>>>>>>>>> sacctmgr modify account where partition=<partition> account=root
>>>>>>>>> set grptres=gres/gpu=-1
>>>>>>>>>
>>>>>>>>> This will work as long as your super users aren't under an
>>>>>>>>> account you want to enforce. Maybe someone has a nicer solution
>>>>>>>>> for that.
>>>>>>>>>
>>>>>>>>> Hope that helps,
>>>>>>>>>
>>>>>>>>> Barry
>>>>>>>>>
>>>>>>>>> On Fri, Jun 16, 2017 at 12:18 PM, Jagga Soorma <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> I have a new 17.02.2 Slurm environment with only one partition
>>>>>>>>>> and now have the following requirement that I need to implement:
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Create a new partition (partition2) with 2 nodes. This new
>>>>>>>>>> partition should be restricted, i.e. users cannot submit to it
>>>>>>>>>> unless they are authorized.
>>>>>>>>>>
>>>>>>>>>> The two groups of users, each of whom should be authorized, are:
>>>>>>>>>>
>>>>>>>>>> group1: SA
>>>>>>>>>> users: user1, user2
>>>>>>>>>>
>>>>>>>>>> group2: CIG
>>>>>>>>>> users: user3, user4, user5
>>>>>>>>>>
>>>>>>>>>> Each group should be allowed no more than 8 gres/gpus at a time.
>>>>>>>>>> The limit should be at the group level, not the individual user.
>>>>>>>>>>
>>>>>>>>>> These users should not be able to submit jobs to the first
>>>>>>>>>> partition (partition1). This is our original partition on this
>>>>>>>>>> cluster.
>>>>>>>>>>
>>>>>>>>>> Also, please allow super users (userX, userY, userZ) to submit
>>>>>>>>>> jobs to this partition without limits, for
>>>>>>>>>> engineering/troubleshooting purposes.
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> I am sure this is doable via associations in slurmdbd, but I was
>>>>>>>>>> wondering if that is the correct place to start. We have only
>>>>>>>>>> implemented simple accounts with slurmdbd, so I wanted to make
>>>>>>>>>> sure I do this the right way and would appreciate any help with
>>>>>>>>>> this.
>>>>>>>>>>
>>>>>>>>>> Thanks!
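[Editor's note: one possible end-to-end sketch of the requirement above, untested. Account and user names follow the requirement; the "eng" account for super users and the partition2 node list are assumptions. AllowAccounts is the slurm.conf mechanism for restricting who may submit to a partition; keeping these users out of partition1 would additionally need AllowAccounts/DenyAccounts on partition1, or partition-scoped associations as discussed in this thread:

--
# Create the two accounts and add the users to them
sacctmgr add account SA
sacctmgr add account CIG
sacctmgr add user names=user1,user2 account=SA
sacctmgr add user names=user3,user4,user5 account=CIG

# Group-level GPU limit: 8 gres/gpu per account, shared by its users
sacctmgr modify account where account=SA set GrpTRES=gres/gpu=8
sacctmgr modify account where account=CIG set GrpTRES=gres/gpu=8

# Super users go in their own account, with no GrpTRES limit set
sacctmgr add account eng
sacctmgr add user names=userX,userY,userZ account=eng

# slurm.conf: only these accounts may use partition2
PartitionName=partition2 Nodes=node[313-314] AllowAccounts=SA,CIG,eng State=UP
--

Note that AccountingStorageEnforce in slurm.conf must include "limits" (which implies "associations"), or the GrpTRES limit is silently ignored.]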
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Barry E Moore II, PhD
>>>>>>>>> E-mail: [email protected]
>>>>>>>>>
>>>>>>>>> Assistant Research Professor
>>>>>>>>> Center for Simulation and Modeling
>>>>>>>>> University of Pittsburgh
>>>>>>>>> Pittsburgh, PA 15260
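[Editor's note: if the limit still does not kick in, as in the srun test earlier in the thread, a few things are worth checking. These are standard sacctmgr/scontrol invocations; the "gpu" QOS and "gred" partition names follow the thread:

--
# Is the QOS limit actually recorded?
sacctmgr show qos gpu format=Name,GrpTRES,Flags

# Is the QOS attached to the partition?
scontrol show partition gred

# Are limits being enforced at all?
scontrol show config | grep AccountingStorageEnforce
--
]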
