Jagga,

Don't forget to restart the controller.
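(For a systemd-managed install, that restart would look something like the lines below; adjust for your init system and the node running the controller.)

--
# Pick up the slurm.conf change on the controller node:
systemctl restart slurmctld
--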
- Barry

On Mon, Jun 19, 2017 at 9:47 AM, Barry Moore <[email protected]> wrote:
> Jagga,
>
> It's possible you need the following in your Slurm configuration. To be
> honest, I am guessing, but it is the only thing in my slurm.conf about
> GRES. Also note I am on 16.05.6.
>
> AccountingStorageTRES=gres/gpu
>
> - Barry
>
> On Sun, Jun 18, 2017 at 9:13 PM, Jagga Soorma <[email protected]> wrote:
>>
>> Hi Guys,
>>
>> I can't get this to work:
>>
>> # sacctmgr modify account account=cig set GrpTRES=gres/gpu=8
>> Unknown option: GrpTRES=gres/gpu=8
>> Use keyword 'where' to modify condition
>>
>> The GrpTRES=cpu=x option works, but for some reason GrpTRES=gres/gpu=x
>> does not. Is there some config option I am missing to enable this
>> capability?
>>
>> Thanks.
>>
>> On Sun, Jun 18, 2017 at 5:57 PM, Jagga Soorma <[email protected]> wrote:
>> >
>> > Hey Barry,
>> >
>> > The problem is that the following command does not work for me:
>> >
>> > --
>> > # sacctmgr modify account where Partition=gred set grptres=gres/gpu=8
>> > Unknown option: grptres=gres/gpu=8
>> > Use keyword 'where' to modify condition
>> > --
>> >
>> > It keeps saying that grptres=gres/gpu=8 is an unknown option. Am I
>> > not doing this the right way?
>> >
>> > Thanks.
>> >
>> > On Fri, Jun 16, 2017 at 11:54 AM, Barry Moore <[email protected]> wrote:
>> >> You don't appear to have any associations to the partitions. Seems
>> >> like your associations aren't in place correctly.
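For what it's worth, Barry's suggestion boils down to something like the following (the account name is the one used earlier in the thread; after editing slurm.conf, the controller needs a restart before sacctmgr will accept the new TRES):

--
# slurm.conf: make the accounting database track GPU GRES.  Without
# this, sacctmgr rejects gres/gpu in GrpTRES as an "Unknown option".
AccountingStorageTRES=gres/gpu

# After restarting slurmctld, the group limit should be accepted:
sacctmgr modify account where account=cig set GrpTRES=gres/gpu=8
--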
>> >>
>> >> On Fri, Jun 16, 2017 at 1:24 PM, Jagga Soorma <[email protected]> wrote:
>> >>>
>> >>> I don't see anything here:
>> >>>
>> >>> # sacctmgr show assoc partition=gred_sa
>> >>>    Cluster    Account       User  Partition     Share GrpJobs
>> >>>       GrpTRES GrpSubmit     GrpWall   GrpTRESMins MaxJobs   MaxTRES
>> >>> MaxTRESPerNode MaxSubmit     MaxWall  MaxTRESMins
>> >>>                  QOS   Def QOS GrpTRESRunMin
>> >>> ---------- ---------- ---------- ---------- --------- -------
>> >>> ------------- --------- ----------- ------------- -------
>> >>> ------------- -------------- --------- ----------- -------------
>> >>> -------------------- --------- -------------
>> >>>
>> >>> # sacctmgr list user | grep -i restst
>> >>>      user1         SA      None
>> >>>      user2         SA      None
>> >>>      user3        CIG      None
>> >>>      user4        CIG      None
>> >>>
>> >>> Here is what I have also tried, without success:
>> >>>
>> >>> sacctmgr add qos gpu GrpTRES=GRES/gpu=8 Flags=OverPartQOS
>> >>>
>> >>> PartitionName=gred Nodes=node[313-314] Default=NO MaxTime=INFINITE
>> >>> State=UP DefMemPerCPU=2048 QOS=gpu
>> >>>
>> >>> sacctmgr modify account SA set qos=gpu
>> >>> sacctmgr modify account CIG set qos=gpu
>> >>>
>> >>> Now when I submit a second job by user1 asking for 8 GPUs, I was
>> >>> hoping it would go in the queue, but it still runs:
>> >>>
>> >>> srun --gres=gpu:8 -p gred --pty bash
>> >>>
>> >>> Thanks again for your help.
>> >>>
>> >>> On Fri, Jun 16, 2017 at 10:08 AM, Barry Moore <[email protected]> wrote:
>> >>> > Are there accounts with associations to that partition?
>> >>> >
>> >>> > sacctmgr show assoc partition=SA
>> >>> >
>> >>> > Sorry, I am not doing partition-level management like this. I
>> >>> > presumed it would be similar to how I set up association limits
>> >>> > per account per cluster.
>> >>> >
>> >>> > On Fri, Jun 16, 2017 at 12:57 PM, Jagga Soorma <[email protected]> wrote:
>> >>> >>
>> >>> >> So, it looks like I am unable to set the grptres on the account.
>> >>> >> Shouldn't this be set with a QOS instead?
>> >>> >>
>> >>> >> # sacctmgr modify account where partition=SA set GrpTRES=GRES/gpu=8
>> >>> >> Unknown option: GrpTRES=GRES/gpu=8
>> >>> >> Use keyword 'where' to modify condition
>> >>> >>
>> >>> >> Thanks.
>> >>> >>
>> >>> >> On Fri, Jun 16, 2017 at 9:52 AM, Jagga Soorma <[email protected]> wrote:
>> >>> >> > Ahh, my apologies Barry. Looks like I accidentally did not
>> >>> >> > choose the correct partition, which was causing this error. It
>> >>> >> > seems to do the right thing when I pick the correct partition
>> >>> >> > during sacctmgr add user. Will try the rest.
>> >>> >> >
>> >>> >> > Thanks again for your help with this! Much appreciated!
>> >>> >> >
>> >>> >> > On Fri, Jun 16, 2017 at 9:49 AM, Jagga Soorma <[email protected]> wrote:
>> >>> >> >> Thanks for your response Barry. But as soon as I add this user
>> >>> >> >> using:
>> >>> >> >>
>> >>> >> >> sacctmgr add user user1 account=SA partition=partition2
>> >>> >> >>
>> >>> >> >> they are also able to submit jobs to partition1, which we don't
>> >>> >> >> want. Does that maybe have something to do with QOS? I want
>> >>> >> >> these users to only be able to submit jobs to partition2 and
>> >>> >> >> not partition1.
>> >>> >> >>
>> >>> >> >> # sacctmgr list qos
>> >>> >> >>       Name   Priority  GraceTime    Preempt PreemptMode
>> >>> >> >>      Flags UsageThres UsageFactor    GrpTRES
>> >>> >> >>   GrpTRESMins GrpTRESRunMin GrpJobs GrpSubmit   GrpWall   MaxTRES
>> >>> >> >> MaxTRESPerNode  MaxTRESMins     MaxWall  MaxTRESPU MaxJobsPU
>> >>> >> >> MaxSubmitPU  MaxTRESPA MaxJobsPA MaxSubmitPA    MinTRES
>> >>> >> >> ---------- ---------- ---------- ---------- -----------
>> >>> >> >>     normal          0   00:00:00                cluster
>> >>> >> >>   1.000000
>> >>> >> >>
>> >>> >> >> # sacctmgr list user | grep -i user1
>> >>> >> >>      user1   research      None
>> >>> >> >>
>> >>> >> >> Thanks.
>> >>> >> >>
>> >>> >> >> On Fri, Jun 16, 2017 at 9:33 AM, Barry Moore <[email protected]> wrote:
>> >>> >> >>> Jagga,
>> >>> >> >>>
>> >>> >> >>> You got it. Something along the lines of:
>> >>> >> >>>
>> >>> >> >>> # Add users to accounts, partition
>> >>> >> >>> sacctmgr add user <users> account=<account> partition=<whatever>
>> >>> >> >>>
>> >>> >> >>> # Set the association limits. I think you can do something
>> >>> >> >>> # similar with gres/gpu=-1 for the super users (assuming they
>> >>> >> >>> # are in the same account?).
>> >>> >> >>> sacctmgr modify account where partition=<partition> set grptres=gres/gpu=8
>> >>> >> >>> sacctmgr modify account where partition=<partition> account=root set grptres=gres/gpu=-1
>> >>> >> >>>
>> >>> >> >>> This will work as long as your super users aren't under an
>> >>> >> >>> account you want to enforce. Maybe someone has a nicer solution
>> >>> >> >>> for that.
>> >>> >> >>>
>> >>> >> >>> Hope that helps,
>> >>> >> >>>
>> >>> >> >>> Barry
>> >>> >> >>>
>> >>> >> >>> On Fri, Jun 16, 2017 at 12:18 PM, Jagga Soorma <[email protected]> wrote:
>> >>> >> >>>>
>> >>> >> >>>> Hello,
>> >>> >> >>>>
>> >>> >> >>>> I have a new 17.02.2 Slurm environment with only one partition
>> >>> >> >>>> and now have the following requirement that I need to
>> >>> >> >>>> implement:
>> >>> >> >>>>
>> >>> >> >>>> --
>> >>> >> >>>> Create a new partition (partition2) with 2 nodes. This new
>> >>> >> >>>> partition should be restricted, i.e. users cannot submit to it
>> >>> >> >>>> unless they are authorized.
>> >>> >> >>>>
>> >>> >> >>>> The two groups of users, each of whom should be authorized, are:
>> >>> >> >>>>
>> >>> >> >>>> group1: SA
>> >>> >> >>>> users: user1, user2
>> >>> >> >>>>
>> >>> >> >>>> group2: CIG
>> >>> >> >>>> users: user3, user4, user5
>> >>> >> >>>>
>> >>> >> >>>> Each group should be allowed no more than 8 gres/gpus at a
>> >>> >> >>>> time. The limit should be at the group level, not the
>> >>> >> >>>> individual user.
>> >>> >> >>>>
>> >>> >> >>>> These users should not be able to submit jobs to the first
>> >>> >> >>>> partition (partition1). This is our original partition on this
>> >>> >> >>>> cluster.
>> >>> >> >>>>
>> >>> >> >>>> Also, please allow super users (userX, userY, userZ) to submit
>> >>> >> >>>> jobs to this partition without limits, for
>> >>> >> >>>> engineering/troubleshooting purposes.
>> >>> >> >>>> --
>> >>> >> >>>>
>> >>> >> >>>> I am sure this is doable via associations in slurmdbd, but I
>> >>> >> >>>> was wondering if that is the correct place to start. We have
>> >>> >> >>>> only implemented simple accounts with slurmdbd, so I wanted to
>> >>> >> >>>> make sure I do this the right way and would appreciate any
>> >>> >> >>>> help with this.
>> >>> >> >>>>
>> >>> >> >>>> Thanks!

--
Barry E Moore II, PhD
E-mail: [email protected]

Assistant Research Professor
Center for Simulation and Modeling
University of Pittsburgh
Pittsburgh, PA 15260
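Pulling the thread together: with AccountingStorageTRES=gres/gpu in slurm.conf (and slurmctld restarted), one possible way to meet the original requirement is sketched below. The account, user, and node names come from the thread; the AllowAccounts restriction on the partition line and the association-level GrpTRES limits are my assumptions about how a site might wire this up, not a tested recipe. Note the limits are only enforced if AccountingStorageEnforce includes associations and limits.

--
# slurm.conf: track GPU TRES, enforce limits, and restrict the new
# partition to the two accounts (plus root for the super users):
AccountingStorageTRES=gres/gpu
AccountingStorageEnforce=associations,limits
PartitionName=partition2 Nodes=node[313-314] Default=NO MaxTime=INFINITE \
    State=UP DefMemPerCPU=2048 AllowAccounts=sa,cig,root

# Accounts and users (one association per user, on partition2 only):
sacctmgr add account sa,cig
sacctmgr add user user1,user2 account=sa partition=partition2
sacctmgr add user user3,user4,user5 account=cig partition=partition2

# 8-GPU group limit per account, and -1 (unlimited) for the super
# users' account.  These caps apply to each account as a whole, not
# to individual users:
sacctmgr modify account where account=sa set GrpTRES=gres/gpu=8
sacctmgr modify account where account=cig set GrpTRES=gres/gpu=8
sacctmgr modify account where account=root set GrpTRES=gres/gpu=-1
--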
