Re: [slurm-users] how to restrict jobs

2020-05-07 Thread Daniel Letai

On 06/05/2020 20:44, Mark Hahn wrote:

> > Is there no way to set or define a custom variable like at node level and
>
> you could use a per-node Feature for this, but a partition would also work.

A bit of an ugly hack, but you could use QoS (requires accounting) to
enforce this:

1. Create a QoS (using sacctmgr) with GrpTRES=Node=4
2. Create a new partition identical to the current one, but with the new QoS
3. Instruct users to submit any job requiring the license to the new partition.

This will not solve the issue of fragmentation due to non-licensed jobs - for
that you should enable a packing scheduler via
SelectTypeParameters=CR_Pack_Nodes
(https://slurm.schedmd.com/slurm.conf.html#OPT_CR_Pack_Nodes).
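
A minimal sketch of the above, untested -- the QoS, partition and node names
are placeholders, and AccountingStorageEnforce in slurm.conf must include
"qos" or "limits" for the cap to actually be enforced:

  # 1. QoS capped at 4 nodes in total across all jobs running under it
  sacctmgr add qos lic4nodes GrpTRES=node=4

  # 2. slurm.conf: clone of the existing partition, tied to the QoS;
  #    append CR_Pack_Nodes to whatever CR_* value is already configured
  PartitionName=lic Nodes=node[01-20] QOS=lic4nodes State=UP
  SelectTypeParameters=CR_Core_Memory,CR_Pack_Nodes

  # 3. jobs that need the license target the new partition
  sbatch -p lic -n 4 job.sh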

  




Re: [slurm-users] how to restrict jobs

2020-05-06 Thread Mark Hahn

> Is there no way to set or define a custom variable like at node level and


you could use a per-node Feature for this, but a partition would also work.
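
For example, an untested sketch -- the feature and node names are placeholders:

  # slurm.conf: tag the nodes allowed to run the licensed application
  NodeName=node[01-04] Feature=applic

  # submission: constrain the job to the tagged nodes
  sbatch --constraint=applic -n 4 job.sh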



Re: [slurm-users] how to restrict jobs

2020-05-06 Thread navin srivastava
Is there no way to set or define a custom variable at the node level, and
then pass the same variable in the job request so that the job will land
only on those nodes?


Regards
Navin

Re: [slurm-users] how to restrict jobs

2020-05-06 Thread Renfro, Michael
Ok, then regular license accounting won’t work.

Somewhat tested, but should work or at least be a starting point. Given a job 
number JOBID that’s already running with this license on one or more nodes:

  sbatch -w $(scontrol show job JOBID | grep ' NodeList=' | cut -d= -f2) -N 1

should start a one-node job on an available node being used by JOBID. Add other 
parameters as required for cpus-per-task, time limits, or whatever else is 
needed. If you start the larger jobs first, and let the later jobs fill in on 
idle CPUs on those nodes, it should work.
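
An equivalent untested one-liner, using squeue's %N nodelist output instead
of parsing scontrol (JOBID is again a placeholder):

  sbatch -w "$(squeue -h -j JOBID -o %N)" -N 1 job.sh

Either form ties the later job to nodes the earlier job already occupies, so
it stays pending until one of those nodes has free cores.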


Re: [slurm-users] how to restrict jobs

2020-05-06 Thread navin srivastava
To explain in more detail:

A job is submitted by core count and can land on any nodes, but the license
is limited to 4 nodes (the license has some intelligence: it counts the
nodes in use, and once it reaches 4 it will not allow any more nodes; it
does not depend on the number of cores available on the nodes).

Case 1: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3
and node4]. If Slurm assigns a fifth 4-core job to any one of node1, node2,
node3 or node4, the license is allowed.

Case 2: 4 jobs are running with 4 cores each on 4 nodes [node1, node2, node3
and node4]. If Slurm assigns the fifth 4-core job to node5, the license is
not allowed [a "license not found" error comes up in this case].

Regards
Navin.


Re: [slurm-users] how to restrict jobs

2020-05-06 Thread Renfro, Michael
To make sure I’m reading this correctly, you have a software license that lets 
you run jobs on up to 4 nodes at once, regardless of how many CPUs you use? 
That is, you could run any one of the following sets of jobs:

- four 1-node jobs,
- two 2-node jobs,
- one 1-node and one 3-node job,
- two 1-node and one 2-node jobs,
- one 4-node job,

simultaneously? And the license isn’t node-locked to specific nodes by MAC 
address or anything similar? But if you try to run jobs beyond what I’ve listed 
above, you run out of licenses, and you want those later jobs to be held until 
licenses are freed up?

If all of those questions have an answer of ‘yes’, I think you want the remote
license part of https://slurm.schedmd.com/licenses.html, something like:

  sacctmgr add resource name=software_name count=4 percentallowed=100
server=flex_host servertype=flexlm type=license

and submit jobs with a ‘-L software_name:N’ flag, where N is the number of
nodes you want to run on.
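
Submission and a quick check might then look like this -- untested, with
software_name and the counts taken from the sacctmgr line above:

  sbatch -L software_name:2 -N 2 job.sh   # holds 2 of the 4 node-licenses
  scontrol show licenses                  # totals as slurmctld sees them

Jobs asking for more licenses than are free should stay pending with reason
"Licenses" until enough are released.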




Re: [slurm-users] how to restrict jobs

2020-05-06 Thread navin srivastava
Thanks Michael.

Actually, one application's licenses are node-based: we have a 4-node
license (not fixed to specific nodes). We have several nodes, but once a job
lands on any 4 random nodes it runs on those nodes only; after that, it
fails if it goes to other nodes.

Can we define a custom variable, set it at the node level, and have the user
pass that variable at submission so that the job lands only on those
specific nodes?
I do not want to create a separate partition.

Is there any way to achieve this by any other method?

Regards
Navin.



Re: [slurm-users] how to restrict jobs

2020-05-05 Thread Renfro, Michael
Haven’t done it yet myself, but it’s on my todo list.

But I’d assume that if you use the FlexLM or RLM parts of that documentation,
Slurm would query the remote license server periodically and hold the job
until the necessary licenses were available.
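
For what it's worth, an untested cron-style sketch of that sync idea -- it
assumes FlexLM's lmutil is on the path, a feature actually named
software_name, and the classic "Users of ...: ... N licenses in use" lmstat
output; the parsing and the total of 4 are site-specific placeholders:

  # licenses currently checked out, per the license server
  used=$(lmutil lmstat -c 27000@flex_host -f software_name \
         | awk '/Users of software_name/ {print $11}')
  # naive update of the pool Slurm may hand out; a real script would need
  # to separate external checkouts from Slurm's own jobs first
  sacctmgr -i modify resource name=software_name server=flex_host \
         set count=$((4 - used))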




Re: [slurm-users] how to restrict jobs

2020-05-05 Thread navin srivastava
Thanks Michael,

Yes, I have gone through it, but the licenses are remote licenses and will
be used outside Slurm as well, not only within it.
So basically I am interested in how we can update the database dynamically
to get the exact value at that point in time.
I mean: query the license server and update the database accordingly. Does
Slurm automatically update the value based on usage?


Regards
Navin.




Re: [slurm-users] how to restrict jobs

2020-05-05 Thread Renfro, Michael
Have you seen https://slurm.schedmd.com/licenses.html already? If the software 
is just for use inside the cluster, one Licenses= line in slurm.conf plus users 
submitting with the -L flag should suffice. You should be able to set that 
license value to 4 if it’s licensed per node and you can run up to 4 jobs 
simultaneously, or to 4*NCPUS if it’s licensed per CPU, or to 1 if it’s a 
single license good for one run on 1-4 nodes.

There are also options to query a FlexLM or RLM server for license management.
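
A minimal local-license sketch of the first option, untested; software_name
is a placeholder:

  # slurm.conf: four license tokens known only to this cluster
  Licenses=software_name:4

  # each run checks out one token and queues when none are free
  sbatch -L software_name:1 job.sh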

-- 
Mike Renfro, PhD / HPC Systems Administrator, Information Technology Services
931 372-3601 / Tennessee Tech University

> On May 5, 2020, at 7:54 AM, navin srivastava  wrote:
> 
> Hi Team,
> 
> we have an application whose licenses are limited: it scales up to 4 
> nodes (~80 cores).
> so if 4 nodes are full, a job landing on a 5th node fails.
> we want to put a restriction so that the application can't execute beyond 
> the 4 nodes; instead of failing, the job should stay in the queue.
> i do not want to keep a separate partition to achieve this. is there a way 
> to achieve this scenario using some dynamic resource which can check the 
> license count on the fly and, once the limit is reached, keep the job in 
> the queue?
> 
> Regards
> Navin.