On Fri, 7 Aug 2015, [email protected] wrote:

Hello,

We use a licensed software package (let's call it SOFT1) whose license is
tied to the hardware configuration. The license also limits the number of
instances of the software that can run at the same time.

On our cluster, we want to run 2 instances of the software. Each instance 
consumes 1 core. Our nodes have 24 cores, and other software is also deployed 
and run on the same nodes as SOFT1.
We use the cons_res resource selection plugin (CR_CPU type).

Our current node / partition definition is as follows:

NodeName=calc[1:11] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN
PartitionName=PDEF Nodes=calc[2:11] Default=YES MaxTime=INFINITE State=UP
PartitionName=PSOFT1 Nodes=calc1 Default=NO MaxTime=INFINITE State=UP
PartitionName=PSOFT2 Nodes=calc1 Default=NO MaxTime=INFINITE State=UP

The SOFT1 license is installed on calc1. Another software package, which has 
no license limitation, is also installed on calc1.

To limit the number of concurrent software instances, we have set the number 
of licenses:

Licenses:SOFT1*2,SOFT2*22

(SOFT2*22 is just to ensure that the 2 instances of SOFT1 can be run at any 
time)

SOFT1 jobs are submitted by passing the partition parameter "-pPSOFT1" to 
sbatch.

Everything works fine if the 2 instances can be executed on a single node. But 
now, for redundancy purposes, we would like to use 2 nodes, with a license 
allowing 1 instance of SOFT1 to run on each node.
So our node / partition definition would become:

NodeName=calc[1:11] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN
PartitionName=PDEF Nodes=calc[2:11] Default=YES MaxTime=INFINITE State=UP
PartitionName=PSOFT1 Nodes=calc[1:2] Default=NO MaxTime=INFINITE State=UP
PartitionName=PSOFT2 Nodes=calc1 Default=NO MaxTime=INFINITE State=UP

Is there a way to configure Slurm and/or pass parameters to sbatch so that 
Slurm will not try to run 2 SOFT1 jobs on the same node (which would cause a 
license error)?

We currently use Slurm release 2.4.5. We could possibly upgrade if required.

Thanks for any help.

Paule.

Maybe try using GRES (generic resources).
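
As a rough, untested sketch of what that could look like: define a count-only
generic resource, one unit per licensed node, and have every SOFT1 job request
one unit. The resource name "soft1" and the script name job.sh are made up for
illustration. Slurm host ranges are written with a hyphen, so the node lines
below use calc[1-2] and calc[3-11].

```
# slurm.conf (sketch; "soft1" is a hypothetical GRES name)
GresTypes=soft1
NodeName=calc[1-2] Gres=soft1:1 Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN
NodeName=calc[3-11] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 State=UNKNOWN

# gres.conf on calc1 and calc2: one unit of "soft1" per node
Name=soft1 Count=1

# Submission: each SOFT1 job requests the single unit available on a node
# (job.sh is a placeholder script name)
sbatch -pPSOFT1 --gres=soft1:1 job.sh
```

Since each node would offer only one unit of soft1 and each SOFT1 job asks for
one, two such jobs should not be placed on the same node, while the remaining
cores stay available to other jobs. Whether this behaves as intended on 2.4.5
would need to be checked.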

Regards
DB
