Hi,

On 03.02.2018 at 00:47, Joshua Baker-LePain wrote:

> I've been doing some searching on this, but my Google skills are failing me.  
> We're running SoGE 8.1.9 on a heterogeneous cluster (i.e. nodes with core 
> counts ranging from 8 to 28).  Several users are looking to run hybrid 
> MPI/OpenMP jobs (i.e. multiple threads per MPI process), and their early 
> attempts are confusing our current setup.  How are folks enabling these types 
> of jobs in a way that SGE can keep track of?
> 
> If it matters, our nodes are running CentOS-7 and the default MPI is the 
> bundled OpenMPI-1.10.6.  Thanks!

For such a setup it is necessary to modify the generated machine file, so that 
Open MPI or MPICH does not start too many processes on its own.

As long as the job stays on one node, request the overall slot count for the 
job (i.e. MPI processes times OpenMP threads per process). In the job script, 
divide the granted slot count by the OpenMP thread count, write the result to a 
copy of the file that $PE_HOSTFILE points to, and reset this variable to point 
to the generated copy. Open MPI or MPICH will then use the copy and start only 
that many processes, and the leftover slots can be used for threads, all within 
the granted overall slot count.
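
A minimal sketch of this rewrite in Python, under some assumptions: each line 
of the file has the usual "host slots queue processor-range" layout, and the 
function names (scale_hostfile, write_scaled_copy) are made up for 
illustration. Note that changing os.environ only affects processes started 
from this Python process (e.g. mpirun launched via subprocess); in a plain 
shell job script the same steps would be done with awk and export.

```python
import os
import tempfile


def scale_hostfile(lines, threads_per_rank):
    """Divide the slot count (2nd field) of each hostfile line by the thread count.

    Assumes lines of the form: "host slots queue processor-range".
    """
    scaled = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        ranks = int(fields[1]) // threads_per_rank
        if ranks < 1:
            raise ValueError(
                f"{fields[0]}: {fields[1]} slots cannot hold one "
                f"{threads_per_rank}-thread process"
            )
        fields[1] = str(ranks)
        scaled.append(" ".join(fields))
    return scaled


def write_scaled_copy(threads_per_rank):
    """Write a scaled copy of $PE_HOSTFILE and repoint the variable to it."""
    with open(os.environ["PE_HOSTFILE"]) as src:
        scaled = scale_hostfile(src.read().splitlines(), threads_per_rank)
    fd, path = tempfile.mkstemp(prefix="pe_hostfile.", text=True)
    with os.fdopen(fd, "w") as out:
        out.write("\n".join(scaled) + "\n")
    # Only child processes started from here will see the new value.
    os.environ["PE_HOSTFILE"] = path
    return path
```

With 8 granted slots and OMP_NUM_THREADS=4, a line "node01 8 all.q@node01 
UNDEFINED" would become "node01 2 all.q@node01 UNDEFINED", so the MPI library 
starts 2 processes and each can run 4 threads.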

If you want to compute across several exechosts, it is best to have a fixed 
allocation rule for the PE. This makes it easy to proceed as outlined above 
for each exechost listed in the $PE_HOSTFILE. An uneven distribution of slots 
would lead to an unintended, non-working placement of processes and threads.
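
To illustrate why an uneven distribution is a problem, here is a small sketch 
that lists the hosts whose granted slot count is not a multiple of the thread 
count. It again assumes the "host slots queue processor-range" line layout; 
the function name uneven_hosts is hypothetical. With an integer 
allocation_rule in the PE that is a multiple of the thread count, this list 
should come back empty.

```python
def uneven_hosts(lines, threads_per_rank):
    """Return (host, slots) pairs whose slot count is not a multiple of the thread count.

    Assumes lines of the form: "host slots queue processor-range".
    """
    bad = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        if slots % threads_per_rank != 0:
            bad.append((host, slots))
    return bad


# Example: a fill-up style grant of 8 + 3 slots with 4 threads per process.
granted = [
    "node01 8 all.q@node01 UNDEFINED",
    "node02 3 all.q@node02 UNDEFINED",
]
problem_hosts = uneven_hosts(granted, 4)
```

Here node02 would be reported, since 3 slots cannot hold a complete 4-thread 
process; a job script could abort early in that case instead of starting a 
skewed run.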

-- Reuti
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
