Hi,
On 03.02.2018 at 00:47, Joshua Baker-LePain wrote:
> I've been doing some searching on this, but my Google skills are failing me.
> We're running SoGE 8.1.9 on a heterogeneous cluster (i.e. nodes with core
> counts ranging from 8 to 28). Several users are looking to run hybrid
> MPI/OpenMP jobs (i.e. multiple threads per MPI process), and their early
> attempts are confusing our current setup. How are folks enabling these types
> of jobs in a way that SGE can keep track of?
>
> If it matters, our nodes are running CentOS-7 and the default MPI is the
> bundled OpenMPI-1.10.6. Thanks!
For such a setup it's necessary to alter the generated machine file, so that
Open MPI or MPICH won't start too many processes on their own.
As long as you stay on one node, the idea is to divide the granted slot count
(meaning: the overall slot count, covering both MPI and OpenMP slots, needs to
be requested for the job) by the OpenMP slot count (i.e. the threads per MPI
process), write the result to a copy of the file pointed to by $PE_HOSTFILE,
and reset this variable to point to the generated copy. Then Open MPI or MPICH
will use this copy, and the leftover slots can be used for threads, all within
the granted overall slot count.
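
A minimal sketch of that rewrite step, in Python. It assumes each line of the
$PE_HOSTFILE looks like "hostname slots queue processor-range" (check this on
your installation); the function names and the "pe_hostfile." prefix are made
up for illustration and are not part of SGE.

```python
import os
import tempfile


def scale_hostfile_lines(lines, threads_per_rank):
    """Divide the slot count on each hostfile line by threads_per_rank."""
    scaled = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        ranks = slots // threads_per_rank
        if ranks < 1:
            raise ValueError(
                f"{host}: {slots} slots cannot hold {threads_per_rank} threads"
            )
        # Keep any trailing columns (assumed: queue, processor range) as-is.
        scaled.append(" ".join([host, str(ranks)] + fields[2:]))
    return scaled


def write_scaled_hostfile(threads_per_rank, src=None):
    """Write a scaled copy of the hostfile and return the path of the copy."""
    src = src or os.environ["PE_HOSTFILE"]
    with open(src) as fh:
        scaled = scale_hostfile_lines(fh.read().splitlines(), threads_per_rank)
    # Put the copy in $TMPDIR if set, else the system default temp directory.
    fd, path = tempfile.mkstemp(prefix="pe_hostfile.", dir=os.environ.get("TMPDIR"))
    with os.fdopen(fd, "w") as out:
        out.write("\n".join(scaled) + "\n")
    return path
```

A Python process cannot change the environment of the job script that called
it, so the job script itself would have to set PE_HOSTFILE to the returned
path (and OMP_NUM_THREADS to the thread count) before starting mpirun.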
In case you want to compute across several exechosts, it's best to have a fixed
allocation rule for the PE. This way it's easy to proceed as outlined above
for each exechost listed in the $PE_HOSTFILE. An uneven distribution would
lead to an unintended, non-working distribution of slots.
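
As a rough illustration of why the even distribution matters, the following
Python sketch checks a granted allocation before any rewriting is done. It
assumes the same "hostname slots ..." line layout as the $PE_HOSTFILE and that
a host may show up on more than one line; check_even_allocation is a
hypothetical helper name, not something SGE provides.

```python
def check_even_allocation(lines, threads_per_rank):
    """Return MPI ranks per host, or raise if the allocation is uneven."""
    per_host = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            # Sum slots in case the same host appears on several lines.
            per_host[fields[0]] = per_host.get(fields[0], 0) + int(fields[1])
    distinct = set(per_host.values())
    if len(distinct) != 1:
        raise ValueError(f"uneven or empty slot distribution: {per_host}")
    slots = distinct.pop()
    if slots % threads_per_rank:
        raise ValueError(
            f"{slots} slots per host is not a multiple of {threads_per_rank}"
        )
    return slots // threads_per_rank
```

With a fixed allocation rule (a plain integer as allocation_rule in the PE
definition) every host gets the same slot count, so a check like this should
always pass as long as that count is a multiple of the thread count.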
-- Reuti