Since Slurm is built with the system compiler, you should also build the
PMIx and UCX it uses with the system compiler (along with the
corresponding libevent, if you don't use the system version), and keep
them separate from the EB installation.

(Also check the compatibility matrix for PMIx.)

Build Slurm with support for all three PMIx versions: 1.2.5, 2.2.x, and
3.x. That way Slurm can launch your jobs regardless of which
OpenMPI/PMIx combo they were built with.
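
A rough sketch of that layout, under stated assumptions: the /opt/slurm-pmix
root is a made-up placeholder (not our actual path), and the 2.2.3 and 3.1.1
version numbers are taken from the quoted message below. Each PMIx gets its
own prefix outside the EB tree, and Slurm's configure is handed a
colon-separated list of those prefixes. The script only prints the commands;
it does not run them.

```shell
#!/bin/sh
# Hypothetical install root, kept outside the EasyBuild tree.
PMIX_ROOT=/opt/slurm-pmix

# Join per-version prefixes into the colon-separated list that
# Slurm's configure takes for --with-pmix.
pmix_prefix_list() {
    list=""
    for v in "$@"; do
        list="${list:+$list:}$PMIX_ROOT/$v"
    done
    printf '%s\n' "$list"
}

# Print (do not execute) one build command per PMIx version.
# Run these with the system compiler, not an EB toolchain.
for v in 1.2.5 2.2.3 3.1.1; do
    echo "(cd pmix-$v && ./configure --prefix=$PMIX_ROOT/$v && make install)"
done

# Print the matching Slurm configure invocation.
echo "./configure --with-pmix=$(pmix_prefix_list 1.2.5 2.2.3 3.1.1)"
```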

We have:
$ srun --mpi=list
srun: MPI types are...
srun: pmix_v3
srun: none
srun: openmpi
srun: pmi2
srun: pmix
srun: pmix_v1
srun: pmix_v2
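
To check from a script whether a given plugin made it into the build,
something like the following could work. It assumes the "srun: <type>" line
format shown above (other Slurm releases may print the list differently),
and has_mpi_type is a made-up helper name.

```shell
#!/bin/sh
# Succeeds if the plugin named in $1 appears as a whole line
# ("srun: <type>") in the `srun --mpi=list` output read from stdin.
has_mpi_type() {
    grep -qxF "srun: $1"
}

# Demo on a captured sample of the listing instead of a live srun call.
sample='srun: MPI types are...
srun: pmix_v3
srun: none
srun: pmi2
srun: pmix_v2'

if printf '%s\n' "$sample" | has_mpi_type pmix_v2; then
    echo "pmix_v2 is available"
fi
```

On a real system the input would come from srun itself, e.g.
"srun --mpi=list 2>&1 | has_mpi_type pmix_v2" (redirecting stderr in case
the list is printed there).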

Also see our hooks, which set the Slurm environment variables that select
the MPI type.
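
This is not the actual hook code, just a minimal sketch of the idea: map the
OpenMPI version to a plugin name and export it as SLURM_MPI_TYPE, which srun
reads as the default for --mpi. Only the 3.x -> pmix_v2 mapping comes from
the PMIx FAQ quoted below; the other branches and the pick_mpi_type name are
assumptions for illustration.

```shell
#!/bin/sh
# Map an OpenMPI version string to a Slurm MPI plugin name.
pick_mpi_type() {
    case "$1" in
        2.*) echo pmix_v1 ;;  # assumption: OpenMPI 2.x built against PMIx 1.x
        3.*) echo pmix_v2 ;;  # per the PMIx FAQ: Open MPI v3.x uses PMIx v2.x
        4.*) echo pmix_v3 ;;  # assumption: OpenMPI 4.x built against PMIx 3.x
        *)   echo pmi2 ;;     # assumption: fall back to PMI2 for anything else
    esac
}

# EBVERSIONOPENMPI is assumed to be set by the loaded OpenMPI module;
# default to 3.1.3 here so the sketch runs standalone.
SLURM_MPI_TYPE=$(pick_mpi_type "${EBVERSIONOPENMPI:-3.1.3}")
export SLURM_MPI_TYPE
echo "SLURM_MPI_TYPE=$SLURM_MPI_TYPE"
```

With the variable exported, a plain "srun ./app" behaves as if --mpi had been
given with that value on the command line.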

On 2/18/20 10:39 AM, Loris Bennett wrote:
> Loris Bennett <[email protected]> writes:
> 
>> Dear Andreas,
>>
>> I am not aware that we have ever built OpenMPI against a particular
>> version of Slurm before ("before" meaning the last 10 years or so).
>> My very poor understanding is that both OpenMPI and Slurm support some
>> variants of PMI.  As long as there is some overlap between the versions
>> supported, Slurm and OpenMPI should be able to work together.
>>
>> What we have done recently is to specify the PMI variants when building
>> Slurm and currently have this:
>>
>>   $ srun --mpi=list
>>   srun: MPI types are...
>>   srun: pmix_v1
>>   srun: none
>>   srun: pmix
>>   srun: openmpi
>>   srun: pmi2
>>
>> As indicated here
>>
>>   https://pmix.org/support/faq/which-environments-include-support-for-pmix/
>>
>> recent versions of Slurm can support up to pmix_v3.  Also on that page
>> it states the following
>>
>>   Open MPI v3.x: PMIx v2.x
>>
>> The OpenMPI dependency for my TensorFlow module is OpenMPI/3.1.3, so
>> perhaps it's just that we need to build Slurm with support for PMIx v2.
> 
> With quite a lot of help from Åke I think I have worked out some of what
> I need to do, namely:
> 
> 1. Build each version of OpenMPI against the correct, known-working
>    version of PMIx.
> 
> 2. Build Slurm with PMIx support for multiple PMIx versions using
>    something like
>
>    rpmbuild --define "_with_pmix --with-pmix=/sw/PMIx/1.2.5-GCCcore-6.4.0:/sw/PMIx/2.2.3-GCCcore-7.3.0:/sw/PMIx/3.1.1-GCCcore-8.2.0:" -ta slurm-19.05.5.tar.bz2
> 
> What I am still not sure about is how to deal with toolchains which have
> the same major version of OpenMPI, but obviously different versions of the
> compiler.  Thus if I choose 2.2.3 as the version of PMIx for OpenMPI 3.x,
> then I have the following dependencies:
> 
>   foss-2018b <- OpenMPI-3.1.1-GCC-7.3.0-2.30 <- PMIx/2.2.3-GCCcore-7.3.0
>   foss-2019a <- OpenMPI-3.1.3-GCC-8.2.0-2.31.1 <- PMIx/2.2.3-GCCcore-8.2.0
> 
> However, it is not clear to me what Slurm will do, since all the PMIx
> versions provide
> 
>   libpmi.so
>   libpmi2.so
>   libpmix.so
> 
> It seems that the --with-pmix option is more about choosing between the
> libpmi.so and libpmi2.so provided by the slurm-libpmi package and those
> provided by PMIx.  However, the different versions of OpenMPI seem to
> require different versions of PMIx.
> 
> So I guess my question is:
> 
>   What should the argument for the --with-pmix option be? 
> 
> (Seems like that's more of a Slurm question rather than an EB one -
> sorry about that)


-- 
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: [email protected]   Phone: +46 90 7866134 Fax: +46 90-580 14
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
