Åke Sandgren <[email protected]> writes:

> Since Slurm is built with the system compiler you should also build its
> PMIx/UCX builds with the system compiler (and the corresponding libevent
> in case you don't use the system version), and keep them separate from
> the EB installation.

Sorry, I'm confused again.  Do you mean that the PMIx versions should be
built with the system compiler and *without* using EB?

But aren't I still going to need to install PMIx via EB to allow adding
PMIx as a dependency for OpenMPI via a hook?

Cheers,

Loris

> (Also check the compatibility matrix for PMIx.)
>
> Build Slurm with support for all 3 PMIx versions, 1.2.5, 2.2.x,
> 3.something. That way you can talk to it regardless of which
> OpenMPI/PMIx combo you have.
>
> We have:
> srun --mpi=list
> srun: MPI types are...
> srun: pmix_v3
> srun: none
> srun: openmpi
> srun: pmi2
> srun: pmix
> srun: pmix_v1
> srun: pmix_v2
>
> Also see our hooks for setting Slurm env vars for specifying MPI-type.
>
> On 2/18/20 10:39 AM, Loris Bennett wrote:
>> Loris Bennett <[email protected]> writes:
>> 
>>> Dear Andreas,
>>>
>>> I am not aware that we have ever built OpenMPI against a particular
>>> version of Slurm before ("before" meaning the last 10 years or so).
>>> My very poor understanding is that both OpenMPI and Slurm support some
>>> variants of PMI.  As long as there is some overlap between the versions
>>> supported, Slurm and OpenMPI should be able to work together.
>>>
>>> What we have done recently is to specify the PMI variants when building
>>> Slurm and currently have this:
>>>
>>>   $ srun --mpi=list
>>>   srun: MPI types are...
>>>   srun: pmix_v1
>>>   srun: none
>>>   srun: pmix
>>>   srun: openmpi
>>>   srun: pmi2
>>>
>>> As indicated here
>>>
>>>   https://pmix.org/support/faq/which-environments-include-support-for-pmix/
>>>
>>> recent versions of Slurm can support up to pmix_v3.  Also on that page
>>> it states the following
>>>
>>>   Open MPI v3.x: PMIx v2.x
>>>
>>> The OpenMPI dependency for my TensorFlow module is OpenMPI/3.1.3, so
>>> perhaps it's just that we need to build Slurm with support for PMIx v2.
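
Whether a given Slurm build already exposes a particular PMIx plugin can be
checked against the `srun --mpi=list` output quoted above.  The following is
only a minimal sketch: the helper name `has_mpi_type` is hypothetical, and it
assumes the listing has the "srun: <type>" shape shown in this thread.

```shell
# Hypothetical helper (not part of Slurm): reads `srun --mpi=list` output on
# stdin and succeeds only if the given MPI type appears as a whole entry.
has_mpi_type() {
    # $1: MPI plugin name as printed by srun, e.g. pmix_v2
    # The optional "srun: " prefix matches the output format quoted above;
    # anchoring both ends keeps "pmix" from matching "pmix_v1".
    grep -Eq "^(srun: )?[[:space:]]*$1[[:space:]]*\$"
}

# Usage sketch (needs a working srun; 2>&1 captures the listing whether it is
# written to stdout or stderr):
#   srun --mpi=list 2>&1 | has_mpi_type pmix_v2 || echo "pmix_v2 not available"
```

With the list shown above, such a check would report pmix_v1 as present and
pmix_v2 as missing.
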
>> 
>> With quite a lot of help from Åke I think I have worked out some of what
>> I need to do, namely:
>> 
>> 1. Build each version of OpenMPI against the correct known working
>>    version of PMIx.
>> 
>> 2. Build Slurm with PMIx support for multiple PMIx versions using
>>    something like
>>    
>>    rpmbuild \
>>      --define "_with_pmix --with-pmix=/sw/PMIx/1.2.5-GCCcore-6.4.0:/sw/PMIx/2.2.3-GCCcore-7.3.0:/sw/PMIx/3.1.1-GCCcore-8.2.0:" \
>>      -ta slurm-19.05.5.tar.bz2
>> 
>> What I am still not sure about is how to deal with toolchains which have
>> the same major version of OpenMPI, but obviously different versions of the
>> compiler.  Thus if I choose 2.2.3 as the version of PMIx for OpenMPI 3.x,
>> then I have the following dependencies:
>> 
>>   foss-2018b <- OpenMPI-3.1.1-GCC-7.3.0-2.30 <- PMIx/2.2.3-GCCcore-7.3.0
>>   foss-2019a <- OpenMPI-3.1.3-GCC-8.2.0-2.31.1 <- PMIx/2.2.3-GCCcore-8.2.0
>> 
>> However, it is not clear to me what Slurm will do, since all the PMIx
>> versions provide
>> 
>>   libpmi.so
>>   libpmi2.so
>>   libpmix.so
>> 
>> It seems that the --with-pmix option is more about choosing between the
>> libpmi.so and libpmi2.so provided by the slurm-libpmi package and those
>> provided by PMIx.  However, the different versions of OpenMPI seem to
>> require different versions of PMIx.
>> 
>> So I guess my question is:
>> 
>>   What should the argument for the --with-pmix option be? 
>> 
>> (Seems like that's more of a Slurm question than an EB one -
>> sorry about that)
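
On the Slurm side, the plugin can also be chosen per job rather than per
build: srun reads the SLURM_MPI_TYPE environment variable as the default for
--mpi.  The sketch below is an illustration only.  The function name
`mpi_type_for_openmpi` is hypothetical, only the "Open MPI v3.x: PMIx v2.x"
pairing comes from the FAQ quoted above, and the fallback to the plain `pmix`
plugin is an assumption.

```shell
# Hypothetical helper: map an Open MPI version string to a Slurm MPI plugin.
mpi_type_for_openmpi() {
    # $1: Open MPI version, e.g. 3.1.3
    case "$1" in
        3.*) echo pmix_v2 ;;  # Open MPI v3.x: PMIx v2.x (from the quoted FAQ)
        *)   echo pmix ;;     # assumption: fall back to Slurm's default pmix plugin
    esac
}

# srun treats SLURM_MPI_TYPE as the default for --mpi, so a module hook
# could export it when the matching OpenMPI module is loaded.
SLURM_MPI_TYPE="$(mpi_type_for_openmpi 3.1.3)"
export SLURM_MPI_TYPE
```

Mappings for other OpenMPI major versions would need checking against the
PMIx compatibility matrix before being added to the case statement.
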
-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
