We use the foss-2016b toolchain and need OpenMPI to be built with
Slurm resource manager support. It seems that the foss-2016b build
doesn't include full Slurm support; ompi_info lists Slurm components
but nothing for PMI:
    # ml OpenMPI/1.10.3-GCC-5.4.0-2.26
    # ompi_info | egrep -i 'slurm|pmi'
    MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
    MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
    MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
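
The egrep above mixes the two keywords, so a small sketch like the following could report them separately. The function name check_support is hypothetical, and it simply reuses the same case-insensitive keyword match on whatever ompi_info prints:

```shell
# Hypothetical helper: read ompi_info output on stdin and report
# separately whether any line mentions Slurm or PMI (case-insensitive).
check_support() {
    cs_out=$(cat)
    cs_slurm=no
    cs_pmi=no
    if printf '%s\n' "$cs_out" | grep -qi 'slurm'; then cs_slurm=yes; fi
    if printf '%s\n' "$cs_out" | grep -qi 'pmi'; then cs_pmi=yes; fi
    echo "slurm=$cs_slurm pmi=$cs_pmi"
}

# Example usage, after loading the module:
#   ompi_info | check_support
```

For the three lines of output shown above, this would print "slurm=yes pmi=no", since Slurm components are listed for ess, plm and ras but no line mentions PMI.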
Our multi-node MPI jobs fail miserably, and I surmise that this is due
to the missing Slurm support.
Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2)
--with-slurm. References:
1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm
I made a copy of the easyconfig file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb,
appended the line

    configopts += '--with-slurm --with-pmi '

and rebuilt the module with eb --force. Unfortunately, the resulting
module does *not* seem to include my updated configopts (judging from the
file
$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).
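
To check whether the options are present in a given easyconfig, rather than eyeballing the file, a sketch along these lines might help. The helper name report_configopts is hypothetical, as is the ./ path to the edited copy in the usage comment; the match is a plain fixed-string grep, so it says nothing about whether the option was actually passed to configure:

```shell
# Hypothetical helper: for each configure option given after the file
# name, print whether the easyconfig contains it (fixed-string match).
report_configopts() {
    ec_file=$1
    shift
    for opt in "$@"; do
        # -e keeps grep from treating the leading dashes as its own options
        if grep -qF -e "$opt" "$ec_file"; then
            echo "$opt: present"
        else
            echo "$opt: missing"
        fi
    done
}

# Example usage (the ./ location of the edited copy is an assumption):
#   report_configopts ./OpenMPI-1.10.3-GCC-5.4.0-2.26.eb --with-slurm --with-pmi
#   report_configopts "$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb" --with-slurm --with-pmi
```

If the edited copy reports "present" while the archived file reports "missing", that would suggest the build used a different easyconfig than the edited copy.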
Question: How do I rebuild the OpenMPI module with proper Slurm support?
Question: Can Slurm support please be included in future versions of the
OpenMPI module in the foss-201x toolchain?
Thanks a lot,
Ole