-----Original Message-----
From: easybuild-requ...@lists.ugent.be [mailto:easybuild-
requ...@lists.ugent.be] On Behalf Of Kenneth Hoste
Sent: Wednesday, October 26, 2016 6:43 PM
To: easybuild@lists.ugent.be
Subject: Re: [easybuild] How to rebuild the foss-2016b toolchain with
OpenMPI including Slurm support?
Hi Ole,
On 26/10/16 16:03, Ole Holm Nielsen wrote:
We use the foss-2016b toolchain, and we need OpenMPI to be built with
Slurm resource manager support. It seems that the foss-2016b build
doesn't include Slurm:
# ml OpenMPI/1.10.3-GCC-5.4.0-2.26
# ompi_info | egrep -i 'slurm|pmi'
MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
Our multi-node MPI jobs fail miserably, and I surmise that this is due
to the missing Slurm support.
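The ompi_info check quoted above could be scripted roughly as follows. This is only a sketch: the function name mca_frameworks_by_component is made up for illustration, and it assumes that each component is reported on a line of the form "MCA <framework>: <component> (...)", as in the output shown above.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the quoted output: "MCA <framework>: <component> (...)"
MCA_LINE = re.compile(r"^\s*MCA\s+(\S+):\s+(\S+)")


def mca_frameworks_by_component(ompi_info_text):
    """Map each MCA component name to the frameworks that list it.

    Hypothetical helper for illustration; not part of Open MPI or EasyBuild.
    """
    found = defaultdict(list)
    for line in ompi_info_text.splitlines():
        match = MCA_LINE.match(line)
        if match:
            framework, component = match.groups()
            found[component].append(framework)
    return dict(found)


# The three lines reported in this thread.
sample = """\
MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
"""
components = mca_frameworks_by_component(sample)
print(components.get("slurm", []))  # prints ['ess', 'plm', 'ras']
print(components.get("pmi", []))    # prints []
```

Run on the output quoted above, it lists slurm components for the ess, plm and ras frameworks and no pmi component.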
Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2)
--with-slurm. References:
1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm
I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb
and appending a line:
configopts += '--with-slurm --with-pmi '
and rebuilding the module with eb --force. Unfortunately, the
resulting module seems *not* to include my updated configopts (looking
at the file
$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).
Question: How do I rebuild the OpenMPI module with proper Slurm
support?
Rebuilding with --force should work, so for some reason your customized
easyconfig was not picked up...
How did you provide it to EasyBuild exactly? Was it available in the
local directory where you ran the 'eb' command?
You can verify that the right easyconfig is picked up via a dry run
like "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the
path to the easyconfig files used.
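Once you know which easyconfig file is being used, a quick text scan can confirm that it really contains the extra configure options. The sketch below is not part of EasyBuild: the helper name missing_configopts is made up, and it only looks at single lines that assign or append to configopts rather than parsing the file the way EasyBuild does, so a value continued over several lines would be missed.

```python
import re


def missing_configopts(easyconfig_text, wanted_flags):
    """Return the flags from wanted_flags that appear on no configopts line.

    Hypothetical helper: a plain text scan, not EasyBuild's own parser.
    """
    # Keep only the lines that assign ("=") or append ("+=") to configopts.
    lines = [
        line for line in easyconfig_text.splitlines()
        if re.match(r"\s*configopts\s*\+?=", line)
    ]
    joined = " ".join(lines)
    # Split on whitespace and quotes, then drop any "=value" suffix so that
    # "--with-pmi=/some/path" still counts as "--with-pmi".
    tokens = {tok.split("=", 1)[0] for tok in re.split(r"[\s'\"]+", joined)}
    return [flag for flag in wanted_flags if flag not in tokens]


# Made-up easyconfig fragment; the second line is the one appended in this thread.
example = """
configopts = '--enable-shared '
configopts += '--with-slurm --with-pmi '
"""
print(missing_configopts(example, ["--with-slurm", "--with-pmi"]))  # prints []
```

In practice the text would be read from the easyconfig path printed by the dry run; a non-empty result means that file does not carry the flags you added.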
Question: Can Slurm support please be included in future versions of
the OpenMPI module in the foss-201x tool chain?
This is left as a site-specific customization, since hardcoding
--with-slurm would make the installation fail on any system that does
not have SLURM.
We should have good documentation on how to deal with site-specific
customisations, though.
Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing some
documentation for this?
The existing documentation has some examples hinting towards a possible
setup:
http://easybuild.readthedocs.io/en/latest/Configuration.html#example
regards,
Kenneth