RE: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
Hi Kenneth,

Full path to installation of PBS. PBS is not installed with EB, but loaded as external module.

Specifying the following wasn't sufficient:

configopts += '--with-tm '

But using the following does work:

configopts += '--with-tm=/full/path/to/install/of/pbs '

Regards,
Erik

> -----Original Message-----
> From: easybuild-requ...@lists.ugent.be On Behalf Of Kenneth Hoste
> Sent: Friday, October 28, 2016 8:55 AM
> To: easybuild@lists.ugent.be
> Subject: Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
>
> Hi Erik,
>
> On 28/10/16 08:50, Erik Smeets wrote:
> > Hi Ole,
> >
> > We've had a similar issue, but then for PBS. I created a new eb file and added with-tm specifying full-path:
> >
> > configopts += '--with-tm=/full/path/to/pbs '
> >
> > I then installed openmpi with --force --rebuild.
> >
> > For some reason when not specifying the full path it doesn't work for us at the moment. I still need to look at this, as it is inconvenient having to update this file when we upgrade PBS. At least for now it does the trick.
>
> Can you clarify this? Full path to what? The easyconfig file?
> And if so, on the command line, or in your EasyBuild configuration (robot-paths)?
>
> regards,
>
> Kenneth
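Erik's remark that the hard-coded path is inconvenient when PBS is upgraded suggests keeping the path in one place and checking it before a rebuild. The following is only a minimal sketch of that idea, not something from the thread: the helper name `tm_configopt` is hypothetical, and the check for `include/tm.h` assumes the PBS install ships that header under its prefix.

```python
import os


def tm_configopt(pbs_prefix):
    """Return the '--with-tm=<prefix> ' configure option for a PBS install.

    Assumption: a usable PBS/Torque install provides <prefix>/include/tm.h.
    Raises ValueError when that header is missing, so a stale path is
    noticed before a long OpenMPI rebuild starts.
    """
    header = os.path.join(pbs_prefix, 'include', 'tm.h')
    if not os.path.isfile(header):
        raise ValueError('no tm.h found under %s' % os.path.dirname(header))
    # Keep the trailing space: further options are appended with +=.
    return '--with-tm=%s ' % pbs_prefix
```

The returned string has the same form as the line Erik reports as working, so it could be pasted into the easyconfig after the path has been validated.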
Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
Hi Erik,

Thanks for your input. I already discovered that I had to add to OpenMPI-1.10.3-GCC-5.4.0-2.26.eb this line:

configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr '  # Support of Slurm

Question: What's the difference when you add --rebuild to --force? I was assuming that --force would rebuild the module completely, but perhaps I'm mistaken?

Thanks,
Ole

On 10/28/2016 08:50 AM, Erik Smeets wrote:
> Hi Ole,
>
> We've had a similar issue, but then for PBS. I created a new eb file and added with-tm specifying full-path:
>
> configopts += '--with-tm=/full/path/to/pbs '
>
> I then installed openmpi with --force --rebuild.
>
> For some reason when not specifying the full path it doesn't work for us at the moment. I still need to look at this, as it is inconvenient having to update this file when we upgrade PBS. At least for now it does the trick.
>
> Regards,
> Erik
Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
On 10/26/2016 06:43 PM, Kenneth Hoste wrote:
>> Question: How do I rebuild the OpenMPI module with proper Slurm support?
>
> Rebuilding with --force should work, so for some reason your customized easyconfig was not picked up...
> How did you provide it to EasyBuild exactly? Was it available in the local directory where you ran the 'eb' command?
>
> You can verify that the right easyconfig is picked up via a dry run like "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the path to the easyconfig files used.

The dry-run actually looks fine, and I can do the rebuild with --force correctly now (don't know why I had the problem in the first place).

>> Question: Can Slurm support please be included in future versions of the OpenMPI module in the foss-201x tool chain?
>
> This is left as a site-specific customization, since hard-coding --with-slurm would make the installation fail on any systems that do not have SLURM.
>
> We should have documentation on how to deal with site-specific customisations well though.
> Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing up some documentation for this?
> The existing documentation has some examples hinting towards a possible setup:
> http://easybuild.readthedocs.io/en/latest/Configuration.html#example

I have tested successfully adding this line to OpenMPI-1.10.3-GCC-5.4.0-2.26.eb:

configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr '  # Support of Slurm

Our cluster uses Slurm 16.05.5 on CentOS 7.2, and the Slurm RPMs slurm and slurm-devel install files in the above mentioned directories.

As for documentation of Slurm with OpenMPI, I have some notes in our Wiki:
https://wiki.fysik.dtu.dk/niflheim/SLURM#mpi-setup

/Ole
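Since the two Slurm paths are site-specific, it may help to see how the appended option string is put together. This is a hedged sketch only: `slurm_configopts` is a hypothetical helper, and the `'--enable-shared '` base value is a stand-in for whatever the easyconfig already sets, not the real content of OpenMPI-1.10.3-GCC-5.4.0-2.26.eb.

```python
def slurm_configopts(pmi_include_dir, pmi_lib_prefix):
    """Assemble the Slurm/PMI configure options from site-specific paths."""
    opts = [
        '--with-slurm',
        '--with-pmi=%s' % pmi_include_dir,
        '--with-pmi-libdir=%s' % pmi_lib_prefix,
    ]
    # The trailing space matters: configopts is one string that is extended
    # with +=, so each fragment must end with a separator.
    return ' '.join(opts) + ' '


# Hypothetical stand-in for the options already present in the easyconfig.
configopts = '--enable-shared '
# Paths as reported above for the slurm and slurm-devel RPMs on CentOS 7.2.
configopts += slurm_configopts('/usr/include/slurm', '/usr')
print(configopts)
```

On a site where Slurm lives elsewhere, only the two arguments would change.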
Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
Hi Ole,

On 26/10/16 16:03, Ole Holm Nielsen wrote:
> I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb and appending a line:
>
> configopts += '--with-slurm --with-pmi '
>
> and rebuilding the module with eb --force. Unfortunately, the resulting module seems *not* to include my updated configopts (looking at the file $EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).
>
> Question: How do I rebuild the OpenMPI module with proper Slurm support?

Rebuilding with --force should work, so for some reason your customized easyconfig was not picked up...
How did you provide it to EasyBuild exactly? Was it available in the local directory where you ran the 'eb' command?

You can verify that the right easyconfig is picked up via a dry run like "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the path to the easyconfig files used.

> Question: Can Slurm support please be included in future versions of the OpenMPI module in the foss-201x tool chain?

This is left as a site-specific customization, since hard-coding --with-slurm would make the installation fail on any systems that do not have SLURM.

We should have documentation on how to deal with site-specific customisations well though.
Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing up some documentation for this?
The existing documentation has some examples hinting towards a possible setup:
http://easybuild.readthedocs.io/en/latest/Configuration.html#example

regards,

Kenneth
[easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?
We use the foss-2016b toolchain, and we need OpenMPI to be built with Slurm resource manager support. It seems that the foss-2016b build doesn't include Slurm:

# ml OpenMPI/1.10.3-GCC-5.4.0-2.26
# ompi_info | egrep -i 'slurm|pmi'
         MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
         MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
         MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)

Our multi-node MPI jobs fail miserably, and I surmise that this is due to the lacking Slurm support.

Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2) --with-slurm. References:

1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm

I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb and appending a line:

configopts += '--with-slurm --with-pmi '

and rebuilding the module with eb --force. Unfortunately, the resulting module seems *not* to include my updated configopts (looking at the file $EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).

Question: How do I rebuild the OpenMPI module with proper Slurm support?

Question: Can Slurm support please be included in future versions of the OpenMPI module in the foss-201x tool chain?

Thanks a lot,
Ole
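The ompi_info check above can also be scripted, which may be handy for comparing the module before and after a rebuild. The sketch below is illustrative only: `mca_components` is a hypothetical helper, the sample text is the output quoted in this message, and the assumption that PMI support would show up as a component named `pmi` is mine rather than something confirmed in this thread.

```python
import re

# Matches lines such as
# "MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)"
MCA_LINE = re.compile(r'^\s*MCA\s+(\w+):\s+(\w+)\s+\(')


def mca_components(ompi_info_text):
    """Map each MCA component name to the sorted frameworks that provide it."""
    found = {}
    for line in ompi_info_text.splitlines():
        match = MCA_LINE.match(line)
        if match:
            framework, component = match.groups()
            found.setdefault(component, []).append(framework)
    return {name: sorted(frameworks) for name, frameworks in found.items()}


# Output quoted in the message above (original foss-2016b build).
SAMPLE = """\
         MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
         MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
         MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
"""

components = mca_components(SAMPLE)
# Assumption: a build with PMI support would list a component named "pmi".
print('slurm' in components, 'pmi' in components)
```

For the sample above this reports that slurm components are present while no pmi component is listed. To check a live module, the text could come from running ompi_info with the module loaded instead of from `SAMPLE`.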