RE: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?

2016-10-28 Thread Erik Smeets
Hi Kenneth,

I meant the full path to the PBS installation. PBS is not installed with EB, but
loaded as an external module.
Specifying the following wasn't sufficient:
configopts += '--with-tm '
But using the following does work:
configopts += '--with-tm=/full/path/to/install/of/pbs '
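Since easyconfig files use Python syntax, one way to avoid editing the file on every PBS upgrade might be to read the path from the environment. This is only a minimal sketch: the variable name SITE_PBS_ROOT is hypothetical, the fallback is the placeholder path from above, and it assumes your EasyBuild version accepts an import statement in an easyconfig.

```python
import os

# Stand-in so the sketch runs on its own; in a real easyconfig,
# configopts is already defined further up in the file.
configopts = ''

# SITE_PBS_ROOT is a hypothetical, site-defined environment variable.
# The fallback is the placeholder path used in this thread.
pbs_root = os.environ.get('SITE_PBS_ROOT', '/full/path/to/install/of/pbs')

# Pass the explicit path, since a bare '--with-tm' was not sufficient here.
configopts += '--with-tm=%s ' % pbs_root

print(configopts)
```

The trailing space inside the string matters, because further options may be appended to configopts later.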

Regards,
Erik



> -----Original Message-----
> From: easybuild-requ...@lists.ugent.be
> [mailto:easybuild-requ...@lists.ugent.be] On Behalf Of Kenneth Hoste
> Sent: Friday, October 28, 2016 8:55 AM
> To: easybuild@lists.ugent.be
> Subject: Re: [easybuild] How to rebuild the foss-2016b toolchain with
> OpenMPI including Slurm support?
>
> Hi Erik,
>
> On 28/10/16 08:50, Erik Smeets wrote:
> > Hi Ole,
> >
> > We've had a similar issue, but for PBS. I created a new eb file and added
> > --with-tm, specifying the full path:
> > configopts += '--with-tm=/full/path/to/pbs '
> >
> > I then installed openmpi with --force --rebuild.
> >
> > For some reason it doesn't work for us at the moment when the full path is
> > not specified. I still need to look into this, as it is inconvenient having
> > to update this file whenever we upgrade PBS. At least for now it does the
> > trick.
>
> Can you clarify this? Full path to what? The easyconfig file?
> And if so, on the command line, or in your EasyBuild configuration
> (robot-paths)?
>
>
> regards,
>
> Kenneth
>
> >
> > Regards,
> > Erik
> >
> >
> >
> >> -----Original Message-----
> >> From: easybuild-requ...@lists.ugent.be
> >> [mailto:easybuild-requ...@lists.ugent.be] On Behalf Of Kenneth Hoste
> >> Sent: Wednesday, October 26, 2016 6:43 PM
> >> To: easybuild@lists.ugent.be
> >> Subject: Re: [easybuild] How to rebuild the foss-2016b toolchain with
> >> OpenMPI including Slurm support?
> >>
> >> Hi Ole,
> >>
> >> On 26/10/16 16:03, Ole Holm Nielsen wrote:
> >>> We use the foss-2016b toolchain, and we need OpenMPI to be built with
> >>> Slurm resource manager support.  It seems that the foss-2016b build
> >>> doesn't include Slurm:
> >>>
> >>> # ml OpenMPI/1.10.3-GCC-5.4.0-2.26
> >>> # ompi_info | egrep -i 'slurm|pmi'
> >>>   MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
> >>>   MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
> >>>   MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
> >>>
> >>> Our multi-node MPI jobs fail miserably, and I surmise that this is due
> >>> to the lacking Slurm support.
> >>>
> >>> Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2)
> >>> --with-slurm. References:
> >>>
> >>> 1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
> >>> 2) https://www.open-mpi.org/faq/?category=slurm
> >>>
> >>> I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb
> >>> and appending a line:
> >>>
> >>> configopts += '--with-slurm --with-pmi '
> >>>
> >>> and rebuilding the module with eb --force.  Unfortunately, the
> >>> resulting module seems *not* to include my updated configopts (looking
> >>> at the file
> >>> $EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).
> >>> Question: How do I rebuild the OpenMPI module with proper Slurm support?
> >>
> >> Rebuilding with --force should work, so for some reason your customized
> >> easyconfig was not picked up...
> >> How exactly did you provide it to EasyBuild? Was it available in the local
> >> directory where you ran the 'eb' command?
> >>
> >> You can verify that the right easyconfig is picked up via a dry run
> >> like: "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the path
> >> to the easyconfig files used.
> >>
> >>> Question: Can Slurm support please be included in future versions of
> >>> the OpenMPI module in the foss-201x tool chain?
> >> This is left as a site-specific customization, since hard-coding
> >> --with-slurm would make the installation fail on any system that does not
> >> have SLURM.
> >>
> >> We should have good documentation on how to deal with site-specific
> >> customisations, though.
> >> Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing some
> >> documentation for this?
> >> The existing documentation has some examples hinting at a possible setup:
> >> http://easybuild.readthedocs.io/en/latest/Configuration.html#example
> >>
> >>
> >> regards,
> >>
> >> Kenneth

Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?

2016-10-28 Thread Ole Holm Nielsen

Hi Erik,

Thanks for your input.  I already discovered that I had to add to 
OpenMPI-1.10.3-GCC-5.4.0-2.26.eb this line:


configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr '  # Support of Slurm


Question: What's the difference when you add --rebuild to --force?

I was assuming that --force would rebuild the module completely, but 
perhaps I'm mistaken?


Thanks,
Ole

On 10/28/2016 08:50 AM, Erik Smeets wrote:

Hi Ole,

We've had a similar issue, but for PBS. I created a new eb file and added
--with-tm, specifying the full path:
configopts += '--with-tm=/full/path/to/pbs '

I then installed openmpi with --force --rebuild.

For some reason it doesn't work for us at the moment when the full path is not
specified. I still need to look into this, as it is inconvenient having to
update this file whenever we upgrade PBS. At least for now it does the trick.

Regards,
Erik




-----Original Message-----
From: easybuild-requ...@lists.ugent.be
[mailto:easybuild-requ...@lists.ugent.be] On Behalf Of Kenneth Hoste
Sent: Wednesday, October 26, 2016 6:43 PM
To: easybuild@lists.ugent.be
Subject: Re: [easybuild] How to rebuild the foss-2016b toolchain with
OpenMPI including Slurm support?

Hi Ole,

On 26/10/16 16:03, Ole Holm Nielsen wrote:

We use the foss-2016b toolchain, and we need OpenMPI to be built with
Slurm resource manager support.  It seems that the foss-2016b build
doesn't include Slurm:

# ml OpenMPI/1.10.3-GCC-5.4.0-2.26
# ompi_info | egrep -i 'slurm|pmi'
 MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
 MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
 MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)

Our multi-node MPI jobs fail miserably, and I surmise that this is due
to the lacking Slurm support.

Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2)
--with-slurm. References:

1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm

I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb
and appending a line:

configopts += '--with-slurm --with-pmi '

and rebuilding the module with eb --force.  Unfortunately, the
resulting module seems *not* to include my updated configopts (looking
at the file
$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).

Question: How do I rebuild the OpenMPI module with proper Slurm support?

Rebuilding with --force should work, so for some reason your customized
easyconfig was not picked up...
How exactly did you provide it to EasyBuild? Was it available in the local
directory where you ran the 'eb' command?

You can verify that the right easyconfig is picked up via a dry run
like: "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the path to
the easyconfig files used.


Question: Can Slurm support please be included in future versions of
the OpenMPI module in the foss-201x tool chain?

This is left as a site-specific customization, since hard-coding --with-slurm
would make the installation fail on any system that does not have SLURM.

We should have good documentation on how to deal with site-specific
customisations, though.
Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing some
documentation for this?
The existing documentation has some examples hinting at a possible setup:
http://easybuild.readthedocs.io/en/latest/Configuration.html#example


regards,

Kenneth




Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?

2016-10-27 Thread Ole Holm Nielsen

On 10/26/2016 06:43 PM, Kenneth Hoste wrote:

Question: How do I rebuild the OpenMPI module with proper Slurm support?


Rebuilding with --force should work, so for some reason your customized
easyconfig was not picked up...
How exactly did you provide it to EasyBuild? Was it available in the
local directory where you ran the 'eb' command?

You can verify that the right easyconfig is picked up via a dry run
like: "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the
path to the easyconfig files used.


The dry-run actually looks fine, and I can do the rebuild with --force 
correctly now (don't know why I had the problem in the first place).



Question: Can Slurm support please be included in future versions of
the OpenMPI module in the foss-201x tool chain?

This is left as a site-specific customization, since hard-coding
--with-slurm would make the installation fail on any system that does
not have SLURM.

We should have good documentation on how to deal with site-specific
customisations, though.
Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing some
documentation for this?
The existing documentation has some examples hinting at a possible
setup: http://easybuild.readthedocs.io/en/latest/Configuration.html#example


I have tested successfully adding this line to 
OpenMPI-1.10.3-GCC-5.4.0-2.26.eb:


configopts += '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr '  # Support of Slurm


Our cluster uses Slurm 16.05.5 on CentOS 7.2, and the Slurm RPMs slurm 
and slurm-devel install files in the above mentioned directories.
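As a rough sketch, the same line could be assembled from named values so that the two site-specific paths stand out. The variable names pmi_include and pmi_libdir are only illustrative, and the paths are the ones from our CentOS 7.2 setup, so other sites will probably need different values.

```python
# Stand-in so the sketch runs on its own; in the real easyconfig,
# configopts is already defined before this point.
configopts = ''

# Illustrative names; the values are the directories used on our cluster.
pmi_include = '/usr/include/slurm'
pmi_libdir = '/usr'

# Same options as the single line above, with the paths filled in.
configopts += '--with-slurm --with-pmi=%s --with-pmi-libdir=%s ' % (pmi_include, pmi_libdir)

print(configopts)
```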


As for documentation of Slurm with OpenMPI, I have some notes in our 
Wiki: https://wiki.fysik.dtu.dk/niflheim/SLURM#mpi-setup


/Ole


Re: [easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?

2016-10-26 Thread Kenneth Hoste

Hi Ole,

On 26/10/16 16:03, Ole Holm Nielsen wrote:
We use the foss-2016b toolchain, and we need OpenMPI to be built with 
Slurm resource manager support.  It seems that the foss-2016b build 
doesn't include Slurm:


# ml OpenMPI/1.10.3-GCC-5.4.0-2.26
# ompi_info | egrep -i 'slurm|pmi'
 MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
 MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
 MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)


Our multi-node MPI jobs fail miserably, and I surmise that this is due 
to the lacking Slurm support.


Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2) 
--with-slurm. References:


1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm

I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb 
and appending a line:


configopts += '--with-slurm --with-pmi '

and rebuilding the module with eb --force.  Unfortunately, the 
resulting module seems *not* to include my updated configopts (looking 
at the file 
$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).


Question: How do I rebuild the OpenMPI module with proper Slurm support?


Rebuilding with --force should work, so for some reason your customized
easyconfig was not picked up...
How exactly did you provide it to EasyBuild? Was it available in the
local directory where you ran the 'eb' command?


You can verify that the right easyconfig is picked up via a dry run 
like: "eb OpenMPI-1.10.3-GCC-5.4.0-2.26.eb -Df", which will print the 
path to the easyconfig files used.
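If you want to script that check, a minimal sketch could look like the following. The helper names are made up for illustration; the only EasyBuild-specific part is the command itself, written here with the long forms of -D (--dry-run) and -f (--force).

```python
import shutil
import subprocess


def dry_run_command(easyconfig):
    """Build the argument list for the dry run quoted above."""
    return ['eb', easyconfig, '--dry-run', '--force']


def run_dry_run(easyconfig):
    """Run the dry run if 'eb' is on PATH; return its output, or None otherwise."""
    if shutil.which('eb') is None:
        return None
    result = subprocess.run(dry_run_command(easyconfig),
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout


if __name__ == '__main__':
    output = run_dry_run('OpenMPI-1.10.3-GCC-5.4.0-2.26.eb')
    print(output if output is not None else "'eb' not found on PATH")
```

The printed paths should then be compared by eye against the location of the customized easyconfig.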


Question: Can Slurm support please be included in future versions of 
the OpenMPI module in the foss-201x tool chain?
This is left as a site-specific customization, since hard-coding
--with-slurm would make the installation fail on any system that does
not have SLURM.

We should have good documentation on how to deal with site-specific
customisations, though.
Is anyone who is doing that (JSC, CSCS, TAMU?) up for writing some
documentation for this?
The existing documentation has some examples hinting at a possible
setup: http://easybuild.readthedocs.io/en/latest/Configuration.html#example



regards,

Kenneth


[easybuild] How to rebuild the foss-2016b toolchain with OpenMPI including Slurm support?

2016-10-26 Thread Ole Holm Nielsen
We use the foss-2016b toolchain, and we need OpenMPI to be built with 
Slurm resource manager support.  It seems that the foss-2016b build 
doesn't include Slurm:


# ml OpenMPI/1.10.3-GCC-5.4.0-2.26
# ompi_info | egrep -i 'slurm|pmi'
 MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
 MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
 MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)

Our multi-node MPI jobs fail miserably, and I surmise that this is due 
to the lacking Slurm support.
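For what it's worth, the ompi_info output above can also be read programmatically. The sketch below assumes the one-line "MCA <framework>: <component> (...)" layout shown above, and the function name is hypothetical. On this sample it finds slurm components for ess, plm and ras, but no component named pmi.

```python
import re

# Sample copied from the ompi_info output above.
SAMPLE = """\
 MCA ess: slurm (MCA v2.0.0, API v3.0.0, Component v1.10.3)
 MCA plm: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
 MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.3)
"""


def mca_components(text):
    """Return (framework, component) pairs from 'MCA <framework>: <component> (...)' lines."""
    pattern = re.compile(r'^\s*MCA\s+(\S+):\s+(\S+)\s+\(', re.MULTILINE)
    return pattern.findall(text)


pairs = mca_components(SAMPLE)
has_slurm = any(component.lower() == 'slurm' for _, component in pairs)
has_pmi = any('pmi' in component.lower() for _, component in pairs)

print(pairs)
print('slurm component present:', has_slurm)
print('pmi component present:', has_pmi)
```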


Slurm seems to require a build of OpenMPI with 1) --with-pmi and/or 2) 
--with-slurm. References:


1) https://www.mail-archive.com/easybuild@lists.ugent.be/msg01975.html
2) https://www.open-mpi.org/faq/?category=slurm

I tried making a copy of the EB file OpenMPI-1.10.3-GCC-5.4.0-2.26.eb 
and appending a line:


configopts += '--with-slurm --with-pmi '

and rebuilding the module with eb --force.  Unfortunately, the resulting 
module seems *not* to include my updated configopts (looking at the file 
$EASYBUILD_PREFIX/ebfiles_repo/OpenMPI/OpenMPI-1.10.3-GCC-5.4.0-2.26.eb).
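The check I did by hand on the archived file could be sketched roughly as below. The helper names are hypothetical, and the path layout is simply the one quoted above (ebfiles_repo/<name>/<file> under $EASYBUILD_PREFIX), which may differ on other setups.

```python
import os


def find_configopts(easyconfig_path):
    """Return the lines of an easyconfig file that assign or extend configopts."""
    with open(easyconfig_path) as handle:
        return [line.rstrip('\n') for line in handle
                if line.lstrip().startswith('configopts')]


def archived_easyconfig(prefix, name, filename):
    """Path of the archived copy, following the layout quoted above."""
    return os.path.join(prefix, 'ebfiles_repo', name, filename)


if __name__ == '__main__':
    prefix = os.environ.get('EASYBUILD_PREFIX', '')
    path = archived_easyconfig(prefix, 'OpenMPI', 'OpenMPI-1.10.3-GCC-5.4.0-2.26.eb')
    if os.path.exists(path):
        for line in find_configopts(path):
            print(line)
    else:
        print('no archived easyconfig at %s' % path)
```

If the appended configopts line is missing from the printed lines, the build most likely used a different easyconfig than the edited copy.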


Question: How do I rebuild the OpenMPI module with proper Slurm support?

Question: Can Slurm support please be included in future versions of the 
OpenMPI module in the foss-201x tool chain?


Thanks a lot,
Ole