[easybuild] Intel OneAPI Base + HPC toolkit

2020-12-18 Thread André Gemünd
Dear Easybuilders,

sorry if I missed an existing discussion, but has anyone taken a look at the 
new structure of the Intel packages? It seems our Parallel Studio XE Cluster 
Edition is now "Intel® oneAPI Base & HPC Toolkit (Multi-Node)", split into two 
packages. Direct .tar.gz files are no longer offered for download, only an 
_offline.sh script. That seems to be just a minor inconvenience though, because 
the script is simply the .tar.gz with some extraction code in front, so you can 
skip everything up to and including __CONTENT__ and are left with the .tar.gz. 
Is someone already working on this?
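
A minimal sketch of that extraction step in Python, under the assumption that 
the __CONTENT__ marker sits on a line of its own and the archive starts right 
after the newline that ends it. The function name and the file names in the 
usage comment are hypothetical, and the whole installer is read into memory, 
which is only acceptable for a quick experiment:

```python
from pathlib import Path

# Marker named in the message above; assumed to occupy a line by itself.
MARKER = b"__CONTENT__"


def extract_payload(installer: Path, output: Path) -> int:
    """Write everything after the marker line to `output`; return bytes written."""
    data = installer.read_bytes()  # whole file in memory: fine for a sketch only
    marker_at = data.find(b"\n" + MARKER + b"\n")
    if marker_at == -1:
        raise ValueError(f"marker line {MARKER!r} not found in {installer}")
    # Skip the leading newline, the marker itself and the trailing newline.
    payload = data[marker_at + len(MARKER) + 2:]
    output.write_bytes(payload)
    return len(payload)


# Hypothetical usage (file names are placeholders, not real download names):
# extract_payload(Path("toolkit_offline.sh"), Path("toolkit.tar.gz"))
```

Searching for the marker as a complete line avoids matching the places where 
the extraction code itself mentions the marker inside a command.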

Best Greetings
-- 
Dipl.-Inf. André Gemünd, Leiter IT / Head of IT
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] EB/EC for CP2K 7.1?

2020-05-29 Thread André Gemünd
By the way, make check passes for me if I add a second value to 
--with-max-am (--with-max-am=%d,%d). I wonder how that works for cp2k, because 
they don't pass one (I took the hint from the compile script in libint).

Greetings
André

- On 29 May 2020 at 18:05, Andre Gemuend 
andre.gemu...@scai.fraunhofer.de wrote:

> Hi Kenneth,
> 
> in fact, I think I misunderstood the prebuilt packages and the build process.
> 
> It seems we first need to build the libint compiler, then the export library,
> then build that. The libint configure that is available after autogen.sh
> doesn't even know "--enable-fortran". Only the export package, which is also
> what is provided as "libint-cp2k". I'm currently looking into it.
> 
> Greetings
> André
> 
> - On 29 May 2020 at 14:02, Kenneth Hoste kenneth.ho...@ugent.be wrote:
> 
>> On 29/05/2020 12:37, André Gemünd wrote:
>>> Hi Kenneth,
>>>
>>> I'm currently trying with the following, and it has been building for
>>> roughly 2 hours,
>>> but only on a single core although it was called with -j 32.
>> 
>> I'm seeing the same thing, I'm not sure you can make it build in parallel...
>> 
>> I also tried with lmax=4, and that finished building after ~45min only
>> to fail in "make check" with:
>> 
>> In file included from ../../include/libint2/engine.h(988), from
>> ../../include/libint2/cxxapi.h(41),
>>  from ../../include/libint2.hpp(24), from test.cc(24):
>> ../../include/libint2/./engine.impl.h(627): error: identifier
>> "LIBINT2_MAX_AM_default1" is undefined BOOST_PP_LIST_FOR_EACH_PRODUCT(
>>     ^
>> 
>> compilation aborted for test.cc (code 2)
>> make[1]: *** [test.o] Error 2
>> 
>> 
>> The generated configure command was using this:
>> 
>> --enable-eri=1 --enable-eri2=1 --enable-eri3=1 --with-max-am=4
>> --with-eri-max-am=4,3 --with-eri2-max-am=6,5 --with-eri3-max-am=6,5
>> --enable-shared  --with-pic
>> 
>> 
>> Should be OK, no?
>> 
>> 
>>>
>>> easyblock = 'ConfigureMake'
>>> name = 'Libint'
>>> version = '2.6.0'
>>>
>>> homepage = 'https://github.com/evaleev/libint'
>>> description = """Libint library is used to evaluate the traditional 
>>> (electron
>>> repulsion) and certain novel two-body
>>>   matrix elements (integrals) over Cartesian Gaussian functions used in 
>>> modern
>>>   atomic and molecular theory."""
>>>
>>> toolchain = {'name': 'foss', 'version': '2020a'}
>>> toolchainopts = {'pic': True, 'cstd': 'c++11'}
>>>
>>> source_urls = ['https://github.com/evaleev/libint/archive']
>>> sources = ['v%(version)s.tar.gz']
>>> checksums = 
>>> ['4ae47e8f0b5632c3d2a956469a7920896708e9f0e396ec10071b8181e4c8d9fa']
>>>
>>> builddependencies = [
>>>  ('Autotools', '20180311'),
>>>  ('GMP', '6.2.0'),
>>>  ('Boost', '1.72.0'),
>>>  ('Eigen', '3.3.7', '', True),
>>>  ('MPFR', '4.0.2'),
>>>  ('Python', '2.7.18'),
>>> ]
>>>
>>> preconfigopts = './autogen.sh && '
>>>
>>> _lmax = 7
>>>
>>> # configure opts motivated by cp2k:
>>> # https://github.com/cp2k/libint-cp2k/blob/master/Jenkinsfile
>>> configopts = '--enable-fortran --enable-eri=1 --enable-eri2=1 
>>> --enable-eri3=1 \
>>> --with-max-am=%d \
>>> --with-eri-max-am=%d,%d \
>>> --with-eri2-max-am=%d,%d \
>>> --with-eri3-max-am=%d,%d \
>>> --with-opt-am=3 ' % (_lmax, _lmax, _lmax-1, _lmax+2, 
>>> _lmax+1, _lmax+2, _lmax+1)
>>>
>>> moduleclass = 'chem'
>>>
>>> Greetings
>>> André
>>>
>>> - On 29 May 2020 at 11:35, Kenneth Hoste kenneth.ho...@ugent.be wrote:
>>>
>>>> On 29/05/2020 10:46, André Gemünd wrote:
>>>>> Hi Kenneth,
>>>>>
>>>>> thanks for that. I'm at a similar point but using foss-2020a. I also 
>>>>> wanted to
>>>>> do Intel afterwards, but I thought I'd start with foss because I had some 
>>>>> very
>>>>> weird errors with CP2k and Intel in the past. I'm currently looking more
>>>>> closely at Libint (https://github.com/evaleev/libint/wiki)
>>>>>
>>>>> According to that and the buildflags the cp2k people use, we should be 
>

Re: [easybuild] EB/EC for CP2K 7.1?

2020-05-29 Thread André Gemünd
Hi Kenneth,

in fact, I think I misunderstood the prebuilt packages and the build process. 

It seems we first need to build the libint compiler, then the export library, 
and then build that. The libint configure that is available after autogen.sh 
doesn't even know "--enable-fortran"; only the export package does, which is 
also what is provided as "libint-cp2k". I'm currently looking into it.

Greetings
André

- On 29 May 2020 at 14:02, Kenneth Hoste kenneth.ho...@ugent.be wrote:

> On 29/05/2020 12:37, André Gemünd wrote:
>> Hi Kenneth,
>>
>> I'm currently trying with the following, and it has been building for
>> roughly 2 hours,
>> but only on a single core although it was called with -j 32.
> 
> I'm seeing the same thing, I'm not sure you can make it build in parallel...
> 
> I also tried with lmax=4, and that finished building after ~45min only
> to fail in "make check" with:
> 
> In file included from ../../include/libint2/engine.h(988), from
> ../../include/libint2/cxxapi.h(41),
>  from ../../include/libint2.hpp(24), from test.cc(24):
> ../../include/libint2/./engine.impl.h(627): error: identifier
> "LIBINT2_MAX_AM_default1" is undefined BOOST_PP_LIST_FOR_EACH_PRODUCT(
>     ^
> 
> compilation aborted for test.cc (code 2)
> make[1]: *** [test.o] Error 2
> 
> 
> The generated configure command was using this:
> 
> --enable-eri=1 --enable-eri2=1 --enable-eri3=1 --with-max-am=4
> --with-eri-max-am=4,3 --with-eri2-max-am=6,5 --with-eri3-max-am=6,5
> --enable-shared  --with-pic
> 
> 
> Should be OK, no?
> 
> 
>>
>> easyblock = 'ConfigureMake'
>> name = 'Libint'
>> version = '2.6.0'
>>
>> homepage = 'https://github.com/evaleev/libint'
>> description = """Libint library is used to evaluate the traditional (electron
>> repulsion) and certain novel two-body
>>   matrix elements (integrals) over Cartesian Gaussian functions used in 
>> modern
>>   atomic and molecular theory."""
>>
>> toolchain = {'name': 'foss', 'version': '2020a'}
>> toolchainopts = {'pic': True, 'cstd': 'c++11'}
>>
>> source_urls = ['https://github.com/evaleev/libint/archive']
>> sources = ['v%(version)s.tar.gz']
>> checksums = 
>> ['4ae47e8f0b5632c3d2a956469a7920896708e9f0e396ec10071b8181e4c8d9fa']
>>
>> builddependencies = [
>>  ('Autotools', '20180311'),
>>  ('GMP', '6.2.0'),
>>  ('Boost', '1.72.0'),
>>  ('Eigen', '3.3.7', '', True),
>>  ('MPFR', '4.0.2'),
>>  ('Python', '2.7.18'),
>> ]
>>
>> preconfigopts = './autogen.sh && '
>>
>> _lmax = 7
>>
>> # configure opts motivated by cp2k:
>> # https://github.com/cp2k/libint-cp2k/blob/master/Jenkinsfile
>> configopts = '--enable-fortran --enable-eri=1 --enable-eri2=1 
>> --enable-eri3=1 \
>> --with-max-am=%d \
>> --with-eri-max-am=%d,%d \
>> --with-eri2-max-am=%d,%d \
>> --with-eri3-max-am=%d,%d \
>> --with-opt-am=3 ' % (_lmax, _lmax, _lmax-1, _lmax+2, 
>> _lmax+1, _lmax+2, _lmax+1)
>>
>> moduleclass = 'chem'
>>
>> Greetings
>> André
>>
>> - On 29 May 2020 at 11:35, Kenneth Hoste kenneth.ho...@ugent.be wrote:
>>
>>> On 29/05/2020 10:46, André Gemünd wrote:
>>>> Hi Kenneth,
>>>>
>>>> thanks for that. I'm at a similar point but using foss-2020a. I also 
>>>> wanted to
>>>> do Intel afterwards, but I thought I'd start with foss because I had some 
>>>> very
>>>> weird errors with CP2k and Intel in the past. I'm currently looking more
>>>> closely at Libint (https://github.com/evaleev/libint/wiki)
>>>>
>>>> According to that and the buildflags the cp2k people use, we should be 
>>>> building
>>>> Libint with
>>>>
>>>> --enable-eri=1 --enable-eri2=1 --enable-eri3=1 \
>>>>   --with-max-am=${LMAX} \
>>>>   --with-eri-max-am=${LMAX},$((LMAX-1)) \
>>>>   --with-eri2-max-am=$((LMAX+2)),$((LMAX+1)) \
>>>>   --with-eri3-max-am=$((LMAX+2)),$((LMAX+1)) \
>>>>   --with-opt-am=3
>>>>
>>>> I don't see the easyblock doing that, or did I miss it?
>>> That should be done in the Libint easyconfig, indeed, I'll look into
>>> that as well.
>>>
>>>> I'm also wondering if it might make sense to put the lmax option in the 
>>>> name of
>>>> the package

Re: [easybuild] EB/EC for CP2K 7.1?

2020-05-29 Thread André Gemünd
Hi Kenneth,

I'm currently trying with the following, and it has been building for roughly 
2 hours, but only on a single core although it was called with -j 32.

easyblock = 'ConfigureMake'
name = 'Libint'
version = '2.6.0'

homepage = 'https://github.com/evaleev/libint'
description = """Libint library is used to evaluate the traditional (electron 
repulsion) and certain novel two-body
 matrix elements (integrals) over Cartesian Gaussian functions used in modern 
atomic and molecular theory."""

toolchain = {'name': 'foss', 'version': '2020a'}
toolchainopts = {'pic': True, 'cstd': 'c++11'}

source_urls = ['https://github.com/evaleev/libint/archive']
sources = ['v%(version)s.tar.gz']
checksums = ['4ae47e8f0b5632c3d2a956469a7920896708e9f0e396ec10071b8181e4c8d9fa']

builddependencies = [
('Autotools', '20180311'),
('GMP', '6.2.0'),
('Boost', '1.72.0'),
('Eigen', '3.3.7', '', True),
('MPFR', '4.0.2'),
('Python', '2.7.18'),
]

preconfigopts = './autogen.sh && '

_lmax = 7

# configure opts motivated by cp2k:
# https://github.com/cp2k/libint-cp2k/blob/master/Jenkinsfile
configopts = '--enable-fortran --enable-eri=1 --enable-eri2=1 --enable-eri3=1 \
   --with-max-am=%d \
   --with-eri-max-am=%d,%d \
   --with-eri2-max-am=%d,%d \
   --with-eri3-max-am=%d,%d \
   --with-opt-am=3 ' % (_lmax, _lmax, _lmax-1, _lmax+2, _lmax+1, 
_lmax+2, _lmax+1)

moduleclass = 'chem'
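
The way the configure options above are derived from a single lmax value can 
be sketched as a small helper. This is only an illustration of the arithmetic 
in the easyconfig; the function name and its keyword arguments are 
hypothetical and not part of EasyBuild or Libint:

```python
def libint_configopts(lmax: int, opt_am: int = 3, fortran: bool = True) -> str:
    """Build the Libint configure options for a given maximum angular momentum.

    Mirrors the easyconfig above: the ERI limit is (lmax, lmax-1), while the
    ERI2 and ERI3 limits are (lmax+2, lmax+1).
    """
    opts = []
    if fortran:
        opts.append("--enable-fortran")
    opts += [
        "--enable-eri=1",
        "--enable-eri2=1",
        "--enable-eri3=1",
        f"--with-max-am={lmax}",
        f"--with-eri-max-am={lmax},{lmax - 1}",
        f"--with-eri2-max-am={lmax + 2},{lmax + 1}",
        f"--with-eri3-max-am={lmax + 2},{lmax + 1}",
        f"--with-opt-am={opt_am}",
    ]
    return " ".join(opts)


print(libint_configopts(7))
```

Keeping lmax as the only input makes it harder for the seven %d placeholders 
and their argument tuple to drift out of sync.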

Greetings
André

- On 29 May 2020 at 11:35, Kenneth Hoste kenneth.ho...@ugent.be wrote:

> On 29/05/2020 10:46, André Gemünd wrote:
>> Hi Kenneth,
>>
>> thanks for that. I'm at a similar point but using foss-2020a. I also wanted 
>> to
>> do Intel afterwards, but I thought I'd start with foss because I had some 
>> very
>> weird errors with CP2k and Intel in the past. I'm currently looking more
>> closely at Libint (https://github.com/evaleev/libint/wiki)
>>
>> According to that and the buildflags the cp2k people use, we should be 
>> building
>> Libint with
>>
>> --enable-eri=1 --enable-eri2=1 --enable-eri3=1 \
>>  --with-max-am=${LMAX} \
>>  --with-eri-max-am=${LMAX},$((LMAX-1)) \
>>  --with-eri2-max-am=$((LMAX+2)),$((LMAX+1)) \
>>  --with-eri3-max-am=$((LMAX+2)),$((LMAX+1)) \
>>  --with-opt-am=3
>>
>> I don't see the easyblock doing that, or did I miss it?
> 
> That should be done in the Libint easyconfig, indeed, I'll look into
> that as well.
> 
>> I'm also wondering if it might make sense to put the lmax option in the name 
>> of
>> the package to be a bit more generic? On the other hand, increasing lmax
>> apparently only increases buildtime and library size (according to the README
>> here: https://github.com/cp2k/libint-cp2k). The cp2k guys themselves offer
>> prebuilt binaries for up to lmax 7, so maybe that should be our goto as well?
>> Enabling fortran is not a disadvantage for any other use, so that would be 
>> make
>> the library as generic as possible I guess?
> 
> I think it indeed makes perfect sense to "tag" Libint with versionsuffix
> = '-lmax-7', and not hold back there, unless the build blows up to
> taking hours and consuming GBs of disk space in the installation
> directory (and even then...).
> 
>> Also, it doesn't really matter, but is Python really needed as a builddep? I
>> guess I'll try it out.
> 
> It was added for a reason probably, but I can double check on that...
> 
> Could be to avoid relying on the system Python (which could also be
> Python 3).
> 
> 
> regards,
> 
> Kenneth
> 
>>
>> Cheers
>> André
>>
>> - On 28 May 2020 at 22:13, Kenneth Hoste kenneth.ho...@ugent.be wrote:
>>
>>> We have requests for CP2K 7.1, so it's on my TODO list.
>>>
>>> I didn't get very far yet, but I'll share what I have:
>>>
>>> * changes to CP2K easyblock:
>>> https://github.com/easybuilders/easybuild-easyblocks/pull/2069
>>>
>>> * easyconfigs for CP2K 7.1 + deps (doesn't work yet):
>>> https://github.com/easybuilders/easybuild-easyconfigs/pull/10714
>>>
>>> To test:
>>>
>>> eb --include-easyblocks-from-pr 2069 --from-pr 10714 --robot
>>>
>>>
>>> Any help is welcome :)
>>>
>>>
>>> regards,
>>>
>>> Kenneth
>>>
>>> On 28/05/2020 15:55, André Gemünd wrote:
>>>> Dear Loris,
>>>>
>>>> I just found this message from January because I was looking for CP2k 7.1 
>>>> as
>>>> well. Did you make any

Re: [easybuild] EB/EC for CP2K 7.1?

2020-05-28 Thread André Gemünd
Dear Loris,

I just found this message from January because I was looking for CP2k 7.1 as 
well. Did you make any progress with that?

Best Greetings
André

- On 27 Jan 2020 at 15:51, Loris Bennett loris.benn...@fu-berlin.de wrote:

> Hi,
> 
> I was wondering whether any work is going on regarding an EB/EC for CP2K
> 7.1.  The directory structure seems to have changed such that there is
> no longer a directory 'makefiles' in the top-level directory, so this
> bit of the EC
> 
>     def build_step(self):
>         """Start the actual build
>         - go into makefiles dir
>         - patch Makefile
>         - build_and_install
>         """
> 
>         makefiles = os.path.join(self.cfg['start_dir'], 'makefiles')
>         change_dir(makefiles)
> 
> fails.
> 
> Cheers,
> 
> Loris
> 
> --
> Dr. Loris Bennett (Mr.)
> ZEDAT, Freie Universität Berlin Email loris.benn...@fu-berlin.de

-- 
Dipl.-Inf. André Gemünd, Leiter IT / Head of IT
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] CP2K problems

2019-09-20 Thread André Gemünd
Hi Kenneth,

indeed it's now much better, though not quite as good as in your case:

Summary of the regression tester run from 2019-09-19_15-37-19 using Linux-x86-64-foss popt
Number of FAILED  tests 72
Number of WRONG   tests 3
Number of CORRECT tests 2974
Number of NEW     tests 18
Total number of   tests 3067
--
Number of LEAKING tests 0
Number of memory  leaks 0
--
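
To compare runs like this one without eyeballing the numbers, the summary can 
be parsed into a dictionary. The following is a rough sketch that assumes the 
summary lines keep the "Number of <STATUS> tests <count>" layout shown above; 
the function name is made up for illustration:

```python
import re

# Matches e.g. "Number of FAILED  tests 72"; "memory  leaks" is skipped on purpose.
STATUS_LINE = re.compile(r"Number of (\w+)\s+tests\s+(\d+)")
TOTAL_LINE = re.compile(r"Total number of\s+tests\s+(\d+)")


def parse_summary(text: str) -> dict:
    """Return the per-status test counts, plus 'TOTAL' if that line is present."""
    counts = {m.group(1): int(m.group(2)) for m in STATUS_LINE.finditer(text)}
    total = TOTAL_LINE.search(text)
    if total:
        counts["TOTAL"] = int(total.group(1))
    return counts


summary = """\
Number of FAILED  tests 72
Number of WRONG   tests 3
Number of CORRECT tests 2974
Number of NEW     tests 18
Total number of   tests 3067
"""
counts = parse_summary(summary)
print(counts)
print(f"correct: {100 * counts['CORRECT'] / counts['TOTAL']:.1f}%")
```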


In fact, it now leaves only the same ABORT reasons as in the MPICH case:

$ sed -n '/\[ABORT/{n;p;}' \
    /opt/software/easybuild/software/CP2K/6.1-foss-2019a/TEST-Linux-x86-64-foss-popt-2019-09-19_15-37-19/error_summary \
    | sort | uniq -c | sort -k1,1 -n -r
     38  *  \___/KS energy is an abnormal value (NaN/Inf).  *
      1  *  \___/   exist. Data directory path:  *
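
For anyone who prefers Python over the sed/sort/uniq pipeline, the same 
counting can be sketched as below. It assumes, like the sed expression, that 
the reason is on the line directly after each line containing "[ABORT"; unlike 
uniq, it strips surrounding whitespace before counting. The function name is 
hypothetical:

```python
from collections import Counter


def abort_reasons(lines):
    """Count the line that follows each line containing '[ABORT'.

    Returns (reason, count) pairs, most frequent first.
    """
    reasons = Counter()
    it = iter(lines)
    for line in it:
        if "[ABORT" in line:
            # Like sed's `n`, consume the next line without re-testing it.
            following = next(it, None)
            if following is not None:
                reasons[following.strip()] += 1
    return reasons.most_common()


# Usage with a file path of your own:
# with open("error_summary") as handle:
#     for reason, count in abort_reasons(handle):
#         print(count, reason)
```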


Would you mind checking the ABORT reasons in your case?

Thanks and Greetings
André

- On 19 Sep 2019 at 15:28, Andre Gemuend 
andre.gemu...@scai.fraunhofer.de wrote:

> Hi Kenneth,
> 
> thanks for the feedback! In the meantime we also found that we probably didn't
> have the patched OpenBLAS on that installation (it was installed before the
> patch was released). We rebuilt CP2k and all of the dependencies and it seems
> many test cases don't run into the SCF divergence issue anymore. I'm currently
> running the full test suite to check and will report back.
> 
> I was also just preparing a mail about some new results. We built a new
> MPICH toolchain and CP2K based on that, and got far fewer errors.
> 
> Greetings
> André
> 
> - On 19 Sep 2019 at 14:42, Kenneth Hoste kenneth.ho...@ugent.be wrote:
> 
>> Dear André,
>> 
>> On 17/09/2019 18:49, André Gemünd wrote:
>>> Dear EasyBuilders,
>>> 
>>> we are currently trying to use the CP2k config that is shipped with the
>>> easyconfigs, more specifically CP2K-6.1-foss-2019a.eb. Unfortunately, we are
>>> seeing a lot of runtime issues with this version. Also the CP2K regression 
>>> test
>>> suite is not very happy. This is the summary we get:
>>> 
>>> Summary of the regression tester run from 2019-09-11_13-29-39 using
>>> Linux-x86-64-foss popt
>>> Number of FAILED  tests 288
>>> Number of WRONG   tests 559
>>> Number of CORRECT tests 2203
>>> Number of NEW tests 17
>>> Total number of   tests 3067
>>> --
>>> Number of LEAKING tests 0
>>> Number of memory  leaks 0
>>> --
>>> 
>>> When looking at the error_summary, we see mostly "SCF not converged" (55 
>>> cases)
>>> and "tr(Ap_j*p_j) < 0" (51 cases).
>>> 
>>> I'm curious if other users see the same or if it has something to do with 
>>> our
>>> environment?
>>> 
>>> We are on CentOS 7.6 and have Xeon Gold (Skylake EP) on these compute nodes.
>>> 
>>> We would be happy for any help or suggestions.
>> 
>> Can you share a CP2K input that triggers some of the problems you're
>> seeing, so I can try with our CP2K/6.1-foss-2019a installation on Intel
>> Skylake (Intel Xeon Gold 6140)?
>> 
>> The regression test isn't 100% (but the CP2K developers told me
>> themselves that not all tests are expected to pass all the time):
>> 
>> - Summary -
>> Number of FAILED  tests 49
>> Number of WRONG   tests 3
>> Number of CORRECT tests 2997
>> Number of NEW     tests 18
>> Total number of   tests 3067
>> 
>> 
>> Are you aware of the issues with OpenBLAS 0.3.5 (which is a part of
>> foss/2019a)?
>> We had to add patches to OpenBLAS 0.3.5 in recent EasyBuild versions to
>> fix problems on Intel Skylake, perhaps the problems you're seeing with
>> CP2K are related?
>> 
>> See also https://lists.ugent.be/wws/arc/easybuild/2019-08/msg00015.html .
>> 
>> 
>> regards,
>> 
>> Kenneth
> 
> --
> Dipl.-Inf. André Gemünd, Leiter IT-S
> Fraunhofer-Institute for Algorithms and Scientific Computing
> andre.gemu...@scai.fraunhofer.de
> Tel: +49 2241 14-2193
> /C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend

-- 
Dipl.-Inf. André Gemünd, Leiter IT-S
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] CP2K problems

2019-09-19 Thread André Gemünd
Hi Kenneth,

thanks for the feedback! In the meantime we also found that we probably didn't 
have the patched OpenBLAS on that installation (it was installed before the 
patch was released). We rebuilt CP2K and all of its dependencies, and it seems 
many test cases no longer run into the SCF divergence issue. I'm currently 
running the full test suite to check and will report back.

I was also just preparing a mail about some new results. We built a new 
MPICH toolchain and CP2K based on that, and got far fewer errors.

Greetings
André

- On 19 Sep 2019 at 14:42, Kenneth Hoste kenneth.ho...@ugent.be wrote:

> Dear André,
> 
> On 17/09/2019 18:49, André Gemünd wrote:
>> Dear EasyBuilders,
>> 
>> we are currently trying to use the CP2k config that is shipped with the
>> easyconfigs, more specifically CP2K-6.1-foss-2019a.eb. Unfortunately, we are
>> seeing a lot of runtime issues with this version. Also the CP2K regression 
>> test
>> suite is not very happy. This is the summary we get:
>> 
>> Summary of the regression tester run from 2019-09-11_13-29-39 using
>> Linux-x86-64-foss popt
>> Number of FAILED  tests 288
>> Number of WRONG   tests 559
>> Number of CORRECT tests 2203
>> Number of NEW tests 17
>> Total number of   tests 3067
>> --
>> Number of LEAKING tests 0
>> Number of memory  leaks 0
>> --
>> 
>> When looking at the error_summary, we see mostly "SCF not converged" (55 
>> cases)
>> and "tr(Ap_j*p_j) < 0" (51 cases).
>> 
>> I'm curious if other users see the same or if it has something to do with our
>> environment?
>> 
>> We are on CentOS 7.6 and have Xeon Gold (Skylake EP) on these compute nodes.
>> 
>> We would be happy for any help or suggestions.
> 
> Can you share a CP2K input that triggers some of the problems you're
> seeing, so I can try with our CP2K/6.1-foss-2019a installation on Intel
> Skylake (Intel Xeon Gold 6140)?
> 
> The regression test isn't 100% (but the CP2K developers told me
> themselves that not all tests are expected to pass all the time):
> 
> - Summary -
> Number of FAILED  tests 49
> Number of WRONG   tests 3
> Number of CORRECT tests 2997
> Number of NEW tests 18
> Total number of   tests 3067
> 
> 
> Are you aware of the issues with OpenBLAS 0.3.5 (which is a part of
> foss/2019a)?
> We had to add patches to OpenBLAS 0.3.5 in recent EasyBuild versions to
> fix problems on Intel Skylake, perhaps the problems you're seeing with
> CP2K are related?
> 
> See also https://lists.ugent.be/wws/arc/easybuild/2019-08/msg00015.html .
> 
> 
> regards,
> 
> Kenneth

-- 
Dipl.-Inf. André Gemünd, Leiter IT-S
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] CP2K problems

2019-09-19 Thread André Gemünd
Some additional information from our debugging. Many of the tests seem to time 
out, so we're investigating MPI problems. 

Because the CI/CD tests at https://dashboard.cp2k.org were mostly using MPICH 
we built a new gmpolf-2019 toolchain and tried with that. 

And indeed, that gives much better results:

Summary of the regression tester run from 2019-09-18_18-46-51 using Linux-x86-64-gmpolf popt
Number of FAILED  tests 38
Number of WRONG   tests 3
Number of CORRECT tests 3007
Number of NEW     tests 19
Total number of   tests 3067
--
Number of LEAKING tests 0
Number of memory  leaks 0
--

The only abort reason left is the following:

     38  *  \___/KS energy is an abnormal value (NaN/Inf).  *

This could be related to the Skylake CPUs, cf. 
https://github.com/xianyi/OpenBLAS/issues/2029, 
although the OpenBLAS we use should already have the patch for that (PR #8227). 
Any feedback or recommendations?

Best Greetings
André

- On 17 Sep 2019 at 18:49, Andre Gemuend 
andre.gemu...@scai.fraunhofer.de wrote:

> Dear EasyBuilders,
> 
> we are currently trying to use the CP2k config that is shipped with the
> easyconfigs, more specifically CP2K-6.1-foss-2019a.eb. Unfortunately, we are
> seeing a lot of runtime issues with this version. Also the CP2K regression 
> test
> suite is not very happy. This is the summary we get:
> 
> Summary of the regression tester run from 2019-09-11_13-29-39 using
> Linux-x86-64-foss popt
> Number of FAILED  tests 288
> Number of WRONG   tests 559
> Number of CORRECT tests 2203
> Number of NEW tests 17
> Total number of   tests 3067
> --
> Number of LEAKING tests 0
> Number of memory  leaks 0
> --
> 
> When looking at the error_summary, we see mostly "SCF not converged" (55 
> cases)
> and "tr(Ap_j*p_j) < 0" (51 cases).
> 
> I'm curious if other users see the same or if it has something to do with our
> environment?
> 
> We are on CentOS 7.6 and have Xeon Gold (Skylake EP) on these compute nodes.
> 
> We would be happy for any help or suggestions.
> 
> Best Greetings
> --
> Dipl.-Inf. André Gemünd, Leiter IT-S
> Fraunhofer-Institute for Algorithms and Scientific Computing
> andre.gemu...@scai.fraunhofer.de
> Tel: +49 2241 14-2193
> /C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend

-- 
Dipl.-Inf. André Gemünd, Leiter IT-S
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


[easybuild] CP2K problems

2019-09-17 Thread André Gemünd
Dear EasyBuilders,

we are currently trying to use the CP2K easyconfig that is shipped with 
EasyBuild, specifically CP2K-6.1-foss-2019a.eb. Unfortunately, we are seeing a 
lot of runtime issues with this version, and many tests in the CP2K regression 
test suite fail. This is the summary we get:

Summary of the regression tester run from 2019-09-11_13-29-39 using Linux-x86-64-foss popt
Number of FAILED  tests 288
Number of WRONG   tests 559
Number of CORRECT tests 2203
Number of NEW     tests 17
Total number of   tests 3067
--
Number of LEAKING tests 0
Number of memory  leaks 0
--

When looking at the error_summary, we see mostly "SCF not converged" (55 cases) 
and "tr(Ap_j*p_j) < 0" (51 cases).

I'm curious if other users see the same or if it has something to do with our 
environment? 

We are on CentOS 7.6 and have Xeon Gold (Skylake EP) on these compute nodes.

We would be happy for any help or suggestions.

Best Greetings
-- 
Dipl.-Inf. André Gemünd, Leiter IT-S
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] Updated Lmod RPM package needed for EasyBuild v3.7.0

2018-09-26 Thread André Gemünd
Hi Andreas,

afaik Update 6 is still under development though (notice the Factory folder), 
so packages could still change or have issues. 

Best Greetings
Andre

- On 26 Sep 2018 at 15:32, Andreas Hilboll hilb...@uni-bremen.de wrote:

> We just yesterday installed 7.8.1 from
> http://build.openhpc.community/OpenHPC:/1.3:/Update6:/Factory/CentOS_7/x86_64/lmod-ohpc-7.8.1-5.1.ohpc.1.3.6.x86_64.rpm,
> and everything's working smoothly.
> 
> Cheers,
>  Andreas
> 
> 
> André Gemünd  writes:
> 
>> Dear Ole,
>>
>> maybe the OpenHPC for CentOS 7 package can be useful for that?
>> It provides 7.7.14 currently:
>>
>> http://build.openhpc.community/OpenHPC:/1.3:/Update5/CentOS_7/src/lmod-ohpc-7.7.14-3.1.src.rpm
>>
>> Greetings
>> André
>>
>> - On 26 Sep 2018 at 13:25, Ole Holm Nielsen
>> ole.h.niel...@fysik.dtu.dk wrote:
>>
>>> Dear Kenneth,
>>> 
>>> Thanks a lot for the answer. Since CentOS 7 is a very popular
>>> HPC
>>> cluster OS, it would be great to obtain an authoritative RPM
>>> package for
>>> Lmod as required by EasyBuild.  It's very unfortunate that EPEL
>>> hasn't
>>> updated the Lmod RPM in 2 years.  Your .spec file for Lmod 6.6
>>> seems to
>>> me not to be very much plug-and-play :-(
>>> 
>>> If no-one can provide an Lmod 6.6.3 RPM for CentOS 7, then it
>>> would save
>>> us a lot of trouble if EB could still use Lmod 6.5.1 until a
>>> newer RPM
>>> becomes available.
>>> 
>>> Thanks a lot,
>>> Ole
>>> 
>>> 
>>> On 26-09-2018 11:38, Kenneth Hoste wrote:
>>>> Dear Ole,
>>>> 
>>>> We build our own Lmod RPMs so we can stay on top of recent
>>>> developments.
>>>> 
>>>> You can find our .spec file in
>>>> https://github.com/hpcugent/Lmod-UGent.
>>>> 
>>>> If you go back in history a bit, you should be able to find a
>>>> .spec file
>>>> for Lmod 6.x.
>>>> 
>>>> Of course, you'll need to customize this w.r.t. Lmod
>>>> configuration & such.
>>>> 
>>>> I should also mention that the Lmod 6.6.3 requirement may be a
>>>> bit more
>>>> than is strictly required...
>>>> In theory a slightly older Lmod 6.x should be fine too, but I
>>>> kept
>>>> running into problems left & right when testing on top of
>>>> older Lmod 6.x
>>>> versions, so I figured going with the latest 6.x was a
>>>> reasonable
>>>> compromise (I didn't want to force people to switch to Lmod
>>>> 7).
>>>> 
>>>> If that's a big problem, I can try and reconsider that version
>>>> requirement to loosen it up a bit (as long as all the tests
>>>> pass, that
>>>> is), and issue a quick EasyBuild 3.7.1.
>>>> 
>>>> On the other hand, this is good motivation to update your Lmod
>>>> installation, there have been *a lot* of improvements to Lmod
>>>> since
>>>> version 6.5.1 (which was released Aug 2016...).
>>>> 
>>>> 
>>>> regards,
>>>> 
>>>> Kenneth
>>>> 
>>>> On 26/09/2018 11:25, Ole Holm Nielsen wrote:
>>>>> Regarding upgrading to EB 3.7.0:
>>>>>
>>>>> On 25-09-2018 15:09, Kenneth Hoste wrote:
>>>>>> Note that the minimal version requirement for Lmod has been
>>>>>> bumped to
>>>>>> 6.6.3.
>>>>>
>>>>> We use CentOS 7 and the 2 years old Lmod RPM package provided
>>>>> by the
>>>>> EPEL repository: Lmod-6.5.1-2.el7.x86_64.rpm
>>>>>
>>>>> Can anyone suggest the best way to get or build an RPM
>>>>> package of a
>>>>> recent version of Lmod [1] that works well on CentOS 7?
>>>>>
>>>>> Thanks,
>>>>> Ole
>>>>>
>>>>> [1] https://github.com/TACC/Lmod

-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] Updated Lmod RPM package needed for EasyBuild v3.7.0

2018-09-26 Thread André Gemünd
Dear Ole,

maybe the OpenHPC for CentOS 7 package can be useful for that? It provides 
7.7.14 currently:

http://build.openhpc.community/OpenHPC:/1.3:/Update5/CentOS_7/src/lmod-ohpc-7.7.14-3.1.src.rpm
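
Whether a given package is new enough for the 6.6.3 requirement discussed in 
this thread can be checked with a plain tuple comparison. This is only a 
sketch for purely numeric dotted versions, with made-up helper names; it is 
not how EasyBuild itself performs the check:

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '7.7.14' into (7, 7, 14)."""
    return tuple(int(part) for part in version.split("."))


# Minimum Lmod version mentioned in the thread for EasyBuild 3.7.0.
MIN_LMOD = parse_version("6.6.3")


def satisfies_minimum(version: str, minimum: tuple = MIN_LMOD) -> bool:
    """True if `version` is at least `minimum` (compared component-wise)."""
    return parse_version(version) >= minimum


for candidate in ("6.5.1", "6.6.3", "7.7.14"):
    print(candidate, satisfies_minimum(candidate))
```

Comparing integers rather than strings matters here: as strings, "7.7.14" 
would sort before "7.7.2".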

Greetings
André

- On 26 Sep 2018 at 13:25, Ole Holm Nielsen 
ole.h.niel...@fysik.dtu.dk wrote:

> Dear Kenneth,
> 
> Thanks a lot for the answer. Since CentOS 7 is a very popular HPC
> cluster OS, it would be great to obtain an authoritative RPM package for
> Lmod as required by EasyBuild.  It's very unfortunate that EPEL hasn't
> updated the Lmod RPM in 2 years.  Your .spec file for Lmod 6.6 seems to
> me not to be very much plug-and-play :-(
> 
> If no-one can provide an Lmod 6.6.3 RPM for CentOS 7, then it would save
> us a lot of trouble if EB could still use Lmod 6.5.1 until a newer RPM
> becomes available.
> 
> Thanks a lot,
> Ole
> 
> 
> On 26-09-2018 11:38, Kenneth Hoste wrote:
>> Dear Ole,
>> 
>> We build our own Lmod RPMs so we can stay on top of recent developments.
>> 
>> You can find our .spec file in https://github.com/hpcugent/Lmod-UGent.
>> 
>> If you go back in history a bit, you should be able to find a .spec file
>> for Lmod 6.x.
>> 
>> Of course, you'll need to customize this w.r.t. Lmod configuration & such.
>> 
>> I should also mention that the Lmod 6.6.3 requirement may be a bit more
>> than is strictly required...
>> In theory a slightly older Lmod 6.x should be fine too, but I kept
>> running into problems left & right when testing on top of older Lmod 6.x
>> versions, so I figured going with the latest 6.x was a reasonable
>> compromise (I didn't want to force people to switch to Lmod 7).
>> 
>> If that's a big problem, I can try and reconsider that version
>> requirement to loosen it up a bit (as long as all the tests pass, that
>> is), and issue a quick EasyBuild 3.7.1.
>> 
>> On the other hand, this is good motivation to update your Lmod
>> installation, there have been *a lot* of improvements to Lmod since
>> version 6.5.1 (which was released Aug 2016...).
>> 
>> 
>> regards,
>> 
>> Kenneth
>> 
>> On 26/09/2018 11:25, Ole Holm Nielsen wrote:
>>> Regarding upgrading to EB 3.7.0:
>>>
>>> On 25-09-2018 15:09, Kenneth Hoste wrote:
>>>> Note that the minimal version requirement for Lmod has been bumped to
>>>> 6.6.3.
>>>
>>> We use CentOS 7 and the 2 years old Lmod RPM package provided by the
>>> EPEL repository: Lmod-6.5.1-2.el7.x86_64.rpm
>>>
>>> Can anyone suggest the best way to get or build an RPM package of a
>>> recent version of Lmod [1] that works well on CentOS 7?
>>>
>>> Thanks,
>>> Ole
>>>
>>> [1] https://github.com/TACC/Lmod

-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] Toolchain Updates

2018-05-16 Thread André Gemünd
Dear EasyBuilders,

please let me add to that.

Coincidentally, the compiler version in intel-2018a has a bug that makes it 
segfault while compiling MUMPS: 
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/754459

Wouldn't it make sense to use the updated compiler in the 2018a toolchain?

Greetings
André 

- On 16 May 2018 at 10:27, Andre Gemuend 
andre.gemu...@scai.fraunhofer.de wrote:

> Dear List,
> 
> I apologize upfront if this question has been asked multiple times before (I
> feel I have seen it, but can't find it anymore).
> 
> We're currently preparing a few recipes and would like to use Intel 2018.2,
> meaning icc/icpc/ifort 18.0.2 and mpi 2018.2.199 instead of version 1 that's in
> intel-2018a. To give back something, I was thinking of submitting them for the
> first time (yay), but wonder if it makes sense to specify as toolchain intel
> 2018.02 then, or if I should instead use 2018a? This still somehow baffles me.
> We like to have most things built with the latest compilers. Hope someone can
> shed some light on this for me.
> 
> Best Greetings
> --
> André Gemünd
> Fraunhofer-Institute for Algorithms and Scientific Computing
> andre.gemu...@scai.fraunhofer.de
> Tel: +49 2241 14-2193
> /C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend

-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


[easybuild] Specifying greater than for builddependencies

2018-05-16 Thread André Gemünd
Dear EasyBuilders,

another question that has cropped up for us is how to specify the minimum 
required CMake version as a builddependency. In our view it's unnecessary to 
install CMake for specific toolchains, and also to specify any fixed version. 
We just need to specify "CMake >= 2.8.12", replicating the often-used 
cmake_minimum_required command in CMakeLists.txt. Of course, it would be even 
nicer to do this from the CMakeMake easyblock and automatically find the 
required version in the CMakeLists.txt, but the first idea would be enough for 
now I think. Is it possible?
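
To illustrate the automatic variant, here is a rough Python sketch. It is not part of EasyBuild or the CMakeMake easyblock; the names required_cmake_version and satisfies are made up for this example, and the regex only handles the plain cmake_minimum_required(VERSION x.y[.z]) form (it does not skip commented-out lines).

```python
import re

# Matches e.g. "cmake_minimum_required(VERSION 2.8.12)"; CMake command names are case-insensitive.
_MIN_REQ = re.compile(r"cmake_minimum_required\s*\(\s*VERSION\s+(\d+(?:\.\d+)*)", re.IGNORECASE)


def required_cmake_version(cmakelists_text):
    """Return the declared minimum CMake version as a tuple of ints, or None if absent."""
    match = _MIN_REQ.search(cmakelists_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def satisfies(available, required):
    """Return True if the available CMake version string is at least the required tuple."""
    avail = tuple(int(part) for part in available.split("."))
    # Pad with zeros so that "3.10" compares equal to (3, 10, 0)
    width = max(len(avail), len(required))
    avail += (0,) * (width - len(avail))
    required += (0,) * (width - len(required))
    return avail >= required
```

With something like this, any already installed CMake module whose version passes satisfies() could be accepted instead of pinning one fixed version.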

Best Greetings
-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


[easybuild] Toolchain Updates

2018-05-16 Thread André Gemünd
Dear List,

I apologize upfront if this question has been asked multiple times before (I 
feel I have seen it, but can't find it anymore). 

We're currently preparing a few recipes and would like to use Intel 2018.2, 
meaning icc/icpc/ifort 18.0.2 and mpi 2018.2.199 instead of version 1 that's in 
intel-2018a. To give back something, I was thinking of submitting them for the 
first time (yay), but wonder if it makes sense to specify as toolchain intel 
2018.02 then, or if I should instead use 2018a? This still somehow baffles me. 
We like to have most things built with the latest compilers. Hope someone can 
shed some light on this for me.

Best Greetings
-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] Name change of cuDNN tgz files

2018-03-14 Thread André Gemünd
We build Tensorflow on our CentOS 6 cluster. I think disabling jemalloc should 
be enough to avoid the HUGEPAGE problem.

Greetings
André

- On 14 Mar 2018 at 12:03, Jack Perdue j-per...@tamu.edu wrote:

> p.s. so far, for non-GPU, I have a 6x speed up over the Anaconda3/5.1.0
> version
> on an AVX2-based cluster (1 node, 28 CPUs).  (stick with -O2 [EB
> default]... -O3 ["opt": True] doesn't help).
> 
> Preparing to run the same benchmark[*] with the GPU(s) (2x Tesla K80).
> 
> [*] https://github.com/tensorflow/benchmarks/blob/master/scripts/tf_cnn_benchmarks/README.md
> 
> Some other notes:
> 
> Don't even try to build TensorFlow on RHEL/CentOS6. Its older
> kernel doesn't have MADV_HUGEPAGE support.
> 
> For TensorFlow, I customized for our site with:
> 
> cuda_compute_capabilities = ['3.5', '3.7']  # for Tesla K20 (ada) and
> K80 (terra)
> 
> I could have left out 3.5 since ada is RHEL6.
> 
> jack
> 
> 
> On 03/14/2018 05:48 AM, Jack Perdue wrote:
>> +1 !
>>
>> I struggled with the same issue (I have no idea where Stephane got
>> his/her copy).
>>
>> FWIW, here's (attached) what I came up with which includes
>> that fix and a cleanup of the duplicate libs.
>>
>> Jack Perdue
>> Lead Systems Administrator
>> High Performance Research Computing
>> TAMU Division of Research
>> j-per...@tamu.edu    http://hprc.tamu.edu
>> HPRC Helpdesk: h...@hprc.tamu.edu
>>
>> On 03/14/2018 05:35 AM, Joachim Hein wrote:
>>> Hi,
>>>
>>> I am trying TensorFlow-1.5.0-goolfc-2017b-Python-3.6.3.eb. It is
>>> looking for a file cudnn-9.0-linux-x64-v7.0.5.15.tgz; however, I am
>>> currently getting cudnn-9.0-linux-x64-v7.tgz from the NVIDIA download
>>> site. The sha256 sum of the file I just downloaded agrees with the
>>> one in the EB-config. After renaming my download to the name
>>> expected by EB, cuDNN builds.
>>>
>>> Can the config be upgraded to handle both the old and the new name? Is
>>> that something EB supports? Otherwise we should leave a comment inside
>>> the config that renaming is a workaround (one needs a manual
>>> download of sources anyway).
>>>
>>> Any comments?
>>>
>>> Best wishes
>>>    Joachim
>>>
>>>
>>>
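
The renaming workaround Joachim describes above can be scripted. A minimal sketch, assuming the checksum is copied from the easyconfig by hand; stage_under_expected_name is a hypothetical helper for illustration, not an EasyBuild function.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_under_expected_name(downloaded, expected_name, expected_sha256):
    """Copy a download to the filename the easyconfig expects, after verifying its checksum."""
    downloaded = Path(downloaded)
    actual = sha256sum(downloaded)
    if actual != expected_sha256:
        raise ValueError("checksum mismatch for %s: %s" % (downloaded.name, actual))
    # Place the copy next to the original, under the expected filename
    target = downloaded.with_name(expected_name)
    if not target.exists():
        shutil.copy2(downloaded, target)
    return target
```

Here that would mean calling it with cudnn-9.0-linux-x64-v7.tgz as the download and cudnn-9.0-linux-x64-v7.0.5.15.tgz as the expected name; the copy is only made if the checksum matches.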

-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] How to setup EB in the global /etc/profile.d/ ?

2016-10-13 Thread André Gemünd
Hi Ward,

sorry for interfering with this thread, but *is* there actually a global config 
file? I asked this some time ago on the ml, and as far as I remember there are 
only local config files ($XDG_...), no global config e.g. under the 
installation prefix.

Cheers
Andre

- On 13 Oct 2016 at 13:54, Ward Poelmans ward.poelm...@ugent.be wrote:

> On 13-10-16 13:48, Ole Holm Nielsen wrote:
>> I would like to enable the EB/Lmod modules to all users in a global
>> bash/tcsh setup script.  On CentOS 7.2 initialization is done using
>> scripts in /etc/profile.d/
>> 
>> Basically I want every user shell to execute these commands:
>> 
>> export EASYBUILD_MODULES_TOOL=Lmod
>> export EASYBUILD_PREFIX=/home/modules
>> module use $EASYBUILD_PREFIX/modules/all
>> module load EasyBuild
> 
> You could put the settings in a global config file? Much safer than
> environment variables.
> 
> For loading Easybuild by default, I would use a Lmod default collection.
> 
> Ward
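
On the config file question: one way to see which locations would be searched is sketched below. It assumes the XDG-based lookup as described in the EasyBuild configuration documentation (an easybuild.d directory under each entry of $XDG_CONFIG_DIRS, defaulting to /etc, plus a per-user file under $XDG_CONFIG_HOME). The helper name candidate_config_files is made up, and the output of eb --show-default-configfiles remains the authoritative answer for a given installation.

```python
import os


def candidate_config_files(environ=None):
    """List the config file locations EasyBuild is assumed to consider, system-wide first."""
    env = os.environ if environ is None else environ
    home = env.get("HOME", os.path.expanduser("~"))
    # Assumption: defaults of /etc and ~/.config when the XDG variables are unset
    config_dirs = env.get("XDG_CONFIG_DIRS") or "/etc"
    config_home = env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    # System-wide: every *.cfg file in <dir>/easybuild.d, returned here as glob patterns
    patterns = [
        os.path.join(directory, "easybuild.d", "*.cfg")
        for directory in config_dirs.split(os.pathsep)
        if directory
    ]
    # Per-user: a single config.cfg file
    patterns.append(os.path.join(config_home, "easybuild", "config.cfg"))
    return patterns
```

If that assumption holds, dropping a .cfg file into /etc/easybuild.d would give the site-wide settings without touching /etc/profile.d.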


Re: [easybuild] Beginner questions

2016-03-11 Thread André Gemünd
Thanks for the input everyone! It will probably take me some time digesting and 
trying out these ideas.

Cheers
-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend


Re: [easybuild] Beginner questions

2016-03-11 Thread André Gemünd
Dear Alan, 

thanks for the feedback!

> We use a module file to control the settings you mention; it simply sets the
> equivalent EASYBUILD_* variables for the settings you require. Create the
> module and put it somewhere accessible via the MODULEPATH (you may want a
> separate protected directory for people who use this module).

Ah, yes, an obvious solution that I somehow didn't think of, because there 
already is an EasyBuild module created by the bootstrap script and I didn't 
want to modify that in case it gets overwritten in upgrades. But it's a good 
idea, maybe a protected admin module.
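
Such an admin module could be generated along these lines. This is only a sketch: it relies on the convention that an eb option --some-option corresponds to $EASYBUILD_SOME_OPTION and on Lmod's setenv() for Lua modulefiles; render_admin_module and env_var_name are made-up names, and values containing double quotes are not escaped.

```python
def env_var_name(option):
    """Map an eb option name such as 'modules-tool' to 'EASYBUILD_MODULES_TOOL'."""
    return "EASYBUILD_" + option.strip("-").replace("-", "_").upper()


def render_admin_module(settings):
    """Render a Lua modulefile body that sets one EASYBUILD_* variable per setting."""
    lines = ["help([[Site-wide EasyBuild settings for admins]])"]
    # Sort the options so the generated file is stable between runs
    for option, value in sorted(settings.items()):
        lines.append('setenv("%s", "%s")' % (env_var_name(option), value))
    return "\n".join(lines) + "\n"
```

For example, render_admin_module({"prefix": "/home/modules", "modules-tool": "Lmod"}) gives a modulefile body that can be saved in a protected directory on the MODULEPATH and loaded alongside the stock EasyBuild module; the values here are placeholders.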

> There's a later intel-2016b.eb already available which has a 2016 update and
> an intel-2016.02 in the devel branch that has the very latest intel compilers.

True, but we need to offer multiple versions at the same time, currently 
starting at XE 2012. Of the separate point releases (2013, 2015, etc.), we'd 
like to offer the latest minor version, e.g. 2015 upd 6. 

> issue is that you will not find so many easyconfigs for these toolchains yet
> since they are relatively new. The ones you do find are the latest and
> greatest however and you can add software from intel-2015b using
> --try-toolchain=intel,2016b (just beware of the notes in that link!).

Regarding applications, it's okay if we build them with the latest and greatest 
Intel toolchain. Thanks for the hint about --try-toolchain, this seems very 
helpful! 

Greetings
-- 
André Gemünd
Fraunhofer-Institute for Algorithms and Scientific Computing
andre.gemu...@scai.fraunhofer.de
Tel: +49 2241 14-2193
/C=DE/O=Fraunhofer/OU=SCAI/OU=People/CN=Andre Gemuend