Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed on AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen
Dear Kenneth, On 9/28/23 10:49, Kenneth Hoste wrote: Not seeing the problem with OpenBLAS 0.3.23 is encouraging, that probably means a fix is hiding in either OpenBLAS 0.3.22 or 0.3.23 that we may be able to backport to 0.3.21. I don't see anything obvious in the release notes though (see ht

Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed on AMD "Genoa" node

2023-09-28 Thread Kenneth Hoste
Hi Ole, On 28/09/2023 10:45, Ole Holm Nielsen wrote: Dear Kenneth, On 9/28/23 09:42, Kenneth Hoste wrote: I suspect the problem is more with OpenBLAS than GCC. OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, and doesn't try to use AVX-512 instructions there. OpenBLA

Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed on AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen
Dear Kenneth, On 9/28/23 09:42, Kenneth Hoste wrote: I suspect the problem is more with OpenBLAS than GCC. OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, and doesn't try to use AVX-512 instructions there. OpenBLAS 0.3.21 detects Genoa, enables AVX-512, but there's a

Re: [easybuild] OpenMPI-4.1.4-GCC-12.2.0.eb Sanity check failed on AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen
Dear Kenneth, On 9/28/23 10:07, Kenneth Hoste wrote: Unfortunately, building the foss-2022b toolchain exits during the testing phase of OpenMPI-4.1.4-GCC-12.2.0.eb as shown below.  Does anyone have ideas about what might be wrong? ... By default OpenMPI is being configured with "--with-verbs",

Re: [easybuild] OpenMPI-4.1.4-GCC-12.2.0.eb Sanity check failed on AMD "Genoa" node

2023-09-28 Thread Kenneth Hoste
Dear Ole, On 26/09/2023 08:24, Ole Holm Nielsen wrote: I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD EPYC 9124 16-Core Processor with 2 threads/core, 384 GB RAM, Ethernet network only, and AlmaLinux 8.8 OS. Unfortunately, building the foss-2022b toolchain exits during t

Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed on AMD "Genoa" node

2023-09-28 Thread Kenneth Hoste
Dear Ole, I suspect the problem is more with OpenBLAS than GCC. OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, and doesn't try to use AVX-512 instructions there. OpenBLAS 0.3.21 detects Genoa, enables AVX-512, but there's a bug in a kernel being used. I would try a
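
The hypothesis in this message depends on whether the build node's CPU advertises AVX-512 at all. As a minimal sketch (not part of the original thread), the following lists the AVX-512 feature flags the Linux kernel reports. It assumes an x86 Linux host where /proc/cpuinfo has a "flags" line; the function name avx512_flags is hypothetical, and the check says nothing about which kernels OpenBLAS itself selects.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text):
    """Return the set of AVX-512 feature flags found in /proc/cpuinfo-style text.

    Assumption: x86 Linux layout, where each processor block contains a
    line of the form "flags : fpu vme ... avx512f ...".
    Only the first "flags" line is inspected.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return {flag for flag in value.split() if flag.startswith("avx512")}
    return set()


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        found = avx512_flags(cpuinfo.read_text())
        print(sorted(found) if found else "no AVX-512 flags reported")
    else:
        print("/proc/cpuinfo not available on this system")
```

An empty result means the kernel reports no AVX-512 support on that node, in which case an AVX-512 kernel bug would be an unlikely explanation for the failing tests.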

[easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed on AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen
It's interesting that while attempting to build the foss-2022a toolchain instead of foss-2022b, the build of OpenBLAS with GCC 11.3.0 succeeds without errors: == processing EasyBuild easyconfig /home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.