Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed om AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen

Dear Kenneth,

On 9/28/23 10:49, Kenneth Hoste wrote:
Not seeing the problem with OpenBLAS 0.3.23 is encouraging, that probably 
means a fix is hiding in either OpenBLAS 0.3.22 or 0.3.23 that we may be 
able to backport to 0.3.21.


I don't see anything obvious in the release notes though (see 
https://github.com/OpenMathLib/OpenBLAS/releases) at first glance.


Can you try and see if there's a problem with OpenBLAS 0.3.22, by using:

eb --try-software-version 0.3.22 OpenBLAS-0.3.23-GCC-12.3.0.eb

That would help narrow things down (a bit).


That try failed:

$ eb --try-software-version 0.3.22 OpenBLAS-0.3.23-GCC-12.3.0.eb
== Temporary log file in case of crash /tmp/eb-ljmkzhs7/easybuild-l1cmgr72.log
== found valid index for 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs, so using it...
== found valid index for 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs, so using it...
== processing EasyBuild easyconfig 
/tmp/eb-ljmkzhs7/tweaked_easyconfigs/OpenBLAS-0.3.22-GCC-12.3.0.eb

== building and installing OpenBLAS/0.3.22-GCC-12.3.0...
== fetching files...
== ... (took 6 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 49 secs)
== FAILED: Installation ended unsuccessfully (build directory: 
/dev/shm/OpenBLAS/0.3.22/GCC-12.3.0): build failed (first 300 chars): cmd 
" make -j 32 libs netlib shared  BINARY='64'  CC='gcc'  FC='gfortran' 
MAKE_NB_JOBS='-1'  USE_OPENMP='1'  USE_THREAD='1'  CFLAGS='-O2 
-ftree-vectorize -march=native -fno-math-errno' " exited with exit code 2 
and output:
/home/modules/software/binutils/2.40-GCCcore-12.3.0/bin/ld: warning: 
/tmp/eb (took 56 secs)
== Results of the build can be found in the log file(s) 
/tmp/eb-ljmkzhs7/easybuild-OpenBLAS-0.3.22-20230928.104942.hoMjh.log
ERROR: Build of 
/tmp/eb-ljmkzhs7/tweaked_easyconfigs/OpenBLAS-0.3.22-GCC-12.3.0.eb failed 
(err: 'build failed (first 300 chars): cmd " make -j 32 libs netlib shared 
 BINARY=\'64\'  CC=\'gcc\'  FC=\'gfortran\'  MAKE_NB_JOBS=\'-1\' 
USE_OPENMP=\'1\'  USE_THREAD=\'1\'  CFLAGS=\'-O2 -ftree-vectorize 
-march=native -fno-math-errno\' " exited with exit code 2 and 
output:\n/home/modules/software/binutils/2.40-GCCcore-12.3.0/bin/ld: 
warning: /tmp/eb')


The logfile ends with:

== 2023-09-28 10:50:39,240 filetools.py:2012 INFO Removing lock 
/home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.22-GCC-12.3.0.lock...
== 2023-09-28 10:50:39,241 filetools.py:383 INFO Path 
/home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.22-GCC-12.3.0.lock 
successfully removed.
== 2023-09-28 10:50:39,241 filetools.py:2016 INFO Lock removed: 
/home/modules/software/.locks/_home_modules_software_OpenBLAS_0.3.22-GCC-12.3.0.lock
== 2023-09-28 10:50:39,241 easyblock.py:4277 WARNING build failed (first 
300 chars): cmd " make -j 32 libs netlib shared  BINARY='64'  CC='gcc' 
FC='gfortran'  MAKE_NB_JOBS='-1'  USE_OPENMP='1'  USE_THREAD='1' 
CFLAGS='-O2 -ftree-vectorize -march=native -fno-math-errno' " exited with 
exit code 2 and output:

/home/modules/software/binutils/2.40-GCCcore-12.3.0/bin/ld: warning: /tmp/eb
== 2023-09-28 10:50:39,241 easyblock.py:328 INFO Closing log for 
application name OpenBLAS version 0.3.22
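
Since the "first 300 chars" excerpt stops at a linker warning, the actual failure has to be dug out of the full log file. A minimal sketch of one way to do that, assuming the usual markers are present (compiler "error:" lines, make's "***" abort lines, and "[FAIL]" from the OpenBLAS unit tests). The function name show_failures is made up for illustration and is not part of EasyBuild:

```shell
#!/bin/sh
# Print the lines of a build log that usually carry the real failure:
#   - compiler diagnostics containing "error:"
#   - make's abort lines, which contain "*** "
#   - unit test results marked "[FAIL]"
# Each match is prefixed with its line number (grep -n) so the
# surrounding context can be looked up in the full log afterwards.
show_failures() {
    grep -n -E 'error:|\*\*\* |\[FAIL\]' "$1"
}

# Hypothetical usage, with the log path reported by eb:
#   show_failures /tmp/eb-ljmkzhs7/easybuild-OpenBLAS-0.3.22-20230928.104942.hoMjh.log
```

Linker warnings such as the one quoted in the error summary do not match any of these patterns, so they are filtered out.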




Best regards,
Ole


Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed om AMD "Genoa" node

2023-09-28 Thread Kenneth Hoste

Hi Ole,

On 28/09/2023 10:45, Ole Holm Nielsen wrote:

Dear Kenneth,

On 9/28/23 09:42, Kenneth Hoste wrote:

I suspect the problem is more with OpenBLAS than GCC.

OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly 
yet, and doesn't try to use AVX-512 instructions there.


OpenBLAS 0.3.21 detects Genoa, enables AVX-512, but there's a bug in a 
kernel being used.


I would try and see whether you observe any problems with more recent 
OpenBLAS versions, like OpenBLAS-0.3.23-GCC-12.3.0.eb .


That version builds correctly:

$ eb OpenBLAS-0.3.23-GCC-12.3.0.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.23-GCC-12.3.0.eb

== building and installing OpenBLAS/0.3.23-GCC-12.3.0...
== fetching files...
== ... (took 6 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 57 secs)
== testing...
== ... (took 2 mins 34 secs)
== installing...
== ... (took 2 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 42 secs)
== Results of the build can be found in the log file(s) 
/home/modules/software/OpenBLAS/0.3.23-GCC-12.3.0/easybuild/easybuild-OpenBLAS-0.3.23-20230928.103500.log

== Build succeeded for 22 out of 22

If not, we may be able to trace down the fix and patch OpenBLAS 0.3.21 
to fix the problem you're seeing...


So is there any hope that foss-2022b.eb with OpenBLAS/0.3.21-GCC-12.2.0 
can be made to work correctly on AMD Genoa nodes?



Not seeing the problem with OpenBLAS 0.3.23 is encouraging, that 
probably means a fix is hiding in either OpenBLAS 0.3.22 or 0.3.23 that 
we may be able to backport to 0.3.21.


I don't see anything obvious in the release notes though (see 
https://github.com/OpenMathLib/OpenBLAS/releases) at first glance.


Can you try and see if there's a problem with OpenBLAS 0.3.22, by using:

eb --try-software-version 0.3.22 OpenBLAS-0.3.23-GCC-12.3.0.eb

That would help narrow things down (a bit).


regards,

Kenneth



Thanks,
Ole


On 28/09/2023 09:26, Ole Holm Nielsen wrote:
It's interesting that while attempting to build the foss-2022a 
toolchain instead of foss-2022b, the build of OpenBLAS with GCC 
11.3.0 succeeds without errors:


== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.eb

== building and installing OpenBLAS/0.3.20-GCC-11.3.0...
== fetching files...
== ... (took 4 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 56 secs)
== testing...
== ... (took 2 mins 24 secs)
== installing...
== ... (took 1 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 28 secs)

The only difference here appears to be GCC version 12.2.0 versus 11.3.0!

Any ideas about what's causing this error in the tests?

Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions 
in AMD Genoa and has a bug?


Thanks,
Ole


On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD 
EPYC 9124 16-Core Processor with 2 threads/core, 384 GB RAM, and 
AlmaLinux 8.8 OS.


Unfortunately, building the foss-2022b toolchain exits during the 
testing phase of OpenBLAS-0.3.21-GCC-12.2.0.eb as shown below.  Does 
anyone have ideas about what might be wrong?


$ eb foss-2022b.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb

== building and installing OpenBLAS/0.3.21-GCC-12.2.0...
== fetching files...
== ... (took 7 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 53 secs)
== testing...
== ... (took 12 secs)
== FAILED: Installation ended unsuccessfully (build directory: 
/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0): build failed (first 300 
chars): cmd " make tests  BINARY='64'  CC='gcc'  FC='gfortran' 
MAKE_NB_JOBS='-1' USE_OPENMP='1'  USE_THREAD='1' " exited with exit 
code 2 and output:
/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: 
/tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies 
executable stack

/ (took 1 min 14 secs)
== Results of the build can be found in the log file(s) 
/tmp/eb-74m3kzgo/easybuild-OpenBLAS-0.3.21-20230925.161149.UfDUO.log
ERROR: Build of 

Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed om AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen

Dear Kenneth,

On 9/28/23 09:42, Kenneth Hoste wrote:

I suspect the problem is more with OpenBLAS than GCC.

OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, 
and doesn't try to use AVX-512 instructions there.


OpenBLAS 0.3.21 detects Genoa, enables AVX-512, but there's a bug in a 
kernel being used.


I would try and see whether you observe any problems with more recent 
OpenBLAS versions, like OpenBLAS-0.3.23-GCC-12.3.0.eb .


That version builds correctly:

$ eb OpenBLAS-0.3.23-GCC-12.3.0.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.23-GCC-12.3.0.eb

== building and installing OpenBLAS/0.3.23-GCC-12.3.0...
== fetching files...
== ... (took 6 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 57 secs)
== testing...
== ... (took 2 mins 34 secs)
== installing...
== ... (took 2 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 42 secs)
== Results of the build can be found in the log file(s) 
/home/modules/software/OpenBLAS/0.3.23-GCC-12.3.0/easybuild/easybuild-OpenBLAS-0.3.23-20230928.103500.log

== Build succeeded for 22 out of 22

If not, we may be able to trace down the fix and patch OpenBLAS 0.3.21 to 
fix the problem you're seeing...


So is there any hope that foss-2022b.eb with OpenBLAS/0.3.21-GCC-12.2.0 
can be made to work correctly on AMD Genoa nodes?


Thanks,
Ole


On 28/09/2023 09:26, Ole Holm Nielsen wrote:
It's interesting that while attempting to build the foss-2022a toolchain 
instead of foss-2022b, the build of OpenBLAS with GCC 11.3.0 succeeds 
without errors:


== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.eb

== building and installing OpenBLAS/0.3.20-GCC-11.3.0...
== fetching files...
== ... (took 4 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 56 secs)
== testing...
== ... (took 2 mins 24 secs)
== installing...
== ... (took 1 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 28 secs)

The only difference here appears to be GCC version 12.2.0 versus 11.3.0!

Any ideas about what's causing this error in the tests?

Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions in 
AMD Genoa and has a bug?


Thanks,
Ole


On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD 
EPYC 9124 16-Core Processor with 2 threads/core, 384 GB RAM, and 
AlmaLinux 8.8 OS.


Unfortunately, building the foss-2022b toolchain exits during the 
testing phase of OpenBLAS-0.3.21-GCC-12.2.0.eb as shown below.  Does 
anyone have ideas about what might be wrong?


$ eb foss-2022b.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb

== building and installing OpenBLAS/0.3.21-GCC-12.2.0...
== fetching files...
== ... (took 7 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 53 secs)
== testing...
== ... (took 12 secs)
== FAILED: Installation ended unsuccessfully (build directory: 
/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0): build failed (first 300 chars): 
cmd " make tests  BINARY='64'  CC='gcc'  FC='gfortran' 
MAKE_NB_JOBS='-1' USE_OPENMP='1'  USE_THREAD='1' " exited with exit 
code 2 and output:
/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: 
/tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies 
executable stack

/ (took 1 min 14 secs)
== Results of the build can be found in the log file(s) 
/tmp/eb-74m3kzgo/easybuild-OpenBLAS-0.3.21-20230925.161149.UfDUO.log
ERROR: Build of 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb failed (err: 'build failed (first 300 chars): cmd " make tests BINARY=\'64\'  CC=\'gcc\'  FC=\'gfortran\'  MAKE_NB_JOBS=\'-1\' USE_OPENMP=\'1\'  USE_THREAD=\'1\' " exited with exit code 2 and output:\n/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack\n/')



The log file shows an error in test_kernel_regress.c:50:

(lines deleted)
./openblas_utest
TEST 1/37 max:smax_zero [OK]
TEST 2/37 max:dmax_positive [OK]
TEST 3/37 max:smax_negative [OK]
TEST 4/37 

Re: [easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed om AMD "Genoa" node

2023-09-28 Thread Kenneth Hoste

Dear Ole,

I suspect the problem is more with OpenBLAS than GCC.

OpenBLAS 0.3.20 probably doesn't detect AMD Genoa (Zen4) correctly yet, 
and doesn't try to use AVX-512 instructions there.


OpenBLAS 0.3.21 detects Genoa, enables AVX-512, but there's a bug in a 
kernel being used.
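
The hardware side of this can be checked quickly. The following is only a sketch, assuming a Linux node where /proc/cpuinfo is readable; the function name list_avx512_flags is made up for illustration. It shows which AVX-512 extensions the CPU advertises, not which kernels OpenBLAS actually selects:

```shell
#!/bin/sh
# List the distinct AVX-512 feature flags found in a cpuinfo-style file.
# The optional argument makes it possible to run the function on saved
# output from another node; it defaults to the live /proc/cpuinfo.
# LC_ALL=C keeps the sort order independent of the user's locale.
list_avx512_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    grep -o 'avx512[a-z0-9_]*' "$cpuinfo" | LC_ALL=C sort -u
}

# Only query the live system when /proc/cpuinfo exists (i.e. on Linux).
if [ -r /proc/cpuinfo ]; then
    flags=$(list_avx512_flags)
    if [ -n "$flags" ]; then
        echo "$flags"
    else
        echo "no AVX-512 flags reported"
    fi
fi
```

If the node reports flags such as avx512f, the CPU supports AVX-512 and a library that recognizes the CPU may take those code paths.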


I would try and see whether you observe any problems with more recent 
OpenBLAS versions, like OpenBLAS-0.3.23-GCC-12.3.0.eb .


If not, we may be able to trace down the fix and patch OpenBLAS 0.3.21 
to fix the problem you're seeing...



regards,

Kenneth

On 28/09/2023 09:26, Ole Holm Nielsen wrote:
It's interesting that while attempting to build the foss-2022a toolchain 
instead of foss-2022b, the build of OpenBLAS with GCC 11.3.0 succeeds 
without errors:


== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.eb

== building and installing OpenBLAS/0.3.20-GCC-11.3.0...
== fetching files...
== ... (took 4 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 56 secs)
== testing...
== ... (took 2 mins 24 secs)
== installing...
== ... (took 1 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 28 secs)

The only difference here appears to be GCC version 12.2.0 versus 11.3.0!

Any ideas about what's causing this error in the tests?

Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions in 
AMD Genoa and has a bug?


Thanks,
Ole


On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD 
EPYC 9124 16-Core Processor with 2 threads/core, 384 GB RAM, and 
AlmaLinux 8.8 OS.


Unfortunately, building the foss-2022b toolchain exits during the 
testing phase of OpenBLAS-0.3.21-GCC-12.2.0.eb as shown below.  Does 
anyone have ideas about what might be wrong?


$ eb foss-2022b.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb

== building and installing OpenBLAS/0.3.21-GCC-12.2.0...
== fetching files...
== ... (took 7 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 53 secs)
== testing...
== ... (took 12 secs)
== FAILED: Installation ended unsuccessfully (build directory: 
/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0): build failed (first 300 chars): 
cmd " make tests  BINARY='64'  CC='gcc'  FC='gfortran'  
MAKE_NB_JOBS='-1' USE_OPENMP='1'  USE_THREAD='1' " exited with exit 
code 2 and output:
/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: 
/tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies 
executable stack

/ (took 1 min 14 secs)
== Results of the build can be found in the log file(s) 
/tmp/eb-74m3kzgo/easybuild-OpenBLAS-0.3.21-20230925.161149.UfDUO.log
ERROR: Build of 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb failed (err: 'build failed (first 300 chars): cmd " make tests BINARY=\'64\'  CC=\'gcc\'  FC=\'gfortran\'  MAKE_NB_JOBS=\'-1\' USE_OPENMP=\'1\'  USE_THREAD=\'1\' " exited with exit code 2 and output:\n/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack\n/')



The log file shows an error in test_kernel_regress.c:50:

(lines deleted)
./openblas_utest
TEST 1/37 max:smax_zero [OK]
TEST 2/37 max:dmax_positive [OK]
TEST 3/37 max:smax_negative [OK]
TEST 4/37 min:smin_zero [OK]
TEST 5/37 min:dmin_positive [OK]
TEST 6/37 min:smin_negative [OK]
TEST 7/37 amax:damax [OK]
TEST 8/37 amax:samax [OK]
TEST 9/37 ismax:negative_step_2 [OK]
TEST 10/37 ismax:positive_step_2 [OK]
TEST 11/37 ismin:negative_step_2 [OK]
TEST 12/37 ismin:positive_step_2 [OK]
TEST 13/37 drotmg:drotmg_D1_big_D2_big_flag_zero [OK]
TEST 14/37 drotmg:rotmg_D1eqD2_X1eqX2 [OK]
TEST 15/37 drotmg:rotmg_issue1452 [OK]
TEST 16/37 drotmg:rotmg [OK]
TEST 17/37 axpy:caxpy_inc_0 [OK]
TEST 18/37 axpy:saxpy_inc_0 [OK]
TEST 19/37 axpy:zaxpy_inc_0 [OK]
TEST 20/37 axpy:daxpy_inc_0 [OK]
TEST 21/37 zdotu:zdotu_offset_1 [OK]
TEST 22/37 zdotu:zdotu_n_1 [OK]
TEST 23/37 dsdot:dsdot_n_1 [OK]
TEST 24/37 swap:cswap_inc_0 [OK]
TEST 25/37 swap:sswap_inc_0 [OK]
TEST 26/37 swap:zswap_inc_0 [OK]
TEST 27/37 swap:dswap_inc_0 [OK]
TEST 28/37 rot:csrot_inc_0 [OK]
TEST 29/37 rot:srot_inc_0 [OK]
TEST 30/37 rot:zdrot_inc_0 [OK]
TEST 31/37 rot:drot_inc_0 [OK]
TEST 32/37 dnrm2:dnrm2_tiny [OK]
TEST 33/37 dnrm2:dnrm2_inf [OK]
TEST 34/37 potrf:smoketest_trivial [OK]
TEST 35/37 potrf:bug_695 [OK]
TEST 36/37 kernel_regress:skx_avx [FAIL]
   ERR: test_kernel_regress.c:50  expected 

[easybuild] Re: OpenBLAS-0.3.21-GCC-12.2.0.eb testing failed om AMD "Genoa" node

2023-09-28 Thread Ole Holm Nielsen
It's interesting that while attempting to build the foss-2022a toolchain 
instead of foss-2022b, the build of OpenBLAS with GCC 11.3.0 succeeds 
without errors:


== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.20-GCC-11.3.0.eb

== building and installing OpenBLAS/0.3.20-GCC-11.3.0...
== fetching files...
== ... (took 4 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 56 secs)
== testing...
== ... (took 2 mins 24 secs)
== installing...
== ... (took 1 secs)
== taking care of extensions...
== restore after iterating...
== postprocessing...
== sanity checking...
== cleaning up...
== creating module...
== permissions...
== packaging...
== COMPLETED: Installation ended successfully (took 3 mins 28 secs)

The only difference here appears to be GCC version 12.2.0 versus 11.3.0!

Any ideas about what's causing this error in the tests?

Perhaps GCC version 12.2.0 tries to use the new AVX-512 instructions in 
AMD Genoa and has a bug?


Thanks,
Ole


On 9/26/23 08:04, Ole Holm Nielsen wrote:
I'm starting EasyBuild up on our new AMD "Genoa" platform with 1 AMD EPYC 
9124 16-Core Processor with 2 threads/core, 384 GB RAM, and AlmaLinux 8.8 OS.


Unfortunately, building the foss-2022b toolchain exits during the testing 
phase of OpenBLAS-0.3.21-GCC-12.2.0.eb as shown below.  Does anyone have 
ideas about what might be wrong?


$ eb foss-2022b.eb -r
(lines deleted)
== processing EasyBuild easyconfig 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb

== building and installing OpenBLAS/0.3.21-GCC-12.2.0...
== fetching files...
== ... (took 7 secs)
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== ... (took 53 secs)
== testing...
== ... (took 12 secs)
== FAILED: Installation ended unsuccessfully (build directory: 
/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0): build failed (first 300 chars): cmd 
" make tests  BINARY='64'  CC='gcc'  FC='gfortran'  MAKE_NB_JOBS='-1' 
USE_OPENMP='1'  USE_THREAD='1' " exited with exit code 2 and output:
/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: 
/tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies 
executable stack

/ (took 1 min 14 secs)
== Results of the build can be found in the log file(s) 
/tmp/eb-74m3kzgo/easybuild-OpenBLAS-0.3.21-20230925.161149.UfDUO.log
ERROR: Build of 
/home/modules/software/EasyBuild/4.8.1/easybuild/easyconfigs/o/OpenBLAS/OpenBLAS-0.3.21-GCC-12.2.0.eb failed (err: 'build failed (first 300 chars): cmd " make tests BINARY=\'64\'  CC=\'gcc\'  FC=\'gfortran\'  MAKE_NB_JOBS=\'-1\' USE_OPENMP=\'1\'  USE_THREAD=\'1\' " exited with exit code 2 and output:\n/home/modules/software/binutils/2.39-GCCcore-12.2.0/bin/ld: warning: /tmp/eb-74m3kzgo/ccy1Gkzg.o: missing .note.GNU-stack section implies executable stack\n/')



The log file shows an error in test_kernel_regress.c:50:

(lines deleted)
./openblas_utest
TEST 1/37 max:smax_zero [OK]
TEST 2/37 max:dmax_positive [OK]
TEST 3/37 max:smax_negative [OK]
TEST 4/37 min:smin_zero [OK]
TEST 5/37 min:dmin_positive [OK]
TEST 6/37 min:smin_negative [OK]
TEST 7/37 amax:damax [OK]
TEST 8/37 amax:samax [OK]
TEST 9/37 ismax:negative_step_2 [OK]
TEST 10/37 ismax:positive_step_2 [OK]
TEST 11/37 ismin:negative_step_2 [OK]
TEST 12/37 ismin:positive_step_2 [OK]
TEST 13/37 drotmg:drotmg_D1_big_D2_big_flag_zero [OK]
TEST 14/37 drotmg:rotmg_D1eqD2_X1eqX2 [OK]
TEST 15/37 drotmg:rotmg_issue1452 [OK]
TEST 16/37 drotmg:rotmg [OK]
TEST 17/37 axpy:caxpy_inc_0 [OK]
TEST 18/37 axpy:saxpy_inc_0 [OK]
TEST 19/37 axpy:zaxpy_inc_0 [OK]
TEST 20/37 axpy:daxpy_inc_0 [OK]
TEST 21/37 zdotu:zdotu_offset_1 [OK]
TEST 22/37 zdotu:zdotu_n_1 [OK]
TEST 23/37 dsdot:dsdot_n_1 [OK]
TEST 24/37 swap:cswap_inc_0 [OK]
TEST 25/37 swap:sswap_inc_0 [OK]
TEST 26/37 swap:zswap_inc_0 [OK]
TEST 27/37 swap:dswap_inc_0 [OK]
TEST 28/37 rot:csrot_inc_0 [OK]
TEST 29/37 rot:srot_inc_0 [OK]
TEST 30/37 rot:zdrot_inc_0 [OK]
TEST 31/37 rot:drot_inc_0 [OK]
TEST 32/37 dnrm2:dnrm2_tiny [OK]
TEST 33/37 dnrm2:dnrm2_inf [OK]
TEST 34/37 potrf:smoketest_trivial [OK]
TEST 35/37 potrf:bug_695 [OK]
TEST 36/37 kernel_regress:skx_avx [FAIL]
   ERR: test_kernel_regress.c:50  expected 0.000e+00, got 6.734e+01 (diff 
-6.734e+01, tol 1.000e-10)

TEST 37/37 fork:safety_after_fork_in_parent [OK]
RESULTS: 37 tests (36 ok, 1 failed, 0 skipped) ran in 3 ms
make[1]: *** [Makefile:52: run_test] Error 1
make[1]: Leaving directory 
'/dev/shm/OpenBLAS/0.3.21/GCC-12.2.0/OpenBLAS-0.3.21/utest'

make: *** [Makefile:150: tests] Error 2
  (at easybuild/tools/run.py:681 in parse_cmd_output)
== 2023-09-25 16:13:04,292 build_log.py:267 INFO ... (took 12 secs)
== 2023-09-25 16:13:04,292 filetools.py:2012 INFO Removing lock