We're seeing test failures on an AMD EPYC 7402 (Rome) server running CentOS 7.9 when building SciPy-bundle-2020.11-intel-2020b.eb: the NumPy 1.19.4 extension fails its test suite. The corresponding foss-2020b build works without problems.

I realize that we're using the Intel compilers on an AMD Rome platform, and that dragons may be lurking! So I wonder if other sites with AMD Rome nodes have seen this problem and perhaps found a fix?

This is what we see (SciPy-bundle gets built here while resolving the dependencies of GPAW):

$ eb GPAW-21.1.0-intel-2020b-ASE-3.21.1.eb -r
== Temporary log file in case of crash /tmp/eb-1cYcYH/easybuild-g6UgUd.log
== found valid index for /home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs, so using it...
== resolving dependencies ...
== processing EasyBuild easyconfig /home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs/s/SciPy-bundle/SciPy-bundle-2020.11-intel-2020b.eb
== building and installing SciPy-bundle/2020.11-intel-2020b...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== testing...
== installing...
== taking care of extensions...
== installing extension numpy 1.19.4 (1/8)...
==      configuring...
==      building...
==      testing...
== FAILED: Installation ended unsuccessfully (build directory: /home/modules/build/SciPybundle/2020.11/intel-2020b): build failed (first 300 chars): cmd "export PYTHONPATH=/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages:$PYTHONPATH && unset LDFLAGS && cd .. && /home/modules/software/Python/3.8.6-GCCcore-10.2.0/bin/python -c 'import sys; import numpy; sys.exit(not numpy.test(verbose=2))' " exited with exit code 1 and output:
NumPy version 1 (took 3 min 58 sec)
== Results of the build can be found in the log file(s) /tmp/eb-1cYcYH/easybuild-SciPy-bundle-2020.11-20210429.214727.rHXha.log
ERROR: Build of /home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs/s/SciPy-bundle/SciPy-bundle-2020.11-intel-2020b.eb failed (err: 'build failed (first 300 chars): cmd "export PYTHONPATH=/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages:$PYTHONPATH && unset LDFLAGS && cd .. && /home/modules/software/Python/3.8.6-GCCcore-10.2.0/bin/python -c \'import sys; import numpy; sys.exit(not numpy.test(verbose=2))\' " exited with exit code 1 and output:\nNumPy version 1')


The EB log file contains some errors about NaNs:

$ grep '^E' /tmp/eb-1cYcYH/easybuild-g6UgUd.log
E           AssertionError:
E           Arrays are not almost equal to 6 decimals
E
E           x and y nan location mismatch:
E            x: array([2., 1.], dtype=float32)
E            y: array([nan, nan], dtype=float32)
E               AssertionError:
E               Arrays are not almost equal to 6 decimals
E                ACTUAL: array([2.+1.j, 1.+2.j], dtype=complex64)
E                DESIRED: array([nan+nanj, nan+nanj], dtype=complex64)
E               AssertionError: In test case: <LinalgCase: csingle>
E
E               Traceback (most recent call last):
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 572, in assert_almost_equal
E                   assert_almost_equal(actualr, desiredr, decimal=decimal)
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 579, in assert_almost_equal
E                   return assert_array_almost_equal(actual, desired, decimal, err_msg)
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1042, in assert_array_almost_equal
E                   assert_array_compare(compare, x, y, err_msg=err_msg, verbose=verbose,
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 764, in assert_array_compare
E                   flagged = func_assert_same_pos(x, y, func=isnan, hasval='nan')
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 740, in func_assert_same_pos
E                   raise AssertionError(msg)
E               AssertionError:
E               Arrays are not almost equal to 6 decimals
E
E               x and y nan location mismatch:
E                x: array([2., 1.], dtype=float32)
E                y: array([nan, nan], dtype=float32)
E
E               During handling of the above exception, another exception occurred:
E
E               Traceback (most recent call last):
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 350, in check_cases
E                   case.check(self.do)
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 85, in check
E                   do(self.a, self.b, tags=self.tags)
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 460, in do
E                   assert_almost_equal(b, dot_generalized(a, x))
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 41, in assert_almost_equal
E                   old_assert_almost_equal(a, b, decimal=decimal, **kw)
E                 File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 575, in assert_almost_equal
E                   raise AssertionError(_build_err_msg())
E               AssertionError:
E               Arrays are not almost equal to 6 decimals
E                ACTUAL: array([2.+1.j, 1.+2.j], dtype=complex64)
E                DESIRED: array([nan+nanj, nan+nanj], dtype=complex64)
E       AssertionError:
E       Arrays are not almost equal to 2 decimals
E        ACTUAL: nan
E        DESIRED: 12.65

I can provide the complete EB log if that would help.
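
For anyone who wants to compare on their own Rome nodes without running the whole NumPy test suite, a minimal sketch along the following lines might be enough to show whether single-precision linalg solves return NaNs. The failing traceback above goes through the "csingle" solve test, so the sketch solves one small real and one small complex system in single precision. The input matrices and the helper names (CASES, check_solve, run_checks) are illustrative assumptions, not copied from the NumPy tests; only the right-hand side of the complex case is chosen to match the ACTUAL array in the log.

```python
import numpy as np

# Illustrative inputs (assumptions, not taken from the log): small,
# non-singular 2x2 systems in single precision, real and complex.
CASES = {
    "single": (
        np.array([[1, 2], [3, 4]], dtype=np.float32),
        np.array([2, 1], dtype=np.float32),
    ),
    "csingle": (
        np.array([[1 + 2j, 2 + 3j], [3 + 4j, 4 + 5j]], dtype=np.complex64),
        np.array([2 + 1j, 1 + 2j], dtype=np.complex64),
    ),
}


def check_solve(a, b):
    """Solve a @ x = b and report NaNs and the largest residual."""
    x = np.linalg.solve(a, b)
    residual = a @ x - b
    return {
        "has_nan": bool(np.isnan(x).any()),
        "max_residual": float(np.abs(residual).max()),
    }


def run_checks():
    """Run every case and return a dict of per-case results."""
    return {name: check_solve(a, b) for name, (a, b) in CASES.items()}


if __name__ == "__main__":
    # Print which BLAS/LAPACK this NumPy was built against, then the results.
    np.show_config()
    for name, result in run_checks().items():
        print(name, result)
```

On a healthy build I would expect has_nan to be False and the residual to be tiny for both cases. Running it with the NumPy from the intel-2020b build and again with the one from foss-2020b should show whether the NaNs come from the linear algebra backend rather than from the test harness.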

Thanks,
Ole

--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark
