We're seeing test failures on an AMD EPYC 7402 (Rome) server running
CentOS 7.9 when building SciPy-bundle-2020.11-intel-2020b.eb. The
corresponding foss-2020b build works without problems.
I realize that we're using the Intel compilers on an AMD Rome platform,
and that dragons may be lurking! So I wonder if other sites with AMD
Rome nodes have seen this problem and perhaps found a fix?
This is what we see when SciPy-bundle gets built as a dependency of GPAW:
$ eb GPAW-21.1.0-intel-2020b-ASE-3.21.1.eb -r
== Temporary log file in case of crash /tmp/eb-1cYcYH/easybuild-g6UgUd.log
== found valid index for
/home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs, so using it...
== found valid index for
/home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs, so using it...
== resolving dependencies ...
== processing EasyBuild easyconfig
/home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs/s/SciPy-bundle/SciPy-bundle-2020.11-intel-2020b.eb
== building and installing SciPy-bundle/2020.11-intel-2020b...
== fetching files...
== creating build dir, resetting environment...
== unpacking...
== patching...
== preparing...
== configuring...
== building...
== testing...
== installing...
== taking care of extensions...
== installing extension numpy 1.19.4 (1/8)...
== configuring...
== building...
== testing...
== FAILED: Installation ended unsuccessfully (build directory:
/home/modules/build/SciPybundle/2020.11/intel-2020b): build failed
(first 300 chars): cmd "export
PYTHONPATH=/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages:$PYTHONPATH
&& unset LDFLAGS && cd .. &&
/home/modules/software/Python/3.8.6-GCCcore-10.2.0/bin/python -c 'import
sys; import numpy; sys.exit(not numpy.test(verbose=2))' " exited with
exit code 1 and output:
NumPy version 1 (took 3 min 58 sec)
== Results of the build can be found in the log file(s)
/tmp/eb-1cYcYH/easybuild-SciPy-bundle-2020.11-20210429.214727.rHXha.log
ERROR: Build of
/home/modules/software/EasyBuild/4.3.4/easybuild/easyconfigs/s/SciPy-bundle/SciPy-bundle-2020.11-intel-2020b.eb
failed (err: 'build failed (first 300 chars): cmd "export
PYTHONPATH=/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages:$PYTHONPATH
&& unset LDFLAGS && cd .. &&
/home/modules/software/Python/3.8.6-GCCcore-10.2.0/bin/python -c
\'import sys; import numpy; sys.exit(not numpy.test(verbose=2))\' "
exited with exit code 1 and output:\nNumPy version 1')
The EB log file contains assertion errors involving NaNs, for example in
the float32 and complex64 linalg test cases:
$ grep '^E' /tmp/eb-1cYcYH/easybuild-g6UgUd.log
E AssertionError:
E Arrays are not almost equal to 6 decimals
E
E x and y nan location mismatch:
E x: array([2., 1.], dtype=float32)
E y: array([nan, nan], dtype=float32)
E AssertionError:
E Arrays are not almost equal to 6 decimals
E ACTUAL: array([2.+1.j, 1.+2.j], dtype=complex64)
E DESIRED: array([nan+nanj, nan+nanj], dtype=complex64)
E AssertionError: In test case: <LinalgCase: csingle>
E
E Traceback (most recent call last):
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 572, in assert_almost_equal
E assert_almost_equal(actualr, desiredr, decimal=decimal)
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 579, in assert_almost_equal
E return assert_array_almost_equal(actual, desired, decimal, err_msg)
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 1042, in assert_array_almost_equal
E assert_array_compare(compare, x, y, err_msg=err_msg, verbose=verbose,
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 764, in assert_array_compare
E flagged = func_assert_same_pos(x, y, func=isnan, hasval='nan')
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 740, in func_assert_same_pos
E raise AssertionError(msg)
E AssertionError:
E Arrays are not almost equal to 6 decimals
E
E x and y nan location mismatch:
E x: array([2., 1.], dtype=float32)
E y: array([nan, nan], dtype=float32)
E
E During handling of the above exception, another exception occurred:
E
E Traceback (most recent call last):
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 350, in check_cases
E case.check(self.do)
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 85, in check
E do(self.a, self.b, tags=self.tags)
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 460, in do
E assert_almost_equal(b, dot_generalized(a, x))
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/linalg/tests/test_linalg.py", line 41, in assert_almost_equal
E old_assert_almost_equal(a, b, decimal=decimal, **kw)
E File "/tmp/eb-1cYcYH/tmpkHZver/lib/python3.8/site-packages/numpy/testing/_private/utils.py", line 575, in assert_almost_equal
E raise AssertionError(_build_err_msg())
E AssertionError:
E Arrays are not almost equal to 6 decimals
E ACTUAL: array([2.+1.j, 1.+2.j], dtype=complex64)
E DESIRED: array([nan+nanj, nan+nanj], dtype=complex64)
E AssertionError:
E Arrays are not almost equal to 2 decimals
E ACTUAL: nan
E DESIRED: 12.65
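In case it helps others compare against their own nodes, the NaN results
can probably be checked outside the full 4-minute test run with a few lines
of Python. This is only a rough sketch under assumptions: the traceback
suggests, but does not prove, that the failing call is numpy.linalg.solve;
the right-hand sides are the ones shown in the log, while the coefficient
matrices and the helper name check_solve are made up for illustration.

```python
import numpy as np


def check_solve(dtype):
    """Solve a small 2x2 system in `dtype` and report whether NaNs appear."""
    if np.issubdtype(dtype, np.complexfloating):
        # Coefficient matrix is an assumption; right-hand side is from the log.
        a = np.array([[1 + 2j, 2 + 3j], [3 + 4j, 4 + 5j]], dtype=dtype)
        b = np.array([2 + 1j, 1 + 2j], dtype=dtype)
    else:
        # Coefficient matrix is an assumption; right-hand side is from the log.
        a = np.array([[1, 2], [3, 4]], dtype=dtype)
        b = np.array([2, 1], dtype=dtype)
    x = np.linalg.solve(a, b)
    return {
        "dtype": np.dtype(dtype).name,
        "has_nan": bool(np.isnan(x).any()),
        # Does a @ x reproduce b within a loose single-precision tolerance?
        "residual_ok": bool(np.allclose(a @ x, b, atol=1e-4)),
    }


if __name__ == "__main__":
    # Double-precision types are included only for comparison.
    for dt in (np.float32, np.complex64, np.float64, np.complex128):
        print(check_solve(dt))
```

If the float32 and complex64 rows report has_nan as True while the
float64 and complex128 rows do not, that would point towards the
single-precision routines of whatever LAPACK library numpy was linked
against, rather than towards numpy itself.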
I can provide the complete EB log if anyone would like to see it.
Thanks,
Ole
--
Ole Holm Nielsen
PhD, Senior HPC Officer
Department of Physics, Technical University of Denmark