Hello Gabriele,

Thank you for contributing these.

The test suites are quick-running parfiles with small grids, so running
them on large numbers of MPI ranks (they are designed for 1 or 2 MPI
ranks) can lead to unexpected situations, such as an MPI rank having no
grid points at all. Generally, if the tests pass with 1, 2, or 4 ranks
(4 being the largest number of processes requested by any test.ccl
file), that is sufficient. In principle even larger rank counts should
work, so if you know which tests fail with more MPI ranks and were to
list them in a ticket, maybe someone could look into this. Note that
you can undersubscribe compute nodes, in particular for tests, if you
do not need or want to use all cores.

Can you create a pull request for the "linux" architecture file with
the changes for the AMD compiler you found, please? So far it seems you
only changed the detection part; does it then not also require some
changes in the "set values" part of the file, e.g. default values for
optimization, the preprocessor, or the like?

Yours,
Roland

> Hello,
>
> Two days ago, I opened a PR to the simfactory repo to add Expanse,
> the newest machine at the San Diego Supercomputing Center, based on
> AMD Epyc "Rome" CPUs and part of XSEDE. In the meantime, I realized
> that some tests are failing miserably, but I couldn't figure out why.
>
> Before I describe what I found, let me start with a side note on AMD
> compilers.
>
> <side note>
>
> There are four compilers available on Expanse: GNU, Intel, AMD, and
> PGI. I did not touch the PGI compilers. I briefly tried (and failed)
> to compile with the AMD compilers (aocc and flang). I did not try
> hard, and it seems that most of the libraries on Expanse are compiled
> with gcc anyway.
>
> A first step to support these compilers is adding the lines
>
>   elif test "`$F90 --version 2>&1 | grep AMD`" ; then
>     LINUX_F90_COMP=AMD
>   else
>
>   elif test "`$CC --version 2>&1 | grep AMD`" ; then
>     LINUX_C_COMP=AMD
>   fi
>
>   elif test "`$CXX --version 2>&1 | grep AMD`" ; then
>     LINUX_CXX_COMP=AMD
>   fi
>
> in the obvious places in flesh/lib/make/known-architectures/linux.
>
> </side note>
>
> I successfully compiled the Einstein Toolkit with
> - gcc 10.2.0 and OpenMPI 4.0.4
> - gcc 9.2.0 and OpenMPI 4.0.4
> - Intel 2019 and Intel MPI 2019
>
> I noticed that some tests, like ADMMass/tov_carpet.par, gave
> completely incorrect results. For example, the expected value is 1.3,
> but I would find 1.6.
>
> I disabled all optimizations, but the test kept failing. In the end,
> I noticed that if I ran with 8/16/32 MPI processes per node and the
> corresponding number of OpenMP threads (128/N_MPI), the test would
> fail, but if I ran with 4/2/1 MPI processes, the test would pass.
>
> Most of my experiments were with gcc 10, but the test also fails with
> the Intel suite.
>
> I tried increasing OMP_STACKSIZE to a very large value, but it didn't
> help.
>
> Any idea what the problem might be?
>
> Gabriele

--
My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu .
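
PS: If it helps with pinning down the rank counts: here is an untested
sketch of running the test suites with a small, fixed number of MPI
ranks through simfactory (I am quoting the option names from memory,
and "expanse" assumes your new machine entry, so please check
"sim create-run --help" first):

  # 8 cores in total with 2 OpenMP threads per rank -> 4 MPI ranks,
  # undersubscribing a 128-core Expanse node
  ./simfactory/bin/sim create-run expanse-tests \
      --machine expanse \
      --testsuite \
      --procs 8 \
      --num-threads 2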
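
PS: To spell out what I mean by the "set values" part: further down in
that file there are per-compiler blocks that set default flags. Here is
an untested sketch of what an AMD branch for the Fortran compiler could
look like (the flag values are my guesses for aocc/flang, which are
clang-based, and the variable names are modeled on the other branches,
so treat this only as a starting point):

  AMD)
    # clang-style flags; verify against "flang --help"
    F90_DEBUG_FLAGS='-g'
    F90_OPTIMISE_FLAGS='-O2'
    F90_WARN_FLAGS='-Wall'
    F90_OPENMP_FLAGS='-fopenmp'
    ;;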
