Hello,

Some network configurations were recently changed on SDSC's Expanse, and I wanted to update the Simfactory entry to add an env variable (as recommended by XSEDE's help desk).
I did so, ran the tests, and found numerous failures. According to an expanse_2_64.log file I have on my computer, these tests did not fail in the past. The tests fail only with 2 MPI processes.

An example of a test that fails is rnsA2, and this is the tail of the log file:

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 2 processes
WARNING level 0 from host exp-4-26.expanse.sdsc.edu process 0
  in thorn Carpet, file /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:148:
  -> The environment variable CACTUS_NUM_PROCS is set to 1, but there are 2 MPI processes. This may indicate a severe problem with the MPI startup mechanism.
Rank 0 with PID 1194507 received signal 6
cactus_sim: /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/helpers.cc:275: int Carpet::Abort(const cGH*, int): Assertion `0' failed.
Writing backtrace to rnsA2/backtrace.0.txt
Rank 1 with PID 1194508 received signal 6
Writing backtrace to rnsA2/backtrace.1.txt
srun: error: exp-4-26: tasks 0-1: Aborted (core dumped)
IBRUN: launch command: srun -n 2 --ntasks-per-node 2 /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/SIMFACTORY/exe/cactus_sim -L 3 /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/output-0000/arrangements/EinsteinInitialData/Hydro_RNSID/test/rnsA2.par
IBRUN: MPI job exited with code: 134

Other tests behave similarly, e.g. Vaidya2:

INFO (Carpet): MPI is enabled
INFO (Carpet): Carpet is running on 2 processes
WARNING level 0 from host exp-4-26.expanse.sdsc.edu process 0
  in thorn Carpet, file /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/SetupGH.cc:148:
  -> The environment variable CACTUS_NUM_PROCS is set to 1, but there are 2 MPI processes. This may indicate a severe problem with the MPI startup mechanism.
Rank 0 with PID 1183519 received signal 6
Writing backtrace to Vaidya2/backtrace.0.txt
cactus_sim: /home/sbozzolo/Cactus/arrangements/Carpet/Carpet/src/helpers.cc:275: int Carpet::Abort(const cGH*, int): Assertion `0' failed.
Rank 1 with PID 1183520 received signal 6
Writing backtrace to Vaidya2/backtrace.1.txt
srun: error: exp-4-26: tasks 0-1: Aborted (core dumped)
IBRUN: launch command: srun -n 2 --ntasks-per-node 2 /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/SIMFACTORY/exe/cactus_sim -L 3 /expanse/lustre/projects/uic383/sbozzolo/ettests_2proc/output-0000/arrangements/EinsteinExact/EinsteinExact_Test/test/Vaidya2.par
IBRUN: MPI job exited with code: 134

Since the testsuite_results repo shows the same tests failing (as run by Roland), I can rule out the new env variable I added as the cause of the failures.

Any idea what is going on?

Thanks,
Gabriele
