Hi Loris,

On 10/09/2021 13:09, Loris Bennett wrote:
> Hi,
>
> When building
>
>   impi/2021.2.0-intel-compilers-2021.2.0
>
> with EB 4.4.2 I am getting the following error
>
> == 2021-09-10 11:30:42,184 run.py:233 INFO running cmd: mpirun -n 40 /trinity/shared/easybuild/build/impi/2021.2.0/intel-compilers-2021.2.0/mpi_test
> == 2021-09-10 11:30:43,012 run.py:635 INFO parse_log_for_error msg: Command used: mpirun -n 40 /trinity/shared/easybuild/build/impi/2021.2.0/intel-compilers-2021.2.0/mpi_test
> == 2021-09-10 11:30:43,013 run.py:637 INFO parse_log_for_error (some may be harmless) regExp (?<![(,-]|\w)(?:error|segmentation fault|failed)(?![(,-]|\.?\w) found:
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> Abort(1615503) on node 17 (rank 17 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
> create_endpoint(2284)........: OFI endpoint open failed (ofi_init.c:2284:create_endpoint:Invalid argument)
> == 2021-09-10 11:30:43,014 run.py:594 WARNING Found 6 errors in command output (output: admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context command failed: Cannot allocate memory
> Abort(1615503) on node 17 (rank 17 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
> create_endpoint(2284)........: OFI endpoint open failed (ofi_init.c:2284:create_endpoint:Invalid argument))
>
> Any ideas what might be going wrong?

In what type of environment are you running this? A Slurm job with restricted available memory?

I assume the system you're seeing this on has 40 cores (based on the "mpirun -n 40")?

Can you try using "eb --parallel 10" or "eb --parallel 2" to restrict the number of MPI processes it's starting for the test, and see if that helps?

regards,

Kenneth

>
> Cheers,
>
> Loris
>
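For reference, the checks asked about in the reply could be gathered with something like the sketch below. It only prints information and the candidate `eb` commands; it does not start a build. The easyconfig file name is an assumption derived from the module name in the thread, and the `ulimit` values are included only as possibly relevant to the "Cannot allocate memory" messages, not as a confirmed cause.

```shell
#!/bin/bash
# Report the details asked about in the reply: core count, Slurm context,
# and the memory limits of the current shell.
describe_env() {
    echo "cores: $(getconf _NPROCESSORS_ONLN)"
    # SLURM_JOB_ID is set by Slurm inside a job; "none" means not in a job.
    echo "slurm job: ${SLURM_JOB_ID:-none}"
    echo "max locked memory (ulimit -l): $(ulimit -l)"
    echo "max virtual memory (ulimit -v): $(ulimit -v)"
}

# Print (not run) the eb command for a given --parallel value.
# ASSUMPTION: the easyconfig file name follows the module name in the thread.
eb_cmd() {
    echo "eb impi-2021.2.0-intel-compilers-2021.2.0.eb --parallel $1"
}

describe_env
# The two values suggested in the reply.
for n in 10 2; do
    eb_cmd "$n"
done
```

If the test passes with the lower value but not with 40 processes, that would point towards a per-node resource limit rather than a broken installation.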

