Hi Loris,

On 10/09/2021 13:09, Loris Bennett wrote:
> Hi,
>
> When building
>   
>    impi/2021.2.0-intel-compilers-2021.2.0
>
> with EB 4.4.2 I am getting the following error
>
>    == 2021-09-10 11:30:42,184 run.py:233 INFO running cmd: mpirun -n 40 
> /trinity/shared/easybuild/build/impi/2021.2.0/intel-compilers-2021.2.0/mpi_test
>    == 2021-09-10 11:30:43,012 run.py:635 INFO parse_log_for_error msg: 
> Command used: mpirun -n 40 
> /trinity/shared/easybuild/build/impi/2021.2.0/intel-compilers-2021.2.0/mpi_test
>    == 2021-09-10 11:30:43,013 run.py:637 INFO parse_log_for_error (some may 
> be harmless) regExp (?<![(,-]|\w)(?:error|segmentation 
> fault|failed)(?![(,-]|\.?\w) found:
>    admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context 
> command failed: Cannot allocate memory
>    admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context 
> command failed: Cannot allocate memory
>    admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context 
> command failed: Cannot allocate memory
>    admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: assign_context 
> command failed: Cannot allocate memory
>    Abort(1615503) on node 17 (rank 17 in comm 0): Fatal error in PMPI_Init: 
> Other MPI error, error stack:
>    create_endpoint(2284)........: OFI endpoint open failed 
> (ofi_init.c:2284:create_endpoint:Invalid argument)
>    == 2021-09-10 11:30:43,014 run.py:594 WARNING Found 6 errors in command 
> output (output: admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: 
> assign_context command failed: Cannot allocate memory
>            admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: 
> assign_context command failed: Cannot allocate memory
>            admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: 
> assign_context command failed: Cannot allocate memory
>            admin.curta.zedat.fu-berlin.de.17654hfi_userinit_internal: 
> assign_context command failed: Cannot allocate memory
>            Abort(1615503) on node 17 (rank 17 in comm 0): Fatal error in 
> PMPI_Init: Other MPI error, error stack:
>            create_endpoint(2284)........: OFI endpoint open failed 
> (ofi_init.c:2284:create_endpoint:Invalid argument))
>
> Any ideas what might be going wrong?

In what type of environment are you running this? A Slurm job with 
restricted available memory?

I assume the system you're seeing this on has 40 cores (based on the 
"mpirun -n 40")?

Can you try using "eb --parallel 10" or "eb --parallel 2" to restrict 
the number of MPI processes it's starting for the test, and see if that 
helps?
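
To make the questions above quick to answer, here is a rough sketch of what could be run in the same shell or job where the build was started. It assumes a Linux system with coreutils `nproc`. `SLURM_JOB_ID` and `SLURM_MEM_PER_NODE` are only set inside a Slurm job (the latter only when a memory limit was requested). The easyconfig file name in the last comment is my guess from the module name, so adjust it to whatever you actually passed to `eb`.

```shell
#!/bin/sh
# Sketch: report the environment details asked about above.

# Number of cores visible to this shell (may be fewer than the node has
# if the job is confined to a subset of cores).
cores=$(nproc)
echo "cores visible: ${cores}"

# Slurm context, if any; both variables are unset outside a Slurm job.
echo "Slurm job ID: ${SLURM_JOB_ID:-none (not inside a Slurm job)}"
echo "Slurm memory per node (MB): ${SLURM_MEM_PER_NODE:-not set}"

# Then retry the build with fewer parallel processes, for example:
#   eb --parallel 2 impi-2021.2.0-intel-compilers-2021.2.0.eb   # file name assumed
```

If the reported core count is well below 40, that alone would be worth mentioning in your reply.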


regards,

Kenneth

>
> Cheers,
>
> Loris
>
