Dear Kenneth,

thank you for your reply. I am working on a Debian machine without InfiniBand. The batch system is Torque/Maui; there is no Slurm. I installed libpmi0 and libpmi0-dev (and created some symlinks so that OpenMPI can find it), changed the OpenMPI easyconfig to use PMI, and recompiled it.
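A sketch of the kind of easyconfig change involved (the paths are assumptions for a Debian system and need to be adjusted to wherever libpmi0-dev actually installed pmi.h and libpmi.so):

```python
# Extra configure options added to the OpenMPI easyconfig.
# --with-pmi names the prefix under which the PMI header is found,
# --with-pmi-libdir the directory containing libpmi.so.
# Both paths below are assumptions for Debian's multiarch layout.
configopts = '--with-pmi=/usr --with-pmi-libdir=/usr/lib/x86_64-linux-gnu '
```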
But when I use this OpenMPI installation, it still throws the same error (see below).
In addition, it claims to have PMI support:

---
ompi_info | grep pmi
    MCA pmix: s1 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
    MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
    MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v2.1.1)
---

So I think the problem has nothing to do with FFTW, but only with OpenMPI.

Regards,
Holger

mpirun -n 6 ./montecarlo
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      fb11-nx-main
Framework: ess
Component: pmi
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_open failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
[fb11-nx-main:27614] *** An error occurred in MPI_Init
[fb11-nx-main:27614] *** on a NULL communicator
[fb11-nx-main:27614] *** Unknown error
[fb11-nx-main:27614] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
An MPI process is aborting at a time when it cannot guarantee that all
of its peer processes in the job will be killed properly.  You should
double check that everything has shut down cleanly.

  Reason:     Before MPI_INIT completed
  Local host: fb11-nx-main
  PID:        27614
--------------------------------------------------------------------------
[fb11-nx-main:27614] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 116
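The "component was not found" message tends to hide the real dynamic-loader error. A minimal diagnostic sketch (the plugin directory below is an assumption; substitute the lib/openmpi directory of the EasyBuild-installed OpenMPI) that tries to dlopen the MCA plugins the way ORTE would, and prints the loader's actual complaint, e.g. an unresolved libpmi dependency:

```python
import ctypes
import glob
import os

def try_load(pattern):
    """Attempt to dlopen every shared object matching the glob pattern
    and map each filename to "ok" or the loader's error message."""
    results = {}
    for path in glob.glob(pattern):
        try:
            ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
            results[os.path.basename(path)] = "ok"
        except OSError as exc:
            results[os.path.basename(path)] = str(exc)
    return results

if __name__ == "__main__":
    # Hypothetical plugin path; point this at your OpenMPI's lib/openmpi.
    for name, status in sorted(try_load("/usr/lib/openmpi/mca_ess_*.so").items()):
        print(name, "->", status)
```

If mca_ess_pmi.so is missing from that directory, OpenMPI was built without the component despite the configure flags; if it is present but fails to load, the message usually names the missing library.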
On 11.10.2017 10:54, Kenneth Hoste wrote:
Dear Holger,

in what kind of environment are you trying to build FFTW here? You may have to rebuild OpenMPI while specifying the correct location for PMI, cf. https://github.com/easybuilders/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/o/OpenMPI/OpenMPI-2.1.1-GCC-6.4.0-2.28.eb#L22 .

Regards,
Kenneth

On 10/10/2017 17:40, Holger Angenent wrote:

Dear all,

I am trying to compile foss-2017a on a Debian system. OpenMPI builds to the end, but when FFTW uses it for its checks, it throws an error (see below). The same happens with foss-2017b. On a CentOS 7 based machine, both compile without error. Do you have an idea whether an OS dependency might be missing? Did anybody succeed in building foss on Debian?

Best regards,
Holger

This is the failing step:

mpirun -np 1 /home/ms/f/fhms166656/.local/easybuild/build/FFTW/3.3.6/gompi-2017a/fftw-3.3.6-pl2/mpi/mpi-bench --verbose=1 --verify 'ofrd5x4x12v6' --verify 'ifrd5x4x12v6' --verify 'obcd5x4x12v6' --verify 'ibcd5x4x12v6' --verify 'ofcd5x4x12v6' --verify 'ifcd5x4x12v6' --verify 'ok7hx13e10x13b' --verify 'ik7hx13e10x13b' --verify 'obr[7x7v1' --verify 'ibr[7x7v1' --verify 'obc[7x7v1' --verify 'ibc[7x7v1' --verify 'ofc[7x7v1' --verify 'ifc[7x7v1' --verify 'ok]8hx6o01x4hx11e10' --verify 'ik]8hx6o01x4hx11e10' --verify 'obr9x6' --verify 'ibr9x6' --verify 'ofr9x6' --verify 'ifr9x6' --verify 'obc9x6' --verify 'ibc9x6' --verify 'ofc9x6' --verify 'ifc9x6' --verify 'ok6e10x12e10' --verify 'ik6e10x12e10' --verify 'ofr]9x4x9x10' --verify 'ifr]9x4x9x10' --verify 'obc]9x4x9x10' --verify 'ibc]9x4x9x10' --verify 'ofc]9x4x9x10' --verify 'ifc]9x4x9x10' --verify 'ofrd]10x8x8x12' --verify 'ifrd]10x8x8x12' --verify 'obcd]10x8x8x12' --verify 'ibcd]10x8x8x12' --verify 'ofcd]10x8x8x12' --verify 'ifcd]10x8x8x12' --verify 'ofrd]2x8x8v7' --verify 'ifrd]2x8x8v7' --verify 'obcd]2x8x8v7' --verify 'ibcd]2x8x8v7'

--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      fb11-nx-main
Framework: ess
Component: pmi
--------------------------------------------------------------------------
[fb11-nx-main:23484] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 116
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_open failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
--
Westfälische Wilhelms-Universität Münster (WWU)
Zentrum für Informationsverarbeitung (ZIV)
Röntgenstraße 7-13
48149 Münster
+49-(0)251-83 31569
[email protected]
www.uni-muenster.de/ZIV

