I think I found it. There was an OpenMPI installation from the Debian Jessie repo. When I removed it and rebuilt OpenMPI with EasyBuild, the error no longer occurs.
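For anyone hitting the same problem, the fix amounted to something along these lines (package and easyconfig names are illustrative guesses, not copied from my exact session; check 'dpkg -l | grep openmpi' for what is actually installed):

```shell
# remove the distro-provided Open MPI that was shadowing the EasyBuild one
sudo apt-get remove --purge openmpi-bin libopenmpi-dev

# rebuild Open MPI with EasyBuild (foss-2017a uses this easyconfig)
eb OpenMPI-2.0.2-GCC-6.3.0-2.27.eb --rebuild
```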

Thank you for your help,
Holger

On 11.10.2017 13:17, Markus Geimer wrote:
Holger,

For SLURM, you need to tell the batch system to actually use the PMI
interface (either by using 'srun --mpi=pmi' or setting 'MpiDefault=pmi'
in slurm.conf).  Maybe something similar is required for Torque/Maui?
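For illustration, the two variants could look like this (the exact PMI plugin names depend on how SLURM was built; 'srun --mpi=list' shows what is available):

```shell
# list the PMI plugins this SLURM installation supports
srun --mpi=list

# per-job: launch through SLURM's PMI interface explicitly
srun --mpi=pmi2 -n 6 ./montecarlo

# or set it cluster-wide in slurm.conf:
#   MpiDefault=pmi2
```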

Hth,
Markus


On 10/11/2017 12:52 PM, Holger Angenent wrote:
Dear Kenneth,

thank you for your reply.
I am working on a Debian machine without Infiniband. The batch system is
torque/maui, there is no slurm.
I installed libpmi0 and libpmi0-dev (and created some symlinks so that
OpenMPI finds it), changed the easyconfig file of OpenMPI to use PMI,
and then recompiled it.
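The symlinks could look roughly like this (the path is illustrative for Debian's amd64 multiarch layout, not a verbatim copy of what I did):

```shell
# libpmi0 ships only the versioned library; configure looks for libpmi.so,
# so add an unversioned symlink next to it
ln -s /usr/lib/x86_64-linux-gnu/libpmi.so.0 /usr/lib/x86_64-linux-gnu/libpmi.so
```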

But when I use this OpenMPI installation, it still throws the same error
(see below).

In addition, it claims to have PMI support:
---
ompi_info | grep pmi
                 MCA pmix: s1 (MCA v2.1.0, API v2.0.0, Component v2.1.1)
                 MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component
v2.1.1)
                  MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v2.1.1)
---

So I think the problem has nothing to do with FFTW, but only with OpenMPI.

Regards,
Holger




mpirun -n 6 ./montecarlo
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
Open MPI stopped checking at the first component that it did not find.

Host:      fb11-nx-main
Framework: ess
Component: pmi
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_ess_base_open failed
   --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

   ompi_mpi_init: orte_init failed
   --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
[fb11-nx-main:27614] *** An error occurred in MPI_Init
[fb11-nx-main:27614] *** on a NULL communicator
[fb11-nx-main:27614] *** Unknown error
[fb11-nx-main:27614] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
An MPI process is aborting at a time when it cannot guarantee that all
of its peer processes in the job will be killed properly.  You should
double check that everything has shut down cleanly.

   Reason:     Before MPI_INIT completed
   Local host: fb11-nx-main
   PID:        27614
--------------------------------------------------------------------------
[fb11-nx-main:27614] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file
runtime/orte_init.c at line 116



On 11.10.2017 10:54, Kenneth Hoste wrote:
Dear Holger,

In what kind of environment are you trying to build FFTW here?

You may have to rebuild OpenMPI while specifying the correct location
for PMI, cf.
https://github.com/easybuilders/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/o/OpenMPI/OpenMPI-2.1.1-GCC-6.4.0-2.28.eb#L22
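The relevant part of such an easyconfig would be along these lines (easyconfigs are Python syntax; the paths below are illustrative and must be adjusted to where libpmi actually lives on the system):

```python
# Fragment of an OpenMPI easyconfig: point Open MPI's configure script
# at the PMI headers and library (illustrative Debian-style paths).
configopts = '--with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr/lib'
```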



regards,

Kenneth

On 10/10/2017 17:40, Holger Angenent wrote:
Dear all,

I am trying to compile foss-2017a on a Debian system. OpenMPI builds
successfully, but when FFTW uses it for its checks, it throws an
error (see below). The same happens with foss-2017b. On a CentOS 7
based machine, both toolchains compile without error.

Do you have an idea whether an OS dependency might be missing? Has
anybody succeeded in building foss on Debian?

Best regards,

Holger



This is the step failing:

mpirun -np 1
/home/ms/f/fhms166656/.local/easybuild/build/FFTW/3.3.6/gompi-2017a/fftw-3.3.6-pl2/mpi/mpi-bench
--verbose=1   --verify 'ofrd5x4x12v6' --verify 'ifrd5x4x12v6'
--verify 'obcd5x4x12v6' --verify 'ibcd5x4x12v6' --verify
'ofcd5x4x12v6' --verify 'ifcd5x4x12v6' --verify 'ok7hx13e10x13b'
--verify 'ik7hx13e10x13b' --verify 'obr[7x7v1' --verify 'ibr[7x7v1'
--verify 'obc[7x7v1' --verify 'ibc[7x7v1' --verify 'ofc[7x7v1'
--verify 'ifc[7x7v1' --verify 'ok]8hx6o01x4hx11e10' --verify
'ik]8hx6o01x4hx11e10' --verify 'obr9x6' --verify 'ibr9x6' --verify
'ofr9x6' --verify 'ifr9x6' --verify  'obc9x6' --verify 'ibc9x6'
--verify 'ofc9x6' --verify 'ifc9x6' --verify 'ok6e10x12e10' --verify
'ik6e10x12e10' --verify 'ofr]9x4x9x10' --verify 'ifr]9x4x9x10'
--verify 'obc]9x4x9x10' --verify 'ibc]9x4x9x10' --verify
'ofc]9x4x9x10' --verify 'ifc]9x4x9x10' --verify 'ofrd]10x8x8x12'
--verify 'ifrd]10x8x8x12' --verify 'obcd]10x8x8x12' --verify
'ibcd]10x8x8x12' --verify 'ofcd]10x8x8x12' --verify 'ifcd]10x8x8x12'
--verify 'ofrd]2x8x8v7' --verify 'ifrd]2x8x8v7' --verify
'obcd]2x8x8v7' --verify 'ibcd]2x8x8v7'
--------------------------------------------------------------------------

A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

Host:      fb11-nx-main
Framework: ess
Component: pmi
--------------------------------------------------------------------------

[fb11-nx-main:23484] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in
file runtime/orte_init.c at line 116
--------------------------------------------------------------------------

It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

   orte_ess_base_open failed
   --> Returned value Error (-1) instead of ORTE_SUCCESS
--------------------------------------------------------------------------

--------------------------------------------------------------------------

It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or
environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

--
Dr. Markus Geimer
Juelich Supercomputing Centre
Institute for Advanced Simulation
Forschungszentrum Juelich GmbH
52425 Juelich, Germany

Phone:  +49-2461-61-1773
Fax:    +49-2461-61-6656
E-Mail: [email protected]
WWW:    http://www.fz-juelich.de/jsc




--
Westfälische Wilhelms-Universität Münster (WWU)
Zentrum für Informationsverarbeitung (ZIV)
Röntgenstraße 7-13
48149 Münster
+49-(0)251-83 31569
[email protected]
www.uni-muenster.de/ZIV

