update: It's probably a bug. Trying with foss/2016b or later works fine. The strange thing is that I have the same OpenMPI version as foss/2016a (1.10.2) compiled with GCC 4.4.7 (the one shipped with CentOS), and with that build it works.
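(For readers skimming the thread: the workaround discussed below amounts to disabling EasyBuild's sanity-check step for the affected easyconfig. A minimal sketch of that, assuming EasyBuild's `skipsteps` easyconfig parameter; the linked eth-cscs PR may implement it differently:)

```python
# Sketch only: skip EasyBuild's sanity-check step for this easyconfig, so
# that `python -c "import theano"` is never executed on a node without a
# working MPI subsystem. (Assumes the `skipsteps` easyconfig parameter;
# the actual workaround in the linked PR may differ.)
skipsteps = ['sanitycheck']
```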
Conclusion: I'll start using foss/2016b and foss/2017a.

Best

2017-02-23 11:13 GMT+01:00 Yann Sagon <[email protected]>:

> Hello Guilherme,
>
> the workaround indeed works like a charm :)
>
> As I have this problem with other MPI software like Rmpi, I continued to
> investigate. In my case the MPI subsystem is available on the node where I
> compile. It appears that since I'm compiling OpenMPI with --with-pmi and
> --with-slurm, I'm not able to launch any MPI binary in singleton mode
> (i.e. without using mpirun or similar), and this is the problem when
> EasyBuild tries to execute an MPI-enabled binary without using mpirun.
> Maybe it's a bug in OpenMPI as well. I have tried with foss/2017a and it's
> working fine. But then I have another problem: Theano supports CUDA, but
> CUDA doesn't support GCC > 5.
>
> So the question is: what should I do to make singleton mode work with the
> OpenMPI provided by foss/2016a or similar? Do other people here using
> Slurm and PMI have the same issue?
>
> Thanks
>
> 2017-02-22 15:43 GMT+01:00 Peretti-Pezzi Guilherme <[email protected]>:
>
>> Hi,
>>
>> We have a similar problem when building Theano on login nodes: the MPI
>> subsystem is not available, so the sanity check fails.
>>
>> Our workaround (not approved by Kenneth) is to skip the sanity check,
>> see here:
>> https://github.com/eth-cscs/production/pull/84/files#diff-4363a39e679c51f12df983df71d3eaa9R33
>>
>> --
>> Cheers,
>> G.
>>
>> From: <[email protected]> on behalf of Yann Sagon
>> <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Wednesday 22 February 2017 14:38
>> To: "[email protected]" <[email protected]>
>> Subject: Re: [easybuild] Theano orte_init
>>
>> No, it's not submitted to a scheduler (the scheduler is configured in
>> EasyBuild but I'm not using it, i.e. not using --job).
>>
>> 2017-02-22 12:52 GMT+01:00 Robert Schmidt <[email protected]>:
>>
>>> As the default sanity check for Python packages, EasyBuild will try to
>>> import the main module. In this case it seems the import alone was
>>> enough to trigger an MPI failure.
>>>
>>> It is a bit weird that the import is enough to cause that problem, but
>>> maybe there is something else in the environment that makes it think
>>> it is being run as part of an MPI job? Is the build being submitted to
>>> a scheduler?
>>>
>>> On Wed, Feb 22, 2017 at 5:36 AM Yann Sagon <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm back again with my problem (I had a similar problem with Rmpi);
>>>> maybe someone has an idea of what is going on.
>>>>
>>>> I tried to install Theano-0.8.2-foss-2016a-Python-2.7.11.eb and got
>>>> the following error:
>>>>
>>>> == 2017-02-22 11:19:25,788 build_log.py:147 ERROR EasyBuild crashed
>>>> with an error (at ?:124 in __init__): Sanity check failed: Theano
>>>> failed to install, cmd
>>>> '/opt/ebsofts/MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/Python/2.7.11/bin/python -c "import theano"'
>>>> (stdin: None) output:
>>>> --------------------------------------------------------------------------
>>>> It looks like orte_init failed for some reason; your parallel process
>>>> is likely to abort. There are many reasons that a parallel process
>>>> can fail during orte_init; some of which are due to configuration or
>>>> environment problems.
>>>> This failure appears to be an internal failure; here's some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>>   PMI2_Job_GetId failed failed
>>>>   --> Returned value (null) (14) instead of ORTE_SUCCESS
>>>>
>>>> It's worth mentioning that I compiled OpenMPI with the following
>>>> additional configure flags:
>>>>
>>>>   configopts += '--with-slurm --with-pmi '
>>>>
>>>> Thanks for any suggestions
>>>>
>>>> --
>>>> Yann SAGON
>>>> Ingénieur système HPC
>>>> 24 Rue du Général-Dufour
>>>> 1211 Genève 4 - Suisse
>>>> Tél. : +41 (0)22 379 7737
>>>> [email protected] - www.unige.ch
>>>
>>
>

--
Yann SAGON
Ingénieur système HPC
24 Rue du Général-Dufour
1211 Genève 4 - Suisse
Tél. : +41 (0)22 379 7737
[email protected] - www.unige.ch
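(For context on the failure mode above: as noted in the thread, the default sanity check for a Python package boils down to importing its main module in a subprocess of the freshly installed interpreter. A minimal sketch of that check, using the current interpreter and the stdlib module `math` as a stand-in, since Theano is unlikely to be installed; the real check runs the package's own module name:)

```python
import subprocess
import sys

def import_sanity_check(python, module):
    """Mimic the default Python-package sanity check: run
    `python -c "import <module>"` and report success plus any stderr."""
    result = subprocess.run(
        [python, "-c", "import %s" % module],
        capture_output=True, text=True,
    )
    return result.returncode == 0, result.stderr

# Stand-in demonstration with the current interpreter and a stdlib module:
ok, err = import_sanity_check(sys.executable, "math")
print("sanity check passed" if ok else "sanity check failed:\n" + err)
```

In the failing build above, the equivalent call would have been `import_sanity_check('/opt/ebsofts/MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/Python/2.7.11/bin/python', 'theano')`, and the orte_init/PMI2 error text would have appeared in the captured stderr.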

