Hello Guilherme, the workaround indeed works like a charm:)
As I have this problem with other MPI software such as Rmpi, I continued to investigate. In my case the MPI subsystem is available on the node where I compile. It appears that because I compile OpenMPI with --with-pmi and --with-slurm, I cannot launch any MPI binary in singleton mode (i.e. without mpirun or similar), and that is what fails when EasyBuild tries to execute an MPI-enabled binary without mpirun. Maybe it's a bug in OpenMPI as well.

I tried with foss/2017a and it works fine. But then I have another problem: Theano supports CUDA, but CUDA doesn't support GCC > 5.

So the question is: what should I do to make singleton mode work with the OpenMPI provided by foss/2016a or similar? Do other people here using Slurm and PMI have the same issue?

Thanks

2017-02-22 15:43 GMT+01:00 Peretti-Pezzi Guilherme <[email protected]>:

> Hi
>
> We have a similar problem when building Theano on login nodes: MPI
> subsystem is not available, so the sanity check fails.
>
> Our workaround (not approved by Kenneth) is to skip the sanity check, see
> here:
> https://github.com/eth-cscs/production/pull/84/files#diff-4363a39e679c51f12df983df71d3eaa9R33
>
> --
> Cheers,
> G.
>
> From: <[email protected]> on behalf of Yann Sagon <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Wednesday 22 February 2017 14:38
> To: "[email protected]" <[email protected]>
> Subject: Re: [easybuild] Theano orte_init
>
> No it's not submitted to a scheduler (the scheduler is configured in
> EasyBuild but I'm not using it, ie. not using --job)
>
> 2017-02-22 12:52 GMT+01:00 Robert Schmidt <[email protected]>:
>
>> As a default sanity check for python packages, easybuild will try to
>> import the main module. In this case it seems like the import was enough to
>> trigger an mpi failure.
>>
>> It is a bit weird that the import is enough to cause that problem, but
>> maybe there is something else in the environment that makes it think it is
>> being run as part of an MPI job? Is the build being submitted to a
>> scheduler?
>>
>> On Wed, Feb 22, 2017 at 5:36 AM Yann Sagon <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> I'm back again with my problem (I had a similar problem with Rmpi),
>>> maybe someone has an idea of what is going on.
>>>
>>> I tried to install Theano-0.8.2-foss-2016a-Python-2.7.11.eb and had the
>>> following error:
>>>
>>> == 2017-02-22 11:19:25,788 build_log.py:147 ERROR EasyBuild crashed with
>>> an error (at ?:124 in __init__): Sanity check failed: Theano failed to
>>> install, cmd '/opt/ebsofts/MPI/GCC/4.9.3-2.25/OpenMPI/1.10.2/Python/2.7.11/bin/python -c "import theano"' (stdin:
>>> None) output:
>>> --------------------------------------------------------------------------
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems. This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>>
>>> PMI2_Job_GetId failed failed
>>> --> Returned value (null) (14) instead of ORTE_SUCCESS
>>>
>>> It is worth saying that I have compiled OpenMPI with the following
>>> additional flags:
>>>
>>> configopts += '--with-slurm --with-pmi '
>>>
>>> Thanks for any suggestion
>>>
>>> --
>>> Yann SAGON
>>> Ingénieur système HPC
>>> 24 Rue du Général-Dufour
>>> 1211 Genève 4 - Suisse
>>> Tél. : +41 (0)22 379 7737
>>> [email protected] - www.unige.ch
>>>
>>
>
> --
> Yann SAGON
> Ingénieur système HPC
> 24 Rue du Général-Dufour
> 1211 Genève 4 - Suisse
> Tél. : +41 (0)22 379 7737
> [email protected] - www.unige.ch

--
Yann SAGON
Ingénieur système HPC
24 Rue du Général-Dufour
1211 Genève 4 - Suisse
Tél. : +41 (0)22 379 7737
[email protected] - www.unige.ch
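
For anyone who wants to reproduce the failing sanity check outside EasyBuild, here is a minimal sketch. It runs the same `python -c "import theano"` command directly, i.e. in singleton mode without mpirun or srun, and looks for the orte_init message quoted in this thread. Assumptions: the script itself is run with Python 3.5 or newer (for `subprocess.run`), even when the interpreter under test is the Python 2.7.11 from the log; the function names `run_singleton_import` and `looks_like_orte_init_failure` are made up for illustration and are not part of EasyBuild or Open MPI.

```python
import subprocess

# Marker taken from the Open MPI error text quoted in this thread.
ORTE_INIT_MARKER = "orte_init failed"


def looks_like_orte_init_failure(output):
    """Return True if the captured output contains the orte_init failure text."""
    return ORTE_INIT_MARKER in output


def run_singleton_import(python_exe, module="theano"):
    """Run `<python_exe> -c "import <module>"` directly, without any MPI launcher.

    Returns a tuple (exit_code, combined stdout and stderr as text).
    """
    proc = subprocess.run(
        [python_exe, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # Open MPI prints its help text on stderr
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def describe(python_exe, module="theano"):
    """Summarise the outcome of the singleton import in one line."""
    code, out = run_singleton_import(python_exe, module)
    if looks_like_orte_init_failure(out):
        return "singleton MPI init failed (orte_init), exit code %d" % code
    if code != 0:
        return "import failed for another reason, exit code %d" % code
    return "import succeeded in singleton mode"
```

Calling `describe()` with the full path of the Python interpreter from the failing command in the log would show whether the failure is specific to the EasyBuild sanity check or happens for any direct launch on that node.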

