Thanks Paul,

At first glance, something is going wrong in the sec module under Solaris. I will keep digging tomorrow.

Cheers,

Gilles

On Tuesday, August 23, 2016, Paul Hargrove <phhargr...@lbl.gov> wrote:

> On Solaris 11.3 on x86-64:
>
> $ mpirun -mca btl sm,self,openib -np 2 -host pcp-d-3,pcp-d-4
> examples/ring_c'
> [pcp-d-4:25075] PMIX ERROR: NOT-SUPPORTED in file
> /shared/OMPI/openmpi-2.0.1rc1-solaris11-x86-ib-gcc/openmpi-
> 2.0.1rc1/opal/mca/pmix/pmix112/pmix/src/server/pmix_server_listener.c at
> line 529
> [pcp-d-4:25078] PMIX ERROR: UNREACHABLE in file
> /shared/OMPI/openmpi-2.0.1rc1-solaris11-x86-ib-gcc/openmpi-
> 2.0.1rc1/opal/mca/pmix/pmix112/pmix/src/client/pmix_client.c at line 983
> [pcp-d-4:25078] PMIX ERROR: UNREACHABLE in file
> /shared/OMPI/openmpi-2.0.1rc1-solaris11-x86-ib-gcc/openmpi-
> 2.0.1rc1/opal/mca/pmix/pmix112/pmix/src/client/pmix_client.c at line 199
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
> ompi_mpi_init: ompi_rte_init failed
> --> Returned "(null)" (-43) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [pcp-d-4:25078] Local abort before MPI_INIT completed completed
> successfully, but am not able to aggregate error messages, and not able to
> guarantee that all other processes were killed!
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing
> the job to be terminated. The first process to do so was:
>
> Process name: [[25599,1],1]
> Exit code: 1
> --------------------------------------------------------------------------
>
> -Paul
>
> --
> Paul H. Hargrove phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel