Gilles,

Yes, every $CC invocation in opal/mca/pmix/pmix1xx includes "-D_REENTRANT".
However, they don't include "-mt".
I believe we concluded (when we had problems previously) that "-mt" was the
proper flag (at both compile and link time) for multi-threaded code with the
Studio compilers.

-Paul

On Sat, Sep 19, 2015 at 11:29 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Paul,
>
> Can you please double check pmix1xx is compiled with -D_REENTRANT ?
> We ran into similar issues in the past, and they only occurred with
> Solaris
>
> Cheers,
>
> Gilles
>
>
> On Sunday, September 20, 2015, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Ralph,
>> The output from the requested run is attached.
>> -Paul
>>
>> On Sat, Sep 19, 2015 at 9:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> Ah, okay - that makes more sense. I’ll have to let Brice see if he can
>>> figure out how to silence the hwloc error message as I can’t find where it
>>> came from. The other errors are real and are the reason why the job was
>>> terminated.
>>>
>>> The problem is that we are trying to establish a communication between
>>> the app and the daemon via unix domain socket, and we failed to do so. The
>>> error tells me that we were able to create and connect to the socket, but
>>> failed when the daemon tried to do a blocking send to the app.
>>>
>>> Can you rerun it with -mca pmix_base_verbose 10? It will tell us the
>>> value of the error number that was returned
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>> On Sep 19, 2015, at 9:37 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>> Ralph,
>>>
>>> No it did not run.
>>> The complete output (which I really should have included in the first
>>> place) is below.
>>>
>>> -Paul
>>>
>>> $ mpirun -mca btl sm,self -np 2 examples/ring_c'
>>> Error opening /devices/pci@0,0:reg: Permission denied
>>> [pcp-d-3:26054] PMIX ERROR: ERROR in file
>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x64-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/client/pmix_client.c
>>> at line 181
>>> [pcp-d-3:26053] PMIX ERROR: UNREACHABLE in file
>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x64-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>> at line 463
>>>
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort. There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or
>>> environment
>>> problems. This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>> ompi_mpi_init: ompi_rte_init failed
>>> --> Returned "(null)" (-43) instead of "Success" (0)
>>>
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** on a NULL communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> *** and potentially your MPI job)
>>> [pcp-d-3:26054] Local abort before MPI_INIT completed completed
>>> successfully, but am not able to aggregate error messages, and not able to
>>> guarantee that all other processes were killed!
>>> -------------------------------------------------------
>>> Primary job terminated normally, but 1 process returned
>>> a non-zero exit code.. Per user-direction, the job has been aborted.
>>> -------------------------------------------------------
>>>
>>> --------------------------------------------------------------------------
>>> mpirun detected that one or more processes exited with non-zero status,
>>> thus causing
>>> the job to be terminated. The first process to do so was:
>>>
>>>   Process name: [[11371,1],0]
>>>   Exit code: 1
>>>
>>> --------------------------------------------------------------------------
>>>
>>> On Sat, Sep 19, 2015 at 8:50 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> Paul, can you clarify something for me? The error in this case
>>>> indicates that the client wasn’t able to reach the daemon - this should
>>>> have resulted in termination of the job. Did the job actually run?
>>>>
>>>>
>>>> On Sep 18, 2015, at 2:50 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>> I'm on travel right now, but it should be an easy fix when I return.
>>>> Sorry for the annoyance
>>>>
>>>>
>>>> On Thu, Sep 17, 2015 at 11:13 PM, Paul Hargrove <phhargr...@lbl.gov>
>>>> wrote:
>>>>
>>>>> Any suggestion how I (as a non-root user) can avoid seeing this hwloc
>>>>> error message on every run?
>>>>>
>>>>> -Paul
>>>>>
>>>>> On Thu, Sep 17, 2015 at 11:00 PM, Gilles Gouaillardet <
>>>>> gil...@rist.or.jp> wrote:
>>>>>
>>>>>> Paul,
>>>>>>
>>>>>> IIRC, the "Permission denied" is coming from hwloc that cannot
>>>>>> collect all the info it would like.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>> On 9/18/2015 2:34 PM, Paul Hargrove wrote:
>>>>>>
>>>>>> Tried tonight's master tarball on Solaris 11.2 on x86-64 with the
>>>>>> Studio Compilers (default ILP32 output) and saw the following result
>>>>>>
>>>>>> $ mpirun -mca btl sm,self -np 2 examples/ring_c'
>>>>>> Error opening /devices/pci@0,0:reg: Permission denied
>>>>>> [pcp-d-4:00492] PMIX ERROR: ERROR in file
>>>>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x86-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/client/pmix_client.c
>>>>>> at line 181
>>>>>> [pcp-d-4:00491] PMIX ERROR: UNREACHABLE in file
>>>>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x86-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>>>>> at line 463
>>>>>>
>>>>>> I don't know if the Permission denied error is related to the
>>>>>> subsequent PMIX errors, but any message that says "UNREACHABLE" is
>>>>>> clearly
>>>>>> worth reporting.
>>>>>>
>>>>>> -Paul
>>>>>>
>>>>>> --
>>>>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>>>> Computer Languages & Systems Software (CLaSS) Group
>>>>>> Computer Science Department               Tel: +1-510-495-2352
>>>>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>> Link to this post:
>>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18074.php
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>> Link to this post:
>>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18075.php
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>>> Computer Languages & Systems Software (CLaSS) Group
>>>>> Computer Science Department               Tel: +1-510-495-2352
>>>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>> Link to this post:
>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18076.php
>>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18078.php
>>>>
>>>
>>>
>>>
>>> --
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>> Computer Languages & Systems Software (CLaSS) Group
>>> Computer Science Department               Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/09/18080.php
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2015/09/18081.php
>>>
>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/09/18083.php
>

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
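
Ralph's diagnosis quoted in this thread (the Unix domain socket was created and connected, but a blocking send then failed, and the returned error number says why) can be illustrated with a small Python sketch. This is not the PMIx code: the helper name blocking_send_errno and the closed-peer scenario are assumptions, chosen only because they reliably produce a failing send. The actual cause on Solaris may well be different.

```python
import errno
import socket


def blocking_send_errno(peer_closed: bool) -> int:
    """Return 0 if a blocking send succeeds, else the errno it failed with.

    Hypothetical helper for illustration only; not part of Open MPI or PMIx.
    """
    # A connected pair of Unix domain stream sockets stands in for the
    # daemon side ("server") and the application side ("client").
    server, client = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        if peer_closed:
            # Assumed failure scenario: the app goes away after connecting.
            client.close()
        server.setblocking(True)
        server.sendall(b"ping")
        return 0
    except OSError as exc:
        # The numeric value is the piece of information the verbose
        # run requested in the thread is expected to reveal.
        return exc.errno
    finally:
        server.close()
        client.close()  # closing an already-closed socket is harmless


if __name__ == "__main__":
    for closed in (False, True):
        code = blocking_send_errno(closed)
        name = errno.errorcode.get(code, "OK") if code else "OK"
        print(f"peer_closed={closed}: errno={code} ({name})")
```

The point of returning the raw number rather than a generic "ERROR" string is the same as in the thread: the symbolic name of the error narrows down which step of the exchange went wrong.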