Gilles,

Yes, every $CC invocation in opal/mca/pmix/pmix1xx includes "-D_REENTRANT".
However, they don't include "-mt".
I believe we concluded (when we had problems previously) that "-mt" was the
proper flag (at compile and link) for multi-threaded with the Studio
compilers.
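
For anyone who wants to repeat that check on their own build, here is a rough sketch in Python. It assumes the make output was saved to a file and that compile lines contain the compiler name as a separate word. The function name, the default compiler name "cc", and the sample file names are made up for illustration and are not part of Open MPI.

```python
import shlex
import sys


def audit_compile_lines(log_lines, compiler="cc"):
    """Scan build-log lines and report compiler invocations missing a flag.

    Returns (total, missing_reentrant, missing_mt), where the two lists
    hold the offending lines. 'compiler' is whatever $CC expanded to.
    """
    total = 0
    missing_reentrant = []
    missing_mt = []
    for line in log_lines:
        try:
            tokens = shlex.split(line)
        except ValueError:
            continue  # unbalanced quotes: not a command line we can parse
        if compiler not in tokens:
            continue  # not a compiler invocation
        total += 1
        has_reentrant = any(
            t == "-D_REENTRANT" or t.startswith("-D_REENTRANT=") for t in tokens
        )
        if not has_reentrant:
            missing_reentrant.append(line)
        if "-mt" not in tokens:
            missing_mt.append(line)
    return total, missing_reentrant, missing_mt


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical log file name): python audit_flags.py make.log
    with open(sys.argv[1]) as fh:
        total, no_reentrant, no_mt = audit_compile_lines(fh)
    print(f"{total} compiler invocations")
    print(f"{len(no_reentrant)} without -D_REENTRANT")
    print(f"{len(no_mt)} without -mt")
```

Note that this only looks at the flags as they appear on the command line. It does not tell you what a given flag does for a particular compiler version.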

-Paul

On Sat, Sep 19, 2015 at 11:29 PM, Gilles Gouaillardet <
gilles.gouaillar...@gmail.com> wrote:

> Paul,
>
> Can you please double-check that pmix1xx is compiled with -D_REENTRANT?
> We ran into similar issues in the past, and they only occurred with
> Solaris.
>
> Cheers,
>
> Gilles
>
>
> On Sunday, September 20, 2015, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Ralph,
>> The output from the requested run is attached.
>> -Paul
>>
>> On Sat, Sep 19, 2015 at 9:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> Ah, okay - that makes more sense. I’ll have to let Brice see if he can
>>> figure out how to silence the hwloc error message as I can’t find where it
>>> came from. The other errors are real and are the reason why the job was
>>> terminated.
>>>
>>> The problem is that we are trying to establish a communication between
>>> the app and the daemon via unix domain socket, and we failed to do so. The
>>> error tells me that we were able to create and connect to the socket, but
>>> failed when the daemon tried to do a blocking send to the app.
>>>
>>> Can you rerun it with -mca pmix_base_verbose 10? It will tell us the
>>> value of the error number that was returned.
>>>
>>> Thanks
>>> Ralph
>>>
>>>
>>> On Sep 19, 2015, at 9:37 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>> Ralph,
>>>
>>> No, it did not run.
>>> The complete output (which I really should have included in the first
>>> place) is below.
>>>
>>> -Paul
>>>
>>> $ mpirun -mca btl sm,self -np 2 examples/ring_c
>>> Error opening /devices/pci@0,0:reg: Permission denied
>>> [pcp-d-3:26054] PMIX ERROR: ERROR in file
>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x64-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/client/pmix_client.c
>>> at line 181
>>> [pcp-d-3:26053] PMIX ERROR: UNREACHABLE in file
>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x64-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>> at line 463
>>>
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or
>>> environment
>>> problems.  This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>>   ompi_mpi_init: ompi_rte_init failed
>>>   --> Returned "(null)" (-43) instead of "Success" (0)
>>>
>>> --------------------------------------------------------------------------
>>> *** An error occurred in MPI_Init
>>> *** on a NULL communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> ***    and potentially your MPI job)
>>> [pcp-d-3:26054] Local abort before MPI_INIT completed completed
>>> successfully, but am not able to aggregate error messages, and not able to
>>> guarantee that all other processes were killed!
>>> -------------------------------------------------------
>>> Primary job  terminated normally, but 1 process returned
>>> a non-zero exit code.. Per user-direction, the job has been aborted.
>>> -------------------------------------------------------
>>>
>>> --------------------------------------------------------------------------
>>> mpirun detected that one or more processes exited with non-zero status,
>>> thus causing
>>> the job to be terminated. The first process to do so was:
>>>
>>>   Process name: [[11371,1],0]
>>>   Exit code:    1
>>>
>>> --------------------------------------------------------------------------
>>>
>>> On Sat, Sep 19, 2015 at 8:50 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> Paul, can you clarify something for me? The error in this case
>>>> indicates that the client wasn’t able to reach the daemon - this should
>>>> have resulted in termination of the job. Did the job actually run?
>>>>
>>>>
>>>> On Sep 18, 2015, at 2:50 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>> I'm on travel right now, but it should be an easy fix when I return.
>>>> Sorry for the annoyance.
>>>>
>>>>
>>>> On Thu, Sep 17, 2015 at 11:13 PM, Paul Hargrove <phhargr...@lbl.gov>
>>>> wrote:
>>>>
>>>>> Any suggestion how I (as a non-root user) can avoid seeing this hwloc
>>>>> error message on every run?
>>>>>
>>>>> -Paul
>>>>>
>>>>> On Thu, Sep 17, 2015 at 11:00 PM, Gilles Gouaillardet <
>>>>> gil...@rist.or.jp> wrote:
>>>>>
>>>>>> Paul,
>>>>>>
>>>>>> IIRC, the "Permission denied" is coming from hwloc that cannot
>>>>>> collect all the info it would like.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Gilles
>>>>>>
>>>>>> On 9/18/2015 2:34 PM, Paul Hargrove wrote:
>>>>>>
>>>>>> Tried tonight's master tarball on Solaris 11.2 on x86-64 with the
>>>>>> Studio compilers (default ILP32 output) and saw the following result:
>>>>>>
>>>>>> $ mpirun -mca btl sm,self -np 2 examples/ring_c
>>>>>> Error opening /devices/pci@0,0:reg: Permission denied
>>>>>> [pcp-d-4:00492] PMIX ERROR: ERROR in file
>>>>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x86-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/client/pmix_client.c
>>>>>> at line 181
>>>>>> [pcp-d-4:00491] PMIX ERROR: UNREACHABLE in file
>>>>>> /export/home/phargrov/OMPI/openmpi-master-solaris11-x86-ss12u3/openmpi-dev-2559-g567c9e3/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_server_listener.c
>>>>>> at line 463
>>>>>>
>>>>>> I don't know if the Permission denied error is related to the
>>>>>> subsequent PMIX errors, but any message that says "UNREACHABLE" is
>>>>>> clearly worth reporting.
>>>>>>
>>>>>> -Paul
>>>>>>
>>>>>> --
>>>>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>>>> Computer Languages & Systems Software (CLaSS) Group
>>>>>> Computer Science Department               Tel: +1-510-495-2352
>>>>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>> Link to this post: 
>>>>>> http://www.open-mpi.org/community/lists/devel/2015/09/18074.php
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>>>> Computer Languages & Systems Software (CLaSS) Group
>>>>> Computer Science Department               Tel: +1-510-495-2352
>>>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>> Computer Languages & Systems Software (CLaSS) Group
>>> Computer Science Department               Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>
>
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
