Ralph,

FYI, attached is the patch I am working on (still testing ...)

aa207ad2f3de5b649e5439d06dca90d86f5a82c2 should be reverted then.

Cheers,

Gilles


On 2014/11/04 13:56, Paul Hargrove wrote:
> Ralph,
>
> You will see from the message I sent a moment ago that -D_REENTRANT on
> Solaris appears to be the problem.
> However, I will also try the trunk tarball as you have requested.
>
> -Paul
>
>
> On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
>
>> Hmmm...Paul, would you be able to try this with the latest trunk tarball?
>> This looks familiar to me, and I wonder if we are just missing a changeset
>> from the trunk that fixed the handshake issues we had with failing over
>> from one transport to another.
>>
>> Ralph
>>
>> On Nov 3, 2014, at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Ralph,
>>
>> Requested output is attached.
>>
>> I have a Linux/x86 system with the same network configuration and will
>> soon be able to determine if the problem is specific to Solaris.
>>
>> -Paul
>>
>>
>> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain <rhc.open...@gmail.com>
>> wrote:
>>
>>> Could you please set -mca oob_base_verbose 20? I'm not sure why the
>>> connection is failing.
>>>
>>> Thanks
>>> Ralph
>>>
>>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>> Not clear if the following failure is Solaris-specific, but it *IS* a
>>> regression relative to 1.8.3.
>>>
>>> The system has 2 IPV4 interfaces:
>>>    Ethernet on 172.16.0.119/16
>>>    IPoIB on 172.18.0.119/16
>>>
>>> $ ifconfig bge0
>>> bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
>>>         inet 172.16.0.119 netmask ffff0000 broadcast 172.16.255.255
>>> $ ifconfig pFFFF.ibp0
>>> pFFFF.ibp0: flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 2044 index 3
>>>         inet 172.18.0.119 netmask ffff0000 broadcast 172.18.255.255
>>>
>>> However, I get a message from mca/oob/tcp about not being able to
>>> communicate between these two interfaces ON THE SAME NODE:
>>>
>>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
>>> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>>> ------------------------------------------------------------
>>> A process or daemon was unable to complete a TCP connection
>>> to another process:
>>>   Local host:    pcp-j-19
>>>   Remote host:   172.18.0.119
>>> This is usually caused by a firewall on the remote host. Please
>>> check that any firewall (e.g., iptables) has been disabled and
>>> try again.
>>> ------------------------------------------------------------
>>>
>>> Let me know what sort of verbose options I should use to gather any
>>> additional info you may need.
>>>
>>> -Paul
>>>
>>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain <rhc.open...@gmail.com>
>>> wrote:
>>>
>>>> Hi folks
>>>>
>>>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a
>>>> head start on the testing so we can release by Fri Nov 7th. So please take
>>>> a little time and test the current tarball:
>>>>
>>>> http://www.open-mpi.org/software/ompi/v1.8/
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>>>
>>>
>>>
>>> --
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>>  _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>  <oob_base_verbose=20.txt>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16162.php
>>
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16163.php
>>
>
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/11/16165.php

diff --git a/config/opal_config_pthreads.m4 b/config/opal_config_pthreads.m4
index 7bc0bfe..eb7803c 100644
--- a/config/opal_config_pthreads.m4
+++ b/config/opal_config_pthreads.m4
@@ -178,7 +178,14 @@ AC_DEFUN([OPAL_INTL_POSIX_THREADS_PLAIN_C], [
 #
 if test "$opal_pthread_c_success" = "0"; then
   AC_MSG_CHECKING([if C compiler and POSIX threads work as is])
-
+  case "${host_cpu}-${host_os}" in
+    *solaris*)
+      if test "`echo $CPPFLAGS | $GREP 'D_REENTRANT'`" = ""; then
+        PTHREAD_CPPFLAGS="-D_REENTRANT"
+        CPPFLAGS="$CPPFLAGS $PTHREAD_CPPFLAGS"
+      fi
+    ;;
+  esac
   AC_LANG_PUSH(C)
   OPAL_INTL_PTHREAD_TRY_LINK(opal_pthread_c_success=1,
                             opal_pthread_c_success=0)
@@ -198,7 +205,14 @@ AC_DEFUN([OPAL_INTL_POSIX_THREADS_PLAIN_CXX], [
 #
 if test "$opal_pthread_cxx_success" = "0"; then
   AC_MSG_CHECKING([if C++ compiler and POSIX threads work as is])
-
+  case "${host_cpu}-${host_os}" in
+    *solaris*)
+      if test "`echo $CXXCPPFLAGS | $GREP 'D_REENTRANT'`" = ""; then
+        PTHREAD_CXXCPPFLAGS="-D_REENTRANT"
+        CXXCPPFLAGS="$CXXCPPFLAGS $PTHREAD_CXXCPPFLAGS"
+      fi
+    ;;
+  esac
   AC_LANG_PUSH(C++)
   OPAL_INTL_PTHREAD_TRY_LINK(opal_pthread_cxx_success=1, 
                             opal_pthread_cxx_success=0)
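
For anyone who wants to try the guard outside of configure, here is a minimal POSIX shell sketch of the same idea: add -D_REENTRANT to a flag list only on Solaris hosts, and only if it is not already there. The function names (append_flag_once, solaris_cppflags) and the host triplets in the usage note are hypothetical and are not part of Open MPI. The sketch also differs from the patch in one respect: it matches the flag as a whole word with a `case` pattern, whereas the patch greps for the substring 'D_REENTRANT'.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the flag list "$1" with
# the flag "$2" appended only when it is not already present as a whole word.
append_flag_once() {
    case " $1 " in
        *" $2 "*) printf '%s\n' "$1" ;;      # already present: leave unchanged
        "  ")     printf '%s\n' "$2" ;;      # empty list: result is just the flag
        *)        printf '%s\n' "$1 $2" ;;   # otherwise append the flag
    esac
}

# Hypothetical wrapper: "$1" is a host triplet, "$2" the current flag list.
# -D_REENTRANT is only considered when the triplet mentions solaris.
solaris_cppflags() {
    case "$1" in
        *solaris*) append_flag_once "$2" "-D_REENTRANT" ;;
        *)         printf '%s\n' "$2" ;;
    esac
}
```

Usage, with made-up triplets: `solaris_cppflags i386-pc-solaris2.11 "-I/opt/include"` prints the list with -D_REENTRANT appended, while `solaris_cppflags x86_64-unknown-linux-gnu "-I/opt/include"` prints the list unchanged.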
