Ralph, FYI, attached is the patch I am working on (still testing ...)
aa207ad2f3de5b649e5439d06dca90d86f5a82c2 should be reverted then.

Cheers,

Gilles

On 2014/11/04 13:56, Paul Hargrove wrote:
> Ralph,
>
> You will see from the message I sent a moment ago that -D_REENTRANT on
> Solaris appears to be the problem.
> However, I will also try the trunk tarball as you have requested.
>
> -Paul
>
>
> On Mon, Nov 3, 2014 at 8:53 PM, Ralph Castain <rhc.open...@gmail.com> wrote:
>
>> Hmmm...Paul, would you be able to try this with the latest trunk tarball?
>> This looks familiar to me, and I wonder if we are just missing a changeset
>> from the trunk that fixed the handshake issues we had with failing over
>> from one transport to another.
>>
>> Ralph
>>
>> On Nov 3, 2014, at 7:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> Ralph,
>>
>> Requested output is attached.
>>
>> I have a Linux/x86 system with the same network configuration and will
>> soon be able to determine if the problem is specific to Solaris.
>>
>> -Paul
>>
>>
>> On Mon, Nov 3, 2014 at 7:11 PM, Ralph Castain <rhc.open...@gmail.com>
>> wrote:
>>
>>> Could you please set -mca oob_base_verbose 20? I'm not sure why the
>>> connection is failing.
>>>
>>> Thanks
>>> Ralph
>>>
>>> On Nov 3, 2014, at 5:56 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>>
>>> Not clear if the following failure is Solaris-specific, but it *IS* a
>>> regression relative to 1.8.3.
>>>
>>> The system has 2 IPV4 interfaces:
>>> Ethernet on 172.16.0.119/16
>>> IPoIB on 172.18.0.119/16
>>>
>>> $ ifconfig bge0
>>> bge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500
>>> index 2
>>>         inet 172.16.0.119 netmask ffff0000 broadcast 172.16.255.255
>>> $ ifconfig pFFFF.ibp0
>>> pFFFF.ibp0:
>>> flags=1001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FIXEDMTU> mtu 2044
>>> index 3
>>>         inet 172.18.0.119 netmask ffff0000 broadcast 172.18.255.255
>>>
>>> However, I get a message from mca/oob/tcp about not being able to
>>> communicate between these two interfaces ON THE SAME NODE:
>>>
>>> $ /shared/OMPI/openmpi-1.8.4rc1-solaris11-x86-ib-ss12u3/INST/bin/mpirun
>>> -mca btl sm,self,openib -np 1 -host pcp-j-19 examples/ring_c
>>> [pcp-j-19:00899] mca_oob_tcp_accept: accept() failed: Error 0 (0).
>>> ------------------------------------------------------------
>>> A process or daemon was unable to complete a TCP connection
>>> to another process:
>>>   Local host:    pcp-j-19
>>>   Remote host:   172.18.0.119
>>> This is usually caused by a firewall on the remote host. Please
>>> check that any firewall (e.g., iptables) has been disabled and
>>> try again.
>>> ------------------------------------------------------------
>>>
>>> Let me know what sort of verbose options I should use to gather any
>>> additional info you may need.
>>>
>>> -Paul
>>>
>>> On Fri, Oct 31, 2014 at 7:14 PM, Ralph Castain <rhc.open...@gmail.com>
>>> wrote:
>>>
>>>> Hi folks
>>>>
>>>> I know 1.8.4 isn't entirely complete just yet, but I'd like to get a
>>>> head start on the testing so we can release by Fri Nov 7th. So please take
>>>> a little time and test the current tarball:
>>>>
>>>> http://www.open-mpi.org/software/ompi/v1.8/
>>>>
>>>> Thanks
>>>> Ralph
>>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/devel/2014/10/16138.php
>>>>
>>>
>>>
>>> --
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16160.php
>>>
>>>
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/devel/2014/11/16161.php
>>>
>>
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> <oob_base_verbose=20.txt>_______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16162.php
>>
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2014/11/16163.php
>>
>
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/11/16165.php

diff --git a/config/opal_config_pthreads.m4 b/config/opal_config_pthreads.m4
index 7bc0bfe..eb7803c 100644
--- a/config/opal_config_pthreads.m4
+++ b/config/opal_config_pthreads.m4
@@ -178,7 +178,14 @@ AC_DEFUN([OPAL_INTL_POSIX_THREADS_PLAIN_C], [
 #
 if test "$opal_pthread_c_success" = "0"; then
   AC_MSG_CHECKING([if C compiler and POSIX threads work as is])
-
+  case "${host_cpu}-${host_os}" in
+    *solaris*)
+      if test "`echo $CPPFLAGS | $GREP 'D_REENTRANT'`" = ""; then
+        PTHREAD_CPPFLAGS="-D_REENTRANT"
+        CPPFLAGS="$CPPFLAGS $PTHREAD_CPPFLAGS"
+      fi
+      ;;
+  esac
   AC_LANG_PUSH(C)
   OPAL_INTL_PTHREAD_TRY_LINK(opal_pthread_c_success=1,
                             opal_pthread_c_success=0)
@@ -198,7 +205,14 @@ AC_DEFUN([OPAL_INTL_POSIX_THREADS_PLAIN_CXX], [
 #
 if test "$opal_pthread_cxx_success" = "0"; then
   AC_MSG_CHECKING([if C++ compiler and POSIX threads work as is])
-
+  case "${host_cpu}-${host_os}" in
+    *solaris*)
+      if test "`echo $CXXCPPFLAGS | $GREP 'D_REENTRANT'`" = ""; then
+        PTHREAD_CXXCPPFLAGS="-D_REENTRANT"
+        CXXCPPFLAGS="$CXXCPPFLAGS $PTHREAD_CXXCPPFLAGS"
+      fi
+      ;;
+  esac
   AC_LANG_PUSH(C++)
   OPAL_INTL_PTHREAD_TRY_LINK(opal_pthread_cxx_success=1,
                             opal_pthread_cxx_success=0)
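
For anyone who wants to try the flag handling outside of configure, here is a rough standalone sketch in plain POSIX sh of what the two hunks do: on a Solaris host, add -D_REENTRANT to the preprocessor flags unless it is already there. The helper name append_flag_once and the sample values of host_os and CPPFLAGS are made up for illustration and are not part of the patch. The sketch also matches the flag as a whole word, whereas the patch greps for the substring D_REENTRANT.

```sh
#!/bin/sh
# Hypothetical helper (not in the patch): print the flag string with the
# given flag appended, unless that flag is already present as a whole word.
append_flag_once() {
    flags=$1
    flag=$2
    case " $flags " in
        *" $flag "*) printf '%s\n' "$flags" ;;            # already present
        *) printf '%s\n' "${flags:+$flags }$flag" ;;      # append once
    esac
}

# Stand-ins for values that configure would normally provide (assumed).
host_os="solaris2.11"
CPPFLAGS="-I/opt/include"

# Same shape as the patch: only touch the flags on Solaris hosts.
case "$host_os" in
    *solaris*)
        CPPFLAGS=$(append_flag_once "$CPPFLAGS" "-D_REENTRANT")
        ;;
esac

echo "$CPPFLAGS"
```

Running the case block a second time leaves CPPFLAGS unchanged, which is the point of the guard around the assignment in the patch.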