Ralph,

Yes, it failed. Sorry, I had meant to include more of the output than I did (see below).

My Solaris systems moved (physically relocated the disks) yesterday between
what *should* have been essentially identical hardware.
At the moment I am looking into the ssh message, though I am sure I should
have all the host keys associated with the correct hostnames and IPs already.

-Paul

full output:

$ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35 examples/ring_c'
[pcp-j-35:01400] [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103] setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
ssh_exchange_identification: Connection closed by remote host^M
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

* compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).
--------------------------------------------------------------------------

On Fri, Mar 20, 2015 at 7:13 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Hi Paul
>
> It should have kept running, albeit with that warning - did the program
> actually fail?
>
>
> On Mar 19, 2015, at 10:05 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Seen earlier today with last night's master tarball:
>
> $ mpirun -mca btl sm,self,verbs -np 2 -host pcp-j-31,pcp-j-35
> examples/ring_c'
> [pcp-j-35:01400]
> [/shared/OMPI/openmpi-master-solaris11-x64-ib-ss12u3/openmpi-dev-1351-gccba8ce/orte/mca/oob/tcp/oob_tcp_common.c:103]
> setsockopt(TCP_KEEPALIVE) failed: Option not supported by protocol (99)
>
> -Paul
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17138.php
>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17139.php
>

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900