On Fri, Jul 08, 2011 at 12:09:09PM -0700, Steve Kargl wrote:
> On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote:
> >
> > The easiest way to fix this is likely to use the btl_tcp_if_include
> > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly
> > which interfaces to use:
> >
> > http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> >
>
> Perhaps, I'm again misreading the output, but it appears that
> 1.4.4rc2 does not even see the 2nd nic.
>

So, now, I'm very confused!  Using '--mca btl_tcp_if_include bge1,bge0'
seems to work even though openmpi says that bge1 is invalid, and if I
reverse the interfaces to '--mca btl_tcp_if_include bge0,bge1' the
process appears stuck. :(

hpc:kargl[341] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
  --mca btl_tcp_if_include bge1,bge0 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13885,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22024] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org
Latency: 0.000073644
Sync Time: 0.000147468
Now starting main loop
  0: 0 bytes 16384 times --> 0.00 Mbps in 0.000073622 sec
  1: 1 bytes 16384 times --> 0.10 Mbps in 0.000073617 sec
  2: 2 bytes 3395 times --> 0.21 Mbps in 0.000073634 sec
  3: 3 bytes 1697 times --> 0.31 Mbps in 0.000073611 sec
...
126: 12582914 bytes 3 times --> 720.84 Mbps in 0.133178830 sec
[hpc.apl.washington.edu:12390] mca: base: close: component self closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component self
[hpc.apl.washington.edu:12390] mca: base: close: component tcp closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component tcp
[node11.cimu.org:22024] mca: base: close: component self closed
[node11.cimu.org:22024] mca: base: close: unloading component self
[node11.cimu.org:22024] mca: base: close: component tcp closed
[node11.cimu.org:22024] mca: base: close: unloading component tcp

hpc:kargl[342] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
  --mca btl_tcp_if_include bge0,bge1 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13868,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22048] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org

and nothing!

-- 
Steve