On Fri, Jul 08, 2011 at 12:09:09PM -0700, Steve Kargl wrote:
> On Fri, Jul 08, 2011 at 02:19:27PM -0400, Jeff Squyres wrote:
> > 
> > The easiest way to fix this is likely to use the btl_tcp_if_include
> > or btl_tcp_if_exclude MCA parameters -- i.e., tell OMPI exactly
> > which interfaces to use:
> > 
> >     http://www.open-mpi.org/faq/?category=tcp#tcp-selection
> > 
> 
> Perhaps, I'm again misreading the output, but it appears that
> 1.4.4rc2 does not even see the 2nd nic.
> 

So now I'm very confused!  Using '--mca btl_tcp_if_include bge1,bge0'
seems to work even though Open MPI reports on node11 that bge1 is invalid,
and if I reverse the interfaces to '--mca btl_tcp_if_include bge0,bge1' the
process appears to hang. :(
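
Since the "invalid interface" message comes from node11 only, one thing
worth checking is whether every name in the include list actually exists
on every node. The following is only a rough sketch: missing_ifaces is a
hypothetical helper (not part of Open MPI), and the usage line in the
comment assumes a BSD-style ifconfig that accepts -l to list interface
names.

```shell
# Hypothetical helper: print each name from a comma-separated include
# list (first argument) that is absent from a space-separated list of
# interfaces present on this host (second argument).
missing_ifaces() {
    wanted=$1
    present=$2
    for name in $(printf '%s' "$wanted" | tr ',' ' '); do
        case " $present " in
            *" $name "*) ;;                 # interface exists here
            *) printf '%s\n' "$name" ;;     # interface is missing here
        esac
    done
}

# Assumed usage on each node (BSD-style ifconfig):
#   missing_ifaces bge0,bge1 "$(ifconfig -l)"
```

Any name it prints on a given node is one that the node cannot use with
btl_tcp_if_include.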


hpc:kargl[341] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
  --mca btl_tcp_if_include bge1,bge0 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13885,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22024] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org
Latency: 0.000073644
Sync Time: 0.000147468
Now starting main loop
  0:         0 bytes 16384 times -->    0.00 Mbps in 0.000073622 sec
  1:         1 bytes 16384 times -->    0.10 Mbps in 0.000073617 sec
  2:         2 bytes 3395  times -->    0.21 Mbps in 0.000073634 sec
  3:         3 bytes 1697  times -->    0.31 Mbps in 0.000073611 sec
...
126:  12582914 bytes    3  times -->  720.84 Mbps in 0.133178830 sec
[hpc.apl.washington.edu:12390] mca: base: close: component self closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component self
[hpc.apl.washington.edu:12390] mca: base: close: component tcp closed
[hpc.apl.washington.edu:12390] mca: base: close: unloading component tcp
[node11.cimu.org:22024] mca: base: close: component self closed
[node11.cimu.org:22024] mca: base: close: unloading component self
[node11.cimu.org:22024] mca: base: close: component tcp closed
[node11.cimu.org:22024] mca: base: close: unloading component tcp


hpc:kargl[342] /usr/local/openmpi-1.4.4/bin/mpiexec --mca btl_base_verbose 10 \
--mca btl_tcp_if_include bge0,bge1 --mca btl tcp,self -machinefile mf1 ./z
...
[node11.cimu.org][[13868,1],1][btl_tcp_component.c:468:\
mca_btl_tcp_component_create_instances] invalid interface "bge1"
[node11.cimu.org:22048] select: init of component tcp returned success
0: hpc.apl.washington.edu
1: node11.cimu.org

and then nothing!

-- 
Steve
