On Jul 19, 2008, at 7:06 AM, Bill Broadley wrote:

I built openmpi-1.2.6 on centos-5.2 with gcc-4.3.1.

I did a tar xvzf, cd openmpi-1.2.6, mkdir obj, cd obj:
(I put gcc-4.3.1/bin first in my path)
../configure --prefix=/opt/pkg/openmpi-1.2.6 --enable-shared --enable-debug

If I look in config.log I see:
MCA_btl_ALL_COMPONENTS=' self sm gm mvapi mx openib portals tcp udapl'
MCA_btl_DSO_COMPONENTS=' self sm openib tcp'

So both openib and tcp are available and have many parameters under
ompi_info --param btl tcp
ompi_info --param btl openib

Yet, when I run an MPI program I can't get it to use TCP:
# which mpirun
/opt/pkg/openmpi-1.2.6/bin/mpirun
# mpirun -mca btl ^openib -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.304 sec ( 2.320 us/hop) 1683 KB/sec

Or if I try the inverse:
# mpirun -mca btl self,tcp -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.313 sec ( 2.386 us/hop) 1637 KB/sec

2.3 us is definitely faster than GigE. I don't have IP over IB set up; ifconfig -a shows ib0, but it has no IP address.
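
As a quick sanity check on the numbers above, the per-hop figure can be recomputed from the reported hop count and elapsed time. This is only a sketch: it assumes "us/hop" is simply elapsed time divided by the number of hops, which the relay output suggests but does not state, and the helper name us_per_hop is made up for illustration.

```python
# Recompute the per-hop time from the figures printed by the two relay runs.
# Assumption: "us/hop" = elapsed seconds / hops, expressed in microseconds.

def us_per_hop(elapsed_sec: float, hops: int) -> float:
    """Return the average time per hop in microseconds."""
    return elapsed_sec / hops * 1e6

# (elapsed seconds, hops) copied from the mpirun output above
runs = {
    "-mca btl ^openib": (0.304, 131072),
    "-mca btl self,tcp": (0.313, 131072),
}

for label, (elapsed, hops) in runs.items():
    print(f"{label}: {us_per_hop(elapsed, hops):.3f} us/hop")
```

Both results agree with the reported 2.320 and 2.386 us/hop to within the rounding of the printed elapsed time, so the two runs really are within about 3% of each other.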


Sorry for the delay in replying.

What exactly is the relay program timing? Can you run a standard benchmark like NetPIPE, perchance? (http://www.scl.ameslab.gov/netpipe/)

--
Jeff Squyres
Cisco Systems
