On Jul 19, 2008, at 7:06 AM, Bill Broadley wrote:
I built openmpi-1.2.6 on centos-5.2 with gcc-4.3.1.
I did a tar xvzf, cd openmpi-1.2.6, mkdir obj, cd obj:
(I put gcc-4.3.1/bin first in my path)
../configure --prefix=/opt/pkg/openmpi-1.2.6 --enable-shared --enable-debug
If I look in config.log I see:
MCA_btl_ALL_COMPONENTS=' self sm gm mvapi mx openib portals tcp udapl'
MCA_btl_DSO_COMPONENTS=' self sm openib tcp'
So both openib and tcp are available and have many parameters under
ompi_info --param btl tcp
ompi_info --param btl openib
Yet when I run an MPI program, I can't get it to use TCP:
# which mpirun
/opt/pkg/openmpi-1.2.6/bin/mpirun
# mpirun -mca btl ^openib -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.304 sec ( 2.320 us/hop)
1683 KB/sec
Or if I try the inverse:
# mpirun -mca btl self,tcp -np 2 -machinefile m ./relay 1
compute-0-1.local compute-0-0.local
size= 1, 131072 hops, 2 nodes in 0.313 sec ( 2.386 us/hop)
1637 KB/sec
2.3 us is definitely faster than GigE. I don't have IP over IB set up;
ifconfig -a shows ib0, but it has no IP address.
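As a sanity check on the figures above, the per-hop number is presumably just total time divided by hop count. A minimal sketch of that arithmetic, assuming that is how relay computes it (us_per_hop is a hypothetical helper name, not part of relay or Open MPI):

```shell
# Hypothetical helper: per-hop latency in microseconds,
# assuming us/hop = total seconds / hops * 1e6.
us_per_hop() {
    awk -v secs="$1" -v hops="$2" 'BEGIN { printf "%.3f\n", secs / hops * 1e6 }'
}

# Totals and hop counts taken from the two runs quoted above.
us_per_hop 0.304 131072
us_per_hop 0.313 131072
```

Both results land within a few nanoseconds of the reported us/hop values; the small difference is likely rounding in the printed total time. Either way, both runs are in the same 2.3 to 2.4 us range, which is why the second run does not look like it went over TCP.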
Sorry for the delay in replying.
What exactly is the relay program timing? Can you run a standard
benchmark like NetPIPE, perchance? (http://www.scl.ameslab.gov/netpipe/)
--
Jeff Squyres
Cisco Systems