George,

Looks like I have some values already set for openib and gm bandwidth:
# ompi_info --param all all |grep -i band
                MCA btl: parameter "btl_gm_bandwidth" (current value: "250")
                MCA btl: parameter "btl_mvapi_bandwidth" (current value: "800")
                         Approximate maximum bandwidth of interconnect
                MCA btl: parameter "btl_openib_bandwidth" (current
value: "800")
                         Approximate maximum bandwidth of interconnect
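
If I understand the MCA machinery correctly, these defaults can be
overridden either per run or in an mca-params.conf file; for example
(the 8000/2000 below are just placeholders, not measured values):

# mpirun --mca btl_openib_bandwidth 8000 --mca btl_gm_bandwidth 2000 ...

or, equivalently, in $HOME/.openmpi/mca-params.conf:

btl_openib_bandwidth = 8000
btl_gm_bandwidth = 2000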

In contrast, ompi_info reports no available parameters dealing with latency:
# ompi_info --param all all |grep -i laten
<no output>

Also, I'm not entirely sure what value to set the latency to,
especially for tcp; it depends on so many factors and varies. Why
does the latency value have an effect on message striping? I can see
how knowing the bandwidth limitations of the available interconnects
would let you proportionally divide a message among them, but
latency? For large message sizes especially, the transfer time should
be dominated by the bandwidth limitations.
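
To make my reasoning concrete, here is the toy calculation I have in
mind. This is only my guess at the cost model (latency + size/bandwidth
per link, with the message split in proportion to bandwidth), not a
description of Open MPI's actual scheduler, and every number in it is
invented:

# Toy cost model only, not Open MPI's actual scheduler: time to push
# s bytes over one link is roughly  latency + s / bandwidth.
def xfer_time(size_bytes, latency_us, bw_bytes_per_us):
    return latency_us + size_bytes / bw_bytes_per_us

# Two imaginary links; every number here is made up for illustration.
links = {"fast": (5.0, 800.0),   # (latency in usec, bandwidth in bytes/usec)
         "slow": (10.0, 250.0)}

total_bw = sum(bw for _, bw in links.values())
for size in (4 * 1024, 8 * 1024 * 1024):      # small vs. large message
    for name, (lat, bw) in links.items():
        chunk = size * bw / total_bw          # split in proportion to bandwidth only
        t = xfer_time(chunk, lat, bw)
        print("%8d bytes on %s: %10.1f usec (latency is %.2f%% of that)"
              % (size, name, t, 100.0 * lat / t))

For the 8 MB message the latency term contributes a fraction of a
percent of the total time, which is why I don't see what the latency
values buy us for large-message striping.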

Finally, what are the units for the bandwidth and latency MCA
parameters, and how did you arrive at the values you set in your
params file? Is there a description of the message striping algorithm
somewhere (other than the code :) )?

Thanks,
Alex.


On 2/8/07, George Bosilca <bosi...@cs.utk.edu> wrote:
In order to get any performance improvement from striping the
messages over multiple interconnects, one has to specify the latency
and bandwidth for these interconnects, and make sure that none of
them asks for exclusivity. I'm usually running over multiple TCP
interconnects, and here is my mca-params.conf file:
btl_tcp_if_include = eth0,eth1
btl_tcp_max_rdma_size = 524288

btl_tcp_latency_eth0 = 47
btl_tcp_bandwidth_eth0 = 587

btl_tcp_latency_eth1 = 51
btl_tcp_bandwidth_eth1 = 233

Something similar has to be done for openib and gm, in order to allow
us to stripe the messages correctly.
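
For example, something along these lines in mca-params.conf. The
numbers are placeholders to be replaced with measured values, and you
should check with ompi_info --param btl openib and ompi_info --param
btl gm which of these parameters your build actually exposes (the
latency ones may not exist in 1.1.4):

btl_openib_bandwidth = 8000
btl_openib_latency = 5

btl_gm_bandwidth = 2000
btl_gm_latency = 10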

   Thanks,
     george.

On Feb 8, 2007, at 12:02 PM, Alex Tumanov wrote:

> Hello Jeff. Thanks for pointing out NetPipe to me. I've played around
> with it a little in the hope of seeing clear evidence of message
> striping in Open MPI. Unfortunately, what I saw is that the result of
> running NPmpi over several interconnects is identical to running it
> over the single fastest one :-( That was not the expected behavior,
> and I'm hoping that I'm doing something wrong. I'm using NetPIPE_3.6.2
> over OMPI 1.1.4. NetPipe was compiled by making sure Open MPI's mpicc
> could be found and simply running 'make mpi' under the NetPIPE_3.6.2
> directory.
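> For completeness, the build amounted to roughly the following (the
> PATH line is just how I make mpicc findable in my setup):
> # export PATH=$MPIHOME/bin:$PATH
> # cd NetPIPE_3.6.2
> # make mpi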
>
> I experimented with 3 interconnects: openib, gm, and gig-e.
> Specifically, I found that the times (and, correspondingly, the
> bandwidth) reported for openib+gm are pretty much identical to the
> times reported for just openib. Here are the commands I used to
> initiate the benchmark:
>
> #  mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl openib,gm,self
> ~/NPmpi > ~/testdir/ompi/netpipe/ompi_netpipe_openib+gm.log 2>&1
> #  mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl openib,self ~/NPmpi
>> ompi_netpipe_openib.log 2>&1
>
> Similarly, for tcp+gm the reported times were identical to just
> running the benchmark over gm alone. The commands were:
> #  mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl tcp,gm,self --mca
> btl_tcp_if_exclude lo,ib0,ib1 ~/NPmpi
> #  mpirun -H f0-0,c0-0 --prefix $MPIHOME --mca btl gm,self ~/NPmpi
>
> Orthogonally, I've also observed that trying to use any combination of
> interconnects that includes openib (except using it exclusively) hangs
> as soon as the benchmark reaches trials with 1.5 MB message sizes: the
> CPU load stays at 100% on the headnode, but no further output is sent
> to the log file or the screen (see the tails below). This behavior is
> fairly consistent and may be of interest to the Open MPI development
> community. If anybody has tried using openib in combination with other
> interconnects, please let me know what issues you've encountered and
> what tips and tricks you could share in this regard.
>
> Many thanks. Keep up the good work!
>
> Sincerely,
> Alex.
>
> Tails of the log files (each file name reflects the combination of
> interconnects, in that command-line order):
> # tail ompi_netpipe_gm+openib.log
> 101:  786432 bytes     38 times -->   3582.46 Mbps in    1674.83 usec
> 102:  786435 bytes     39 times -->   3474.50 Mbps in    1726.87 usec
> 103: 1048573 bytes     19 times -->   3592.47 Mbps in    2226.87 usec
> 104: 1048576 bytes     22 times -->   3515.15 Mbps in    2275.86 usec
> 105: 1048579 bytes     21 times -->   3480.22 Mbps in    2298.71 usec
> 106: 1572861 bytes     21 times -->   4174.76 Mbps in    2874.41 usec
> 107: 1572864 bytes     23 times --> mpirun: killing job...
>
> # tail ompi_netpipe_openib+gm.log
> 100:  786429 bytes     45 times -->   3477.98 Mbps in    1725.13 usec
> 101:  786432 bytes     38 times -->   3578.94 Mbps in    1676.47 usec
> 102:  786435 bytes     39 times -->   3480.66 Mbps in    1723.82 usec
> 103: 1048573 bytes     19 times -->   3594.26 Mbps in    2225.76 usec
> 104: 1048576 bytes     22 times -->   3517.46 Mbps in    2274.37 usec
> 105: 1048579 bytes     21 times -->   3482.13 Mbps in    2297.45 usec
> 106: 1572861 bytes     21 times --> mpirun: killing job...
>
> # tail ompi_netpipe_openib+tcp+gm.log
> 100:  786429 bytes     45 times -->   3481.45 Mbps in    1723.41 usec
> 101:  786432 bytes     38 times -->   3575.83 Mbps in    1677.93 usec
> 102:  786435 bytes     39 times -->   3479.05 Mbps in    1724.61 usec
> 103: 1048573 bytes     19 times -->   3589.68 Mbps in    2228.61 usec
> 104: 1048576 bytes     22 times -->   3517.96 Mbps in    2274.05 usec
> 105: 1048579 bytes     21 times -->   3484.12 Mbps in    2296.14 usec
> 106: 1572861 bytes     21 times --> mpirun: killing job...
>
> # tail -5 ompi_netpipe_openib.log
> 119: 6291456 bytes      5 times -->   4036.63 Mbps in   11891.10 usec
> 120: 6291459 bytes      5 times -->   4005.81 Mbps in   11982.61 usec
> 121: 8388605 bytes      3 times -->   4033.78 Mbps in   15866.00 usec
> 122: 8388608 bytes      3 times -->   4025.50 Mbps in   15898.66 usec
> 123: 8388611 bytes      3 times -->   4017.58 Mbps in   15929.98 usec

"Half of what I say is meaningless; but I say it so that the other
half may reach you"
                                   Kahlil Gibran


