Using 2 HCAs on the same PCI-Express bus (as well as 2 ports from the same HCA)
will not improve performance; PCI-Express is the bottleneck.
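
For a rough sense of why, here is a back-of-the-envelope comparison in Python; it
assumes PCIe Gen1 x8 slots and 4x DDR HCAs, which the thread doesn't actually confirm:

    # Back-of-envelope link budget; all figures are per direction.
    # Assumptions (not confirmed in the thread): PCIe Gen1 x8 slot, IB 4x DDR HCAs.
    ib_ddr_4x_gbps = 4 * 5.0 * 8 / 10   # 4 lanes * 5 Gb/s signalling, 8b/10b encoding
    pcie_g1x8_gbps = 8 * 2.5 * 8 / 10   # 8 lanes * 2.5 GT/s signalling, 8b/10b encoding

    print("IB 4x DDR data rate per HCA : %.0f Gb/s (~%.0f MB/s)"
          % (ib_ddr_4x_gbps, ib_ddr_4x_gbps * 1000 / 8))
    print("PCIe Gen1 x8 raw data rate  : %.0f Gb/s, less after TLP/DLLP overhead"
          % pcie_g1x8_gbps)
    # Under these assumptions a single DDR HCA can already fill a Gen1 x8 link,
    # so a second HCA (or second port) behind the same PCIe path has no headroom.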


On Mon, Oct 20, 2008 at 2:28 AM, Mostyn Lewis <mostyn.le...@sun.com> wrote:

> Well, here's what I see with the IMB PingPong test using two ConnectX DDR
> cards
> in each of 2 machines. I'm just quoting the last line at 10 repetitions of
> the 4194304 bytes.
>
> Scali_MPI_Connect-5.6.4-59151: (multi rail setup in /etc/dat.conf)
>       #bytes #repetitions      t[usec]   Mbytes/sec
>      4194304           10      2198.24      1819.63
> mvapich2-1.2rc2: (MV2_NUM_HCAS=2 MV2_NUM_PORTS=1)
>       #bytes #repetitions      t[usec]   Mbytes/sec
>      4194304           10      2427.24      1647.96
> OpenMPI SVN 19772:
>       #bytes #repetitions      t[usec]   Mbytes/sec
>      4194304           10      3676.35      1088.03
>
> Repeatable within bounds.
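
As an aside, launches along these lines would drive runs like the ones above; host
names and the IMB-MPI1 path are placeholders, and the multi-rail behaviour comes
from the settings already quoted (dat.conf for Scali, the MV2_* variables for MVAPICH2):

    # MVAPICH2, striping across both HCAs (env vars as quoted above)
    mpirun_rsh -np 2 node1 node2 MV2_NUM_HCAS=2 MV2_NUM_PORTS=1 ./IMB-MPI1 PingPong

    # Open MPI, default behaviour (uses every active port it finds)
    mpirun -np 2 -host node1,node2 --mca btl openib,self ./IMB-MPI1 PingPong
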
>
> This is OFED-1.3.1 and I peered at
> /sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_packets
> and
> /sys/class/infiniband/mlx4_1/ports/1/counters/port_rcv_packets
> on one of the 2 machines and looked at what happened for Scali
> and OpenMPI.
>
> Scali packets:
> HCA 0 port1 = 115116625 - 114903198 = 213427
> HCA 1 port1 =  78099566 -  77886143 = 213423
> --------------------------------------------
>                                      426850
> OpenMPI packets:
> HCA 0 port1 = 115233624 - 115116625 = 116999
> HCA 1 port1 =  78216425 -  78099566 = 116859
> --------------------------------------------
>                                      233858
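
For anyone who wants to repeat that check, a small Python sketch along these lines
does the before/after subtraction; the counter paths are the sysfs locations quoted above:

    # Snapshot the receive-packet counters before and after a run and print deltas.
    counters = [
        "/sys/class/infiniband/mlx4_0/ports/1/counters/port_rcv_packets",
        "/sys/class/infiniband/mlx4_1/ports/1/counters/port_rcv_packets",
    ]

    def snapshot():
        return [int(open(path).read()) for path in counters]

    before = snapshot()
    input("Run the benchmark, then press Enter... ")
    after = snapshot()
    for path, b, a in zip(counters, before, after):
        print("%s: %d packets" % (path, a - b))
    print("total: %d packets" % (sum(after) - sum(before)))
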
>
> Scali is set up so that data larger than 8192 bytes is striped
> across the 2 HCAs, using 8192 bytes per HCA in round-robin fashion.
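
To illustrate that policy (this is only a sketch of the idea, not Scali's actual code),
a round-robin split of a message into 8192-byte chunks across 2 rails looks like:

    STRIPE_BYTES = 8192   # per-HCA chunk size described above
    NUM_RAILS    = 2      # two HCAs

    def stripe(length, rails=NUM_RAILS, chunk=STRIPE_BYTES):
        """Return (rail, offset, size) tuples for a message of `length` bytes."""
        plan, offset, rail = [], 0, 0
        while offset < length:
            size = min(chunk, length - offset)
            plan.append((rail, offset, size))
            offset += size
            rail = (rail + 1) % rails
        return plan

    # A 4194304-byte PingPong message splits into 512 chunks, 256 per HCA:
    plan = stripe(4194304)
    print([sum(s for r, _, s in plan if r == i) for i in range(NUM_RAILS)])
    # -> [2097152, 2097152]
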
>
> So, it seems that OpenMPI is using both ports but strangely ends
> up with a Mbytes/sec rate that is worse than with a single HCA alone.
> If I use --mca btl_openib_if_exclude mlx4_1:1, we get:
>       #bytes #repetitions      t[usec]   Mbytes/sec
>      4194304           10      3080.59      1298.45
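
For what it's worth, the single-rail/dual-rail comparison can also be pinned down
explicitly; if I have the parameter name right, btl_openib_if_include is the counterpart
of the if_exclude used above, the device names are the mlx4_0/mlx4_1 seen under
/sys/class/infiniband, and host names and the IMB path are placeholders:

    # Single rail only, HCA 0 port 1:
    mpirun -np 2 -host node1,node2 --mca btl openib,self \
           --mca btl_openib_if_include mlx4_0:1 ./IMB-MPI1 PingPong

    # Both rails (default; equivalent to not excluding anything):
    mpirun -np 2 -host node1,node2 --mca btl openib,self ./IMB-MPI1 PingPong
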
>
> So, what's taking so long? Is this a threading question?
>
> DM
>
>
> On Sun, 19 Oct 2008, Jeff Squyres wrote:
>
>> On Oct 18, 2008, at 9:19 PM, Mostyn Lewis wrote:
>>
>>> Can OpenMPI, like Scali and MVAPICH2, utilize 2 IB HCAs per machine
>>> to approach double the bandwidth on simple tests such as IMB PingPong?
>>
>> Yes.  OMPI will automatically (and aggressively) use as many active ports
>> as you have.  So you shouldn't need to list devices+ports -- OMPI will
>> simply use all ports that it finds in the active state.  If your ports are
>> on physically separate IB networks, then each IB network will require a
>> different subnet ID so that OMPI can compute reachability properly.
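
A quick way to check both conditions, that the ports are ACTIVE and, for physically
separate fabrics, that their subnet prefixes differ, is to read them out of the same
sysfs tree used earlier in the thread; a rough Python sketch:

    import glob

    # Each port directory has a human-readable state ("4: ACTIVE") and GID 0,
    # whose first 64 bits are the subnet prefix assigned by the subnet manager.
    for port_dir in sorted(glob.glob("/sys/class/infiniband/*/ports/*")):
        state  = open(port_dir + "/state").read().strip()
        gid0   = open(port_dir + "/gids/0").read().strip()
        prefix = ":".join(gid0.split(":")[:4])
        print("%-45s %-12s subnet prefix %s" % (port_dir, state, prefix))
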
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
