On Tue, 27 Aug 2019 14:36:54 -0500
Cooper Burns via users <users@lists.open-mpi.org> wrote:

> Hello all,
> 
> I have been doing some MPI benchmarking on an InfiniBand cluster.
> 
> Specs are:
> 12 cores/node
> 2.9 GHz/core
> InfiniBand interconnect (TCP also available)
> 
> Some runtime numbers:
> 192 cores total: (16 nodes)
> IntelMPI:
> 0.4 seconds
> OpenMPI 3.1.3 (--mca btl ^tcp):
> 2.5 seconds
> OpenMPI 3.1.3 (--mca btl ^openib):
> 26 seconds

A ~6x gap (0.4 s vs 2.5 s) is quite a difference...

Here are a few possible reasons I can think of:

1) The app was placed/pinned differently by the two MPIs. On its own this
usually would not cause such a big difference, but it is cheap to rule
out (see the binding check below).

2) Bad luck with collective performance. Different MPIs have different
weak spots across the parameter space of rank count, transfer size and
MPI collective, so one library's weak spot can line up exactly with what
your app does. A micro-benchmark sweep (sketched below) would show
whether that is what you are hitting.

3) You're not on Mellanox InfiniBand but on QLogic/Intel (TrueScale)
InfiniBand. There the openib BTL is better than TCP but still far from
ideal, because TrueScale's native transport is PSM (see the check below).

4) You changed more than the MPI, for example Intel compilers + Intel MPI
vs. Open MPI + gcc.
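
For 1), the placement is easy to check. A rough sketch, assuming Open
MPI's mpirun and a recent Intel MPI (./app and the rank count are just
placeholders):

  # Open MPI: print where each rank ends up bound
  mpirun -np 192 --map-by core --bind-to core --report-bindings ./app

  # Intel MPI: I_MPI_DEBUG=4 prints the pinning map at startup
  I_MPI_DEBUG=4 mpirun -np 192 ./app

If one library packs 12 ranks per node and the other leaves them unbound
or spread out, that alone can skew a sub-second kernel.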
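
For 2), rebuilding a standard micro-benchmark against each MPI takes the
application out of the picture. A sketch, assuming the OSU
micro-benchmarks are built with each library's mpicc and that allreduce
is the collective that matters (swap in whatever your app actually uses):

  # Open MPI over openib, same flag as your run; sweeps message sizes
  mpirun -np 192 --mca btl ^tcp ./osu_allreduce

  # same benchmark rebuilt against Intel MPI
  mpirun -np 192 ./osu_allreduce

If the numbers only diverge for one collective or message-size range, it
is a tuning problem rather than a transport problem.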
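
For 3), it only takes a minute to check what the HCAs actually are and,
if they turn out to be TrueScale, to point Open MPI at the native
transport. The MCA settings below are the usual ones for PSM; ./app is
again a placeholder:

  # Mellanox adapters show up as mlx*, QLogic/TrueScale as qib*
  ibv_devinfo | grep hca_id

  # on TrueScale, use the PSM MTL instead of the openib BTL
  mpirun -np 192 --mca pml cm --mca mtl psm ./app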

/Peter K 
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
