Gary,

I previously missed the fact that you are running on a 10GbE network, and I
still assume you are not running a debug build.

You may need to increase the send/recv buffer sizes.
ompi_info --all | grep btl_tcp
will list the parameters that can be tuned;
you can then set one with
mpirun --mca btl_tcp_<name> <value>
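
For example, here is a minimal sketch, assuming the buffer parameters show up
as btl_tcp_sndbuf and btl_tcp_rcvbuf in the ompi_info output for your version.
The 4 MiB value and the ./NPmpi benchmark path are only placeholders, not
recommendations. The script prints the command and only launches it when RUN=1
is set:

```shell
#!/bin/sh
# Sketch only: buffer size and benchmark path are placeholder assumptions.
BUF_BYTES=$((4 * 1024 * 1024))   # 4 MiB, an arbitrary starting point
BENCH=${BENCH:-./NPmpi}          # hypothetical path to the NetPIPE MPI binary

# Parameter names assumed to be btl_tcp_sndbuf / btl_tcp_rcvbuf; confirm with
#   ompi_info --all | grep btl_tcp
CMD="mpirun --mca btl_tcp_sndbuf $BUF_BYTES --mca btl_tcp_rcvbuf $BUF_BYTES -np 2 $BENCH"

# Print the command; set RUN=1 in the environment to actually launch it.
echo "$CMD"
if [ "${RUN:-0}" = "1" ]; then
    $CMD
fi
```

Trying a few sizes and comparing the NetPIPE curves should show whether the
buffers are the limiting factor.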

Cheers,

Gilles

On Friday, March 11, 2016, Jackson, Gary L. <gary.jack...@jhuapl.edu> wrote:

> I re-ran all experiments with 1.10.2 configured the way you specified. My
> results are here:
>
> https://www.dropbox.com/s/4v4jaxe8sflgymj/collected.pdf?dl=0
>
> Some remarks:
>
>    1. OpenMPI had poor performance relative to raw TCP and IMPI across
>    all MTUs.
>    2. Those issues appeared at larger message sizes.
>    3. Intel MPI and raw TCP were comparable across message sizes and MTUs.
>
> With respect to some other concerns:
>
>    1. I verified that the MTU values I'm using are correct with tracepath.
>    2. I am using a placement group.
>
> --
> Gary Jackson
>
> From: users <users-boun...@open-mpi.org> on behalf of Gilles Gouaillardet
> <gil...@rist.or.jp>
> Reply-To: Open MPI Users <us...@open-mpi.org>
> Date: Tuesday, March 8, 2016 at 11:07 PM
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Poor performance on Amazon EC2 with TCP
>
> Jackson,
>
> One more thing: how did you build Open MPI?
>
> If you built from git (and without VPATH), then --enable-debug is
> automatically set, and this hurts performance.
> If you have not already done so, I recommend you download the latest
> Open MPI tarball (1.10.2) and configure it with
> ./configure --with-platform=contrib/platform/optimized --prefix=...
> Last but not least, you can try
> mpirun --mca mpi_leave_pinned 1 <your benchmark>
> (that said, I am not sure this is useful with TCP networks ...)
>
> Cheers,
>
> Gilles
>
>
>
> On 3/9/2016 11:34 AM, Rayson Ho wrote:
>
> If you are using instance types that support SR-IOV (a.k.a. "enhanced
> networking" in AWS), then turn it on. We saw huge differences when SR-IOV
> is enabled.
>
>
> http://blogs.scalablelogic.com/2013/12/enhanced-networking-in-aws-cloud.html
>
> http://blogs.scalablelogic.com/2014/01/enhanced-networking-in-aws-cloud-part-2.html
>
> Make sure you start your instances with a placement group -- otherwise,
> the instances can be data centers apart!
>
> And check that jumbo frames are enabled properly:
>
> http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html
>
> But still, it is interesting that Intel MPI is getting a 2X speedup with
> the same setup! Can you post the raw numbers so that we can take a deeper
> look??
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
> http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html
>
>
>
>
> On Tue, Mar 8, 2016 at 9:08 AM, Jackson, Gary L.
> <gary.jack...@jhuapl.edu> wrote:
>
>>
>> I've built OpenMPI 1.10.1 on Amazon EC2. Using NetPIPE, I'm seeing about
>> half the performance for MPI over TCP as I do with raw TCP. Before I start
>> digging in to this more deeply, does anyone know what might cause that?
>>
>> For what it's worth, I see the same issues with MPICH, but I do not see
>> it with Intel MPI.
>>
>> --
>> Gary Jackson
>>
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/03/28659.php
>>
>
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2016/03/28665.php
>
>
>
