Something is definitely wrong -- 1.4us is way too high for a 0- or 1-byte HRT 
ping-pong.  What is this all2all benchmark, btw?  Is it measuring an 
MPI_ALLTOALL, or a ping-pong?

FWIW, on an older Nehalem machine running NetPIPE/MPI, I'm getting about 0.27us 
latencies for short messages over the sm BTL with binding to socket.
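
For reference, numbers like these typically come from a simple ping-pong loop 
where the reported latency is half the average round-trip time; a minimal 
sketch of such a measurement (illustrative only -- not the actual all2all 
benchmark, whose source isn't shown in this thread):

/* Minimal 0-byte ping-pong latency sketch (illustrative only; not the
   all2all benchmark discussed here).  Build with mpicc and run with:
   mpirun -np 2 ./pingpong */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i, iters = 100000;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    /* latency = half of the average round-trip time */
    if (rank == 0)
        printf("latency: %.3f us\n", (t1 - t0) / (2.0 * iters) * 1e6);

    MPI_Finalize();
    return 0;
}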


On Feb 14, 2012, at 7:20 AM, Matthias Jurenz wrote:

> I've built Open MPI 1.5.5rc1 (tarball from the web) with CFLAGS=-O3. 
> Unfortunately, that also had no effect.
> 
> Here are some results with binding reports enabled:
> 
> $ mpirun *--bind-to-core* --report-bindings -np 2 ./all2all_ompi1.5.5
> [n043:61313] [[56788,0],0] odls:default:fork binding child [[56788,1],1] to 
> cpus 0002
> [n043:61313] [[56788,0],0] odls:default:fork binding child [[56788,1],0] to 
> cpus 0001
> latency: 1.415us
> 
> $ mpirun *-mca maffinity hwloc --bind-to-core* --report-bindings -np 2 
> ./all2all_ompi1.5.5
> [n043:61469] [[49736,0],0] odls:default:fork binding child [[49736,1],1] to 
> cpus 0002
> [n043:61469] [[49736,0],0] odls:default:fork binding child [[49736,1],0] to 
> cpus 0001
> latency: 1.4us
> 
> $ mpirun *-mca maffinity first_use --bind-to-core* --report-bindings -np 2 
> ./all2all_ompi1.5.5
> [n043:61508] [[49681,0],0] odls:default:fork binding child [[49681,1],1] to 
> cpus 0002
> [n043:61508] [[49681,0],0] odls:default:fork binding child [[49681,1],0] to 
> cpus 0001
> latency: 1.4us
> 
> 
> $ mpirun *--bind-to-socket* --report-bindings -np 2 ./all2all_ompi1.5.5
> [n043:61337] [[56780,0],0] odls:default:fork binding child [[56780,1],1] to 
> socket 0 cpus 0001
> [n043:61337] [[56780,0],0] odls:default:fork binding child [[56780,1],0] to 
> socket 0 cpus 0001
> latency: 4.0us
> 
> $ mpirun *-mca maffinity hwloc --bind-to-socket* --report-bindings -np 2 
> ./all2all_ompi1.5.5 
> [n043:61615] [[49914,0],0] odls:default:fork binding child [[49914,1],1] to 
> socket 0 cpus 0001
> [n043:61615] [[49914,0],0] odls:default:fork binding child [[49914,1],0] to 
> socket 0 cpus 0001
> latency: 4.0us
> 
> $ mpirun *-mca maffinity first_use --bind-to-socket* --report-bindings -np 2 
> ./all2all_ompi1.5.5 
> [n043:61639] [[49810,0],0] odls:default:fork binding child [[49810,1],1] to 
> socket 0 cpus 0001
> [n043:61639] [[49810,0],0] odls:default:fork binding child [[49810,1],0] to 
> socket 0 cpus 0001
> latency: 4.0us
> 
> 
> If socket binding is enabled, it seems that all ranks are bound to the very 
> first core of one and the same socket. Is this intended? I expected that each 
> rank gets its own socket (i.e. 2 ranks -> 2 sockets)...
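
One way to double-check where each rank actually lands, independent of 
--report-bindings, is to print the kernel's affinity mask from inside the 
application; a minimal Linux-only sketch (assumes glibc's sched_getaffinity; 
not part of the benchmark discussed above):

/* Illustrative Linux-only check of each rank's CPU affinity mask. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, cpu;
    cpu_set_t mask;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        printf("rank %d may run on CPUs:", rank);
        for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
            if (CPU_ISSET(cpu, &mask))
                printf(" %d", cpu);
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}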
> 
> Matthias
> 
> On Monday 13 February 2012 22:36:50 Jeff Squyres wrote:
>> Also, double check that you have an optimized build, not a debugging build.
>> 
>> SVN and HG checkouts default to debugging builds, which add in lots of
>> latency.
>> 
>> On Feb 13, 2012, at 10:22 AM, Ralph Castain wrote:
>>> Few thoughts
>>> 
>>> 1. Bind to socket is broken in 1.5.4 - fixed in next release
>>> 
>>> 2. Add --report-bindings to cmd line and see where it thinks the procs
>>> are bound
>>> 
>>> 3. Sounds like memory may not be local - might be worth checking mem
>>> binding.
>>> 
>>> Sent from my iPad
>>> 
>>> On Feb 13, 2012, at 7:07 AM, Matthias Jurenz <matthias.jurenz@tu-
> dresden.de> wrote:
>>>> Hi Sylvain,
>>>> 
>>>> thanks for the quick response!
>>>> 
>>>> Here are some results with process binding enabled. I hope I used the
>>>> parameters correctly...
>>>> 
>>>> bind two ranks to one socket:
>>>> $ mpirun -np 2 --bind-to-core ./all2all
>>>> $ mpirun -np 2 -mca mpi_paffinity_alone 1 ./all2all
>>>> 
>>>> bind two ranks to two different sockets:
>>>> $ mpirun -np 2 --bind-to-socket ./all2all
>>>> 
>>>> All three runs resulted in similarly bad latencies (~1.4us).
>>>> 
>>>> :-(
>>>> 
>>>> Matthias
>>>> 
>>>> On Monday 13 February 2012 12:43:22 sylvain.jeau...@bull.net wrote:
>>>>> Hi Matthias,
>>>>> 
>>>>> You might want to play with process binding to see if your problem is
>>>>> related to bad memory affinity.
>>>>> 
>>>>> Try to launch pingpong on two CPUs of the same socket, then on
>>>>> different sockets (i.e. bind each process to a core, and try different
>>>>> configurations).
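
For comparison, explicit placements (e.g. two cores on the same socket vs. one 
core on each socket) can also be forced with a rankfile; a hedged sketch -- the 
hostname and slot numbers below are illustrative and have to be adapted to the 
node's actual numbering:

$ cat rankfile
rank 0=n043 slot=0
rank 1=n043 slot=1
$ mpirun -np 2 -rf rankfile ./all2all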
>>>>> 
>>>>> Sylvain
>>>>> 
>>>>> 
>>>>> 
>>>>> From:    Matthias Jurenz <matthias.jur...@tu-dresden.de>
>>>>> To:      Open MPI Developers <de...@open-mpi.org>
>>>>> Date:    13/02/2012 12:12
>>>>> Subject: [OMPI devel] poor btl sm latency
>>>>> Sent by: devel-boun...@open-mpi.org
>>>>> 
>>>>> 
>>>>> 
>>>>> Hello all,
>>>>> 
>>>>> on our new AMD cluster (AMD Opteron 6274, 2.2GHz) we get very bad
>>>>> latencies (~1.5us) when performing 0-byte p2p communication on a single
>>>>> node using the Open MPI sm BTL. When using Platform MPI we get ~0.5us
>>>>> latencies, which is pretty good. The bandwidth results are similar for
>>>>> both MPI implementations (~3.3GB/s) - this is okay.
>>>>> 
>>>>> One node has 64 cores and 64 GB RAM. It doesn't seem to matter how many
>>>>> ranks are allocated by the application - we get similar results with
>>>>> different numbers of ranks.
>>>>> 
>>>>> We are using Open MPI 1.5.4, built with gcc 4.3.4 without any special
>>>>> configure options except the installation prefix and the location of
>>>>> the LSF installation.
>>>>> 
>>>>> As mentioned at http://www.open-mpi.org/faq/?category=sm we tried to
>>>>> use /dev/shm instead of /tmp for the session directory, but it had no
>>>>> effect. Furthermore, we tried the current release candidate 1.5.5rc1 of
>>>>> Open MPI, which provides an option to use SysV shared memory (-mca shmem
>>>>> sysv) - this also results in similarly poor latencies.
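
For reference, the two variants tried above correspond to command lines roughly 
like the following (a sketch; orte_tmpdir_base is assumed here to be the MCA 
parameter that relocates the session directory, and the sysv option requires 
1.5.5rc1):

$ mpirun -mca orte_tmpdir_base /dev/shm -np 2 ./all2all   # session directory in /dev/shm
$ mpirun -mca shmem sysv -np 2 ./all2all                  # SysV shared memory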
>>>>> 
>>>>> Do you have any idea? Please help!
>>>>> 
>>>>> Thanks,
>>>>> Matthias


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

