[OMPI users] tcp of openmpi-1.7.3 under our environment is very slow

2013-12-16 Thread tmishima


Hi,

I usually use an InfiniBand network, where openmpi-1.7.3 and 1.6.5 both work fine.

The other day, I had a chance to use a TCP network (1GbE), and I noticed that
my application with openmpi-1.7.3 was much slower than with openmpi-1.6.5.
So I ran the OSU MPI Bandwidth Test v3.1.1 as shown below, which shows that
bandwidth for smaller sizes (< 1024) is very low compared with 1.6.5.
In addition, the latency for larger sizes (> 65536) seems to be strange.

Does this depend on our local environment, or would some MCA parameter be
necessary? I'm afraid that something is wrong with the TCP support in
openmpi-1.7.3.
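As a sketch of what could be tuned, these are the TCP BTL knobs I would experiment with (parameter names as reported by ompi_info; the values below are illustrative guesses, not a verified fix):

```shell
# List the TCP BTL parameters available in this build
ompi_info --param btl tcp

# Candidate knobs to experiment with (values are illustrative only)
mpirun -np 2 -host node07,node08 \
    --mca btl ^openib \
    --mca btl_tcp_eager_limit 65536 \
    --mca btl_tcp_sndbuf 131072 \
    --mca btl_tcp_rcvbuf 131072 \
    osu_bw
```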

openmpi-1.7.3:

[mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
^openib osu_bw
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1                 0.00
2                 0.01
4                 0.01
8                 0.03
16                0.05
32                0.10
64                0.32
128               0.37
256               0.87
512               5.97
1024             20.00
2048            182.87
4096            202.53
8192            215.14
16384           225.16
32768           228.58
65536           115.23
131072          198.24
262144          193.38
524288          233.03
1048576         227.31
2097152         233.07
4194304         233.25

[mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
^openib osu_latency
# OSU MPI Latency Test v3.1.1
# Size      Latency (us)
0              19.23
1              19.57
2              19.52
4              19.88
8              20.44
16             20.38
32             20.78
64             21.14
128            21.75
256            23.20
512            26.12
1024           31.54
2048           41.72
4096           64.55
8192          107.52
16384         179.23
32768         251.58
65536       20689.68
131072      21179.79
262144      20168.56
524288      22984.83
1048576     25994.54
2097152     30929.55
4194304     38028.48

openmpi-1.6.5:

[mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
^openib osu_bw
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1                 0.22
2                 0.45
4                 0.89
8                 1.77
16                3.57
32                7.15
64               14.28
128              28.58
256              57.17
512              96.44
1024            152.38
2048            182.84
4096            203.17
8192            215.13
16384           225.05
32768           100.58
65536           225.24
131072          182.92
262144          192.82
524288          212.92
1048576         233.35
2097152         233.72
4194304         233.89

[mishima@node07 OMB-3.1.1]$ mpirun -np 2 -host node07,node08 -mca btl
^openib osu_latency
# OSU MPI Latency Test v3.1.1
# Size      Latency (us)
0              17.24
1              17.30
2              17.29
4              17.30
8              24.32
16             17.24
32             17.80
64             17.91
128            19.08
256            20.81
512            22.83
1024           27.82
2048           39.54
4096           52.66
8192           97.70
16384         143.23
32768         215.02
65536         481.08
131072        800.64
262144       1475.12
524288       2698.62
1048576      4992.31
2097152      9558.96
4194304     20801.50

Regards,
Tetsuya Mishima



Re: [OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Jeff Squyres (jsquyres)
On Dec 16, 2013, at 2:24 PM, Gus Correa  wrote:

> A question, for the benefit of OMPI 1.6.5 users (stable-version die hards 
> like us here).
> When fixes like Ake's are applied to a stable version,
> do they make it to the (1.6.5) tarball or to some other code base?

They are currently going into the 1.6.x nightly tarballs.  For example, I just 
committed some minor fixes last week, and then Ake's fix today (and another 
long-standing trivial fix) to the 1.6 branch, which automatically triggers a 
nightly tarball build:

http://www.open-mpi.org/nightly/v1.6/

> How innocuous would it be not to apply the typo fix
> caught by Ake, and keep OMPI 1.6.5 programs running?
> Would opal be lobotomized?

Nah, that was a totally trivial fix.  It won't affect the correctness or 
performance of your Open MPI 1.6.x installation.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-16 Thread Noam Bernstein
Has anyone tried to use openmpi 1.7.3 with the latest CentOS kernel  
(well, nearly latest: 2.6.32-431.el6.x86_64), and especially with infiniband? 

I'm seeing lots of weird slowdowns, especially when using InfiniBand, but
even when running with "--mca btl self,sm" (it's much worse with IB,
though). Has anyone else tested this kernel yet?

Once I have some more detailed information I'll follow up.

Noam

Re: [OMPI users] [EXTERNAL] Re: Configuration for rendezvous and eager protocols: two-sided comm

2013-12-16 Thread Siddhartha Jana
Noted. Thanks all for the tips !
On 16-Dec-2013 2:36 pm, "Jeff Squyres (jsquyres)" wrote:

> Everything that Brian said, plus: note that the MCA param that Christoph
> mentioned is specifically for the "sm" (shared memory) transport.  Each
> transport has their own set of MCA params (e.g., mca_btl_tcp_eager_limit,
> and friends).
>
>
> On Dec 16, 2013, at 3:19 PM, "Barrett, Brian W" 
> wrote:
>
> > Siddhartha -
> >
> > Christoph mentioned how to change the cross-over for shared memory, but
> it's really per-transport (so you'd have to change it for your off-node
> transport as well).  That's all in the FAQ you mentioned, so hopefully you
> can take it from there.  Note that, in general, moving the eager limits has
> some unintended side effects.  For example, it can cause more / less
> copies.  It can also greatly increase memory usage.
> >
> > Good luck,
> >
> > Brian
> >
> > On 12/16/13 1:49 AM, "Siddhartha Jana" 
> wrote:
> >
> >> Thanks Christoph.
> >> I should have looked into the FAQ section on MCA params setting @ :
> >> http://www.open-mpi.org/faq/?category=tuning#available-mca-params
> >>
> >> Thanks again,
> >> -- Siddhartha
> >>
> >>
> >> On 16 December 2013 02:41, Christoph Niethammer 
> wrote:
> >>> Hi Siddhartha,
> >>>
> >>> MPI_Send/Recv in Open MPI implements both protocols and chooses based
> on the message size which one to use.
> >>> You can use the mca parameter "btl_sm_eager_limit" to modify the
> behaviour.
> >>>
> >>> Here the corresponding info obtained from the ompi_info tool:
> >>>
> >>> "btl_sm_eager_limit" (current value: <4096>, data source: default
> value)
> >>> Maximum size (in bytes) of "short" messages (must be >= 1)
> >>>
> >>> Regards
> >>> Christoph Niethammer
> >>>
> >>> --
> >>>
> >>> Christoph Niethammer
> >>> High Performance Computing Center Stuttgart (HLRS)
> >>> Nobelstrasse 19
> >>> 70569 Stuttgart
> >>>
> >>> Tel: ++49(0)711-685-87203
> >>> email: nietham...@hlrs.de
> >>> http://www.hlrs.de/people/niethammer
> >>>
> >>>
> >>>
> >>> - Ursprüngliche Mail -
> >>> Von: "Siddhartha Jana" 
> >>> An: "OpenMPI users mailing list" 
> >>> Gesendet: Samstag, 14. Dezember 2013 13:44:12
> >>> Betreff: [OMPI users] Configuration for rendezvous and eager
> protocols: two-sided comm
> >>>
> >>>
> >>>
> >>> Hi
> >>>
> >>>
> >>> In OpenMPI, are MPI_Send, MPI_Recv (and friends) implemented using
> rendezvous protocol or eager protocol?
> >>>
> >>>
> >>> If both, is there a way to choose one or the other during runtime or
> while building the library?
> >>>
> >>>
> >>> If there is a threshold of the message size that dictates the protocol
> to be used, is there a way I can alter that threshold value?
> >>>
> >>>
> >>> If different protocols were used for different versions of the library
> in the past, could someone please direct me to the exact version numbers of
> the implementations that used one or the other protocol?
> >>>
> >>>
> >>> Thanks a lot,
> >>> Siddhartha
> >>> ___
> >>> users mailing list
> >>> us...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>> ___
> >>> users mailing list
> >>> us...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >
> >
> > --
> >   Brian W. Barrett
> >   Scalable System Software Group
> >   Sandia National Laboratories
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] [EXTERNAL] Re: Configuration for rendezvous and eager protocols: two-sided comm

2013-12-16 Thread Jeff Squyres (jsquyres)
Everything that Brian said, plus: note that the MCA param that Christoph 
mentioned is specifically for the "sm" (shared memory) transport.  Each 
transport has its own set of MCA params (e.g., mca_btl_tcp_eager_limit, and 
friends).
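For example, the per-transport limits can be listed straight from ompi_info, and more than one transport can be overridden at once (a sketch; exact parameter names can vary a bit across versions, and the values here are illustrative):

```shell
# Show the eager-limit threshold for each transport of interest
ompi_info --param btl sm  | grep eager_limit
ompi_info --param btl tcp | grep eager_limit

# Override both the shared-memory and TCP eager limits at launch time
mpirun -np 2 \
    --mca btl_sm_eager_limit 8192 \
    --mca btl_tcp_eager_limit 65536 \
    ./my_mpi_app
```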


On Dec 16, 2013, at 3:19 PM, "Barrett, Brian W"  wrote:

> Siddhartha -
> 
> Christoph mentioned how to change the cross-over for shared memory, but it's 
> really per-transport (so you'd have to change it for your off-node transport 
> as well).  That's all in the FAQ you mentioned, so hopefully you can take it 
> from there.  Note that, in general, moving the eager limits has some 
> unintended side effects.  For example, it can cause more / less copies.  It 
> can also greatly increase memory usage.
> 
> Good luck,
> 
> Brian
> 





Re: [OMPI users] [EXTERNAL] Re: Configuration for rendezvous and eager protocols: two-sided comm

2013-12-16 Thread Barrett, Brian W
Siddhartha -

Christoph mentioned how to change the cross-over for shared memory, but it's 
really per-transport (so you'd have to change it for your off-node transport as 
well).  That's all in the FAQ you mentioned, so hopefully you can take it from 
there.  Note that, in general, moving the eager limits has some unintended side 
effects.  For example, it can cause more / less copies.  It can also greatly 
increase memory usage.

Good luck,

Brian

On 12/16/13 1:49 AM, "Siddhartha Jana" wrote:

Thanks Christoph.
I should have looked into the FAQ section on MCA params setting @ :
http://www.open-mpi.org/faq/?category=tuning#available-mca-params

Thanks again,
-- Siddhartha


On 16 December 2013 02:41, Christoph Niethammer 
> wrote:
Hi Siddhartha,

MPI_Send/Recv in Open MPI implements both protocols and chooses based on the 
message size which one to use.
You can use the mca parameter "btl_sm_eager_limit" to modify the behaviour.

Here the corresponding info obtained from the ompi_info tool:

"btl_sm_eager_limit" (current value: <4096>, data source: default value)
Maximum size (in bytes) of "short" messages (must be >= 1)

Regards
Christoph Niethammer

--

Christoph Niethammer
High Performance Computing Center Stuttgart (HLRS)
Nobelstrasse 19
70569 Stuttgart

Tel: ++49(0)711-685-87203
email: nietham...@hlrs.de
http://www.hlrs.de/people/niethammer



- Ursprüngliche Mail -
Von: "Siddhartha Jana" 
>
An: "OpenMPI users mailing list" >
Gesendet: Samstag, 14. Dezember 2013 13:44:12
Betreff: [OMPI users] Configuration for rendezvous and eager protocols: 
two-sided comm



Hi


In OpenMPI, are MPI_Send, MPI_Recv (and friends) implemented using rendezvous 
protocol or eager protocol?


If both, is there a way to choose one or the other during runtime or while 
building the library?


If there is a threshold of the message size that dictates the protocol to be 
used, is there a way I can alter that threshold value?


If different protocols were used for different versions of the library in the 
past, could someone please direct me to the exact version numbers of the 
implementations that used one or the other protocol?


Thanks a lot,
Siddhartha
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
  Brian W. Barrett
  Scalable System Software Group
  Sandia National Laboratories


Re: [OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Gus Correa

Hi Jeff

A question, for the benefit of OMPI 1.6.5 users (stable-version die 
hards like us here).

When fixes like Ake's are applied to a stable version,
do they make it to the (1.6.5) tarball or to some other code base?

How innocuous would it be not to apply the typo fix
caught by Ake, and keep OMPI 1.6.5 programs running?
Would opal be lobotomized?

Many thanks,
Gus Correa

On 12/16/2013 01:45 PM, Jeff Squyres (jsquyres) wrote:

Fixed -- thanks!

(I confirmed that it's not an issue in the 1.7 series, too)





Re: [OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Jeff Squyres (jsquyres)
Fixed -- thanks!

(I confirmed that it's not an issue in the 1.7 series, too)


On Dec 16, 2013, at 1:36 PM, Ake Sandgren  wrote:

> Hi!
> 
> Not sure if this has been caught already or not, but there is a typo in 
> opal/memoryhooks/memory.h in 1.6.5.
> 
> #ifndef OPAL_MEMORY_MEMORY_H
> #define OPAl_MEMORY_MEMORY_H
> 
> Note the lower case "l" in the define.
> 
> /Åke S.
> 





[OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Ake Sandgren
Hi!

Not sure if this has been caught already or not, but there is a typo in 
opal/memoryhooks/memory.h in 1.6.5.

#ifndef OPAL_MEMORY_MEMORY_H
#define OPAl_MEMORY_MEMORY_H

Note the lower case "l" in the define.

/Åke S.



Re: [OMPI users] Configuration for rendezvous and eager protocols: two-sided comm

2013-12-16 Thread Siddhartha Jana
Thanks Christoph.
I should have looked into the FAQ section on MCA params setting @ :
http://www.open-mpi.org/faq/?category=tuning#available-mca-params

Thanks again,
-- Siddhartha


On 16 December 2013 02:41, Christoph Niethammer  wrote:



Re: [OMPI users] Configuration for rendezvous and eager protocols: two-sided comm

2013-12-16 Thread Christoph Niethammer
Hi Siddhartha,

MPI_Send/Recv in Open MPI implement both protocols and choose which one to
use based on the message size.
You can use the MCA parameter "btl_sm_eager_limit" to modify the behaviour.

Here the corresponding info obtained from the ompi_info tool:

"btl_sm_eager_limit" (current value: <4096>, data source: default value)
Maximum size (in bytes) of "short" messages (must be >= 1)
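For instance, the current value can be checked and overridden at launch time like this (a sketch; the 4096-byte default is from the ompi_info output above, the 16384 override is an arbitrary example):

```shell
# Inspect the current shared-memory eager limit
ompi_info --param btl sm | grep btl_sm_eager_limit

# Force messages up to 16 KiB to take the eager path over shared memory
mpirun -np 2 --mca btl_sm_eager_limit 16384 ./my_mpi_app
```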

Regards
Christoph Niethammer

--

Christoph Niethammer
High Performance Computing Center Stuttgart (HLRS)
Nobelstrasse 19
70569 Stuttgart

Tel: ++49(0)711-685-87203
email: nietham...@hlrs.de
http://www.hlrs.de/people/niethammer



- Ursprüngliche Mail -
Von: "Siddhartha Jana" 
An: "OpenMPI users mailing list" 
Gesendet: Samstag, 14. Dezember 2013 13:44:12
Betreff: [OMPI users] Configuration for rendezvous and eager protocols: 
two-sided comm



Hi 


In OpenMPI, are MPI_Send, MPI_Recv (and friends) implemented using rendezvous 
protocol or eager protocol? 


If both, is there a way to choose one or the other during runtime or while 
building the library? 


If there is a threshold of the message size that dictates the protocol to be 
used, is there a way I can alter that threshold value? 


If different protocols were used for different versions of the library in the 
past, could someone please direct me to the exact version numbers of the 
implementations that used one or the other protocol? 


Thanks a lot, 
Siddhartha 
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users