Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-10 Thread Christoph Niethammer
Hello,

I can confirm that it works for me, too.
Thanks! 

Best
Christoph Niethammer



- Original Message -
From: tmish...@jcity.maeda.co.jp
To: "Open MPI Developers" <devel@lists.open-mpi.org>
Sent: Wednesday, August 10, 2016 5:58:50 AM
Subject: Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

Finally it worked, thanks!

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ ompi_info --param btl openib
--level 5 | grep openib_flags
  MCA btl openib: parameter "btl_openib_flags" (current value:
"65847", data source: default, level: 5 tuner/det
ail, type: unsigned_int)
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings
osu_bw
[manage.cluster:14439] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
[B/B/B/B/B/B][./././././.]
[manage.cluster:14439] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
[B/B/B/B/B/B][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.72
2              3.52
4              7.01
8             14.11
16            28.17
32            55.90
64            99.83
128          159.13
256          272.98
512          476.35
1024         911.49
2048        1319.96
4096        1767.78
8192        2169.53
16384       2507.96
32768       2957.28
65536       3206.90
131072      3610.33
262144      3985.18
524288      4379.47
1048576     4560.90
2097152     4661.44
4194304     4631.21


Tetsuya Mishima

On 2016/08/10 11:57:29, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Ack, the segv is due to a typo from transcribing the patch. Fixed. Please
try the following patch and let me know if it fixes the issues.
>
>
https://github.com/hjelmn/ompi/commit/4079eec9749e47dddc6acc9c0847b3091601919f.patch

>
> -Nathan
>
> > On Aug 8, 2016, at 9:48 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > The latest patch also causes a segfault...
> >
> > By the way, I found a typo as below. _pml_ob1.use_all_rdma in the
last
> > line should be _pml_ob1.use_all_rdma:
> >
> > +mca_pml_ob1.use_all_rdma = false;
> > +(void) mca_base_component_var_register
> > (_pml_ob1_component.pmlm_version, "use_all_rdma",
> > +   "Use all available RDMA
btls
> > for the RDMA and RDMA pipeline protocols "
> > +   "(default: false)",
> > MCA_BASE_VAR_TYPE_BOOL, NULL, 0, 0,
> > +   OPAL_INFO_LVL_5,
> > MCA_BASE_VAR_SCOPE_GROUP, _pml_ob1.use_all_rdma);
> > +
> >
> > Here is the OSU_BW and gdb output:
> >
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              2.19
> > 2              4.43
> > 4              8.98
> > 8             18.07
> > 16            35.58
> > 32            70.62
> > 64           108.88
> > 128          172.97
> > 256          305.73
> > 512          536.48
> > 1024         957.57
> > 2048        1587.21
> > 4096        1638.81
> > 8192        2165.14
> > 16384       2482.43
> > 32768       2866.33
> > 65536       3655.33
> > 131072      4208.40
> > 262144      4596.12
> > 524288      4769.27
> > 1048576     4900.00
> > [manage:16596] *** Process received signal ***
> > [manage:16596] Signal: Segmentation fault (11)
> > [manage:16596] Signal code: Address not mapped (1)
> > [manage:16596] Failing at address: 0x8
> > ...
> > Core was generated by `osu_bw'.
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> > (gdb) where
> > #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> > #1  0x0031d9008934 in _Unwind_Backtrace ()
from /lib64/libgcc_s.so.1
> > #2  0x0037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> > #3  0x2b5060c14345 in opal_backtrace_print ()
> > at ./backtrace_execinfo.c:47
> > #4  0x2b5060c11180 in

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-09 Thread tmishima
Finally it worked, thanks!

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ ompi_info --param btl openib
--level 5 | grep openib_flags
  MCA btl openib: parameter "btl_openib_flags" (current value:
"65847", data source: default, level: 5 tuner/det
ail, type: unsigned_int)
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings
osu_bw
[manage.cluster:14439] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
[B/B/B/B/B/B][./././././.]
[manage.cluster:14439] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket
0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
[B/B/B/B/B/B][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.72
2              3.52
4              7.01
8             14.11
16            28.17
32            55.90
64            99.83
128          159.13
256          272.98
512          476.35
1024         911.49
2048        1319.96
4096        1767.78
8192        2169.53
16384       2507.96
32768       2957.28
65536       3206.90
131072      3610.33
262144      3985.18
524288      4379.47
1048576     4560.90
2097152     4661.44
4194304     4631.21


Tetsuya Mishima

On 2016/08/10 11:57:29, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Ack, the segv is due to a typo from transcribing the patch. Fixed. Please
try the following patch and let me know if it fixes the issues.
>
>
https://github.com/hjelmn/ompi/commit/4079eec9749e47dddc6acc9c0847b3091601919f.patch

>
> -Nathan
>
> > On Aug 8, 2016, at 9:48 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > The latest patch also causes a segfault...
> >
> > By the way, I found a typo as below. _pml_ob1.use_all_rdma in the
last
> > line should be _pml_ob1.use_all_rdma:
> >
> > +mca_pml_ob1.use_all_rdma = false;
> > +(void) mca_base_component_var_register
> > (_pml_ob1_component.pmlm_version, "use_all_rdma",
> > +   "Use all available RDMA
btls
> > for the RDMA and RDMA pipeline protocols "
> > +   "(default: false)",
> > MCA_BASE_VAR_TYPE_BOOL, NULL, 0, 0,
> > +   OPAL_INFO_LVL_5,
> > MCA_BASE_VAR_SCOPE_GROUP, _pml_ob1.use_all_rdma);
> > +
> >
> > Here is the OSU_BW and gdb output:
> >
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              2.19
> > 2              4.43
> > 4              8.98
> > 8             18.07
> > 16            35.58
> > 32            70.62
> > 64           108.88
> > 128          172.97
> > 256          305.73
> > 512          536.48
> > 1024         957.57
> > 2048        1587.21
> > 4096        1638.81
> > 8192        2165.14
> > 16384       2482.43
> > 32768       2866.33
> > 65536       3655.33
> > 131072      4208.40
> > 262144      4596.12
> > 524288      4769.27
> > 1048576     4900.00
> > [manage:16596] *** Process received signal ***
> > [manage:16596] Signal: Segmentation fault (11)
> > [manage:16596] Signal code: Address not mapped (1)
> > [manage:16596] Failing at address: 0x8
> > ...
> > Core was generated by `osu_bw'.
> > Program terminated with signal 11, Segmentation fault.
> > #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> > (gdb) where
> > #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> > #1  0x0031d9008934 in _Unwind_Backtrace ()
from /lib64/libgcc_s.so.1
> > #2  0x0037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> > #3  0x2b5060c14345 in opal_backtrace_print ()
> > at ./backtrace_execinfo.c:47
> > #4  0x2b5060c11180 in show_stackframe () at ./stacktrace.c:331
> > #5  
> > #6  mca_pml_ob1_recv_request_schedule_once ()
at ./pml_ob1_recvreq.c:983
> > #7  0x2aaab461c71a in mca_pml_ob1_recv_request_progress_rndv ()
> >
> >
from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
> > #8  0x2aa

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-09 Thread Nathan Hjelm
Ack, the segv is due to a typo from transcribing the patch. Fixed. Please try 
the following patch and let me know if it fixes the issues.

https://github.com/hjelmn/ompi/commit/4079eec9749e47dddc6acc9c0847b3091601919f.patch

-Nathan
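
For reference, one way to try such a commit patch against an openmpi-2.0.0 source tree is sketched below; the directory layout and rebuild steps are only illustrative and depend on the local build:

    wget https://github.com/hjelmn/ompi/commit/4079eec9749e47dddc6acc9c0847b3091601919f.patch
    cd openmpi-2.0.0
    patch -p1 < ../4079eec9749e47dddc6acc9c0847b3091601919f.patch
    # rebuild and reinstall the patched tree before rerunning osu_bw, e.g.
    make -j && make install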

> On Aug 8, 2016, at 9:48 PM, tmish...@jcity.maeda.co.jp wrote:
> 
> The latest patch also causes a segfault...
> 
> By the way, I found a typo as below. _pml_ob1.use_all_rdma in the last
> line should be _pml_ob1.use_all_rdma:
> 
> +mca_pml_ob1.use_all_rdma = false;
> +(void) mca_base_component_var_register
> (_pml_ob1_component.pmlm_version, "use_all_rdma",
> +   "Use all available RDMA btls
> for the RDMA and RDMA pipeline protocols "
> +   "(default: false)",
> MCA_BASE_VAR_TYPE_BOOL, NULL, 0, 0,
> +   OPAL_INFO_LVL_5,
> MCA_BASE_VAR_SCOPE_GROUP, _pml_ob1.use_all_rdma);
> +
> 
> Here is the OSU_BW and gdb output:
> 
> # OSU MPI Bandwidth Test v3.1.1
> # Size      Bandwidth (MB/s)
> 1              2.19
> 2              4.43
> 4              8.98
> 8             18.07
> 16            35.58
> 32            70.62
> 64           108.88
> 128          172.97
> 256          305.73
> 512          536.48
> 1024         957.57
> 2048        1587.21
> 4096        1638.81
> 8192        2165.14
> 16384       2482.43
> 32768       2866.33
> 65536       3655.33
> 131072      4208.40
> 262144      4596.12
> 524288      4769.27
> 1048576     4900.00
> [manage:16596] *** Process received signal ***
> [manage:16596] Signal: Segmentation fault (11)
> [manage:16596] Signal code: Address not mapped (1)
> [manage:16596] Failing at address: 0x8
> ...
> Core was generated by `osu_bw'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> (gdb) where
> #0  0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> #1  0x0031d9008934 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
> #2  0x0037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> #3  0x2b5060c14345 in opal_backtrace_print ()
> at ./backtrace_execinfo.c:47
> #4  0x2b5060c11180 in show_stackframe () at ./stacktrace.c:331
> #5  
> #6  mca_pml_ob1_recv_request_schedule_once () at ./pml_ob1_recvreq.c:983
> #7  0x2aaab461c71a in mca_pml_ob1_recv_request_progress_rndv ()
> 
> from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
> #8  0x2aaab46198e5 in mca_pml_ob1_recv_frag_match ()
> at ./pml_ob1_recvfrag.c:715
> #9  0x2aaab4618e46 in mca_pml_ob1_recv_frag_callback_rndv ()
> at ./pml_ob1_recvfrag.c:267
> #10 0x2aaab37958d3 in mca_btl_vader_poll_handle_frag ()
> at ./btl_vader_component.c:589
> #11 0x2aaab3795b9a in mca_btl_vader_component_progress ()
> at ./btl_vader_component.c:231
> #12 0x2b5060bd16fc in opal_progress () at runtime/opal_progress.c:224
> #13 0x2b50600e9aa5 in ompi_request_default_wait_all () at
> request/req_wait.c:77
> #14 0x2b50601310dd in PMPI_Waitall () at ./pwaitall.c:76
> #15 0x00401108 in main () at ./osu_bw.c:144
> 
> 
> Tetsuya Mishima
> 
> On 2016/08/09 11:53:04, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>> No problem. Thanks for reporting this. Not all platforms see a slowdown
> so we missed it before the release. Let me know if that latest patch works
> for you.
>> 
>> -Nathan
>> 
>>> On Aug 8, 2016, at 8:50 PM, tmish...@jcity.maeda.co.jp wrote:
>>> 
>>> I understood. Thanks.
>>> 
>>> Tetsuya Mishima
>>> 
>>> On 2016/08/09 11:33:15, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>>>> I will add a control to have the new behavior or using all available
> RDMA
>>> btls or just the eager ones for the RDMA protocol. The flags will
> remain as
>>> they are. And, yes, for 2.0.0 you can set the btl
>>>> flags if you do not intend to use MPI RMA.
>>>> 
>>>> New patch:
>>>> 
>>>> 
>>> 
> https://github.com/hjelmn/ompi/commit/43267012e58d78e3fc713b98c6fb9f782de977c7.patch
> 
>>> 
>>>> 
>>>> -Nathan
>>>> 
>>>>> On Aug 8, 2016, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>> 
>>>&g

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-08 Thread Nathan Hjelm
No problem. Thanks for reporting this. Not all platforms see a slowdown so we 
missed it before the release. Let me know if that latest patch works for you.

-Nathan

> On Aug 8, 2016, at 8:50 PM, tmish...@jcity.maeda.co.jp wrote:
> 
> I understood. Thanks.
> 
> Tetsuya Mishima
> 
> 2016/08/09 11:33:15、"devel"さんは「Re: [OMPI devel] sm BTL performace of
> the openmpi-2.0.0」で書きました
>> I will add a control to have the new behavior or using all available RDMA
> btls or just the eager ones for the RDMA protocol. The flags will remain as
> they are. And, yes, for 2.0.0 you can set the btl
>> flags if you do not intend to use MPI RMA.
>> 
>> New patch:
>> 
>> 
> https://github.com/hjelmn/ompi/commit/43267012e58d78e3fc713b98c6fb9f782de977c7.patch
> 
>> 
>> -Nathan
>> 
>>> On Aug 8, 2016, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
>>> 
>>> Then, my understanding is that you will restore the default value of
>>> btl_openib_flags to the previous one (= 310) and add a new MCA parameter to
>>> control HCA inclusion for such a situation. The workaround so far for
>>> openmpi-2.0.0 is to set those flags manually. Right?
>>> 
>>> Tetsuya Mishima
>>> 
>>> On 2016/08/09 9:56:29, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>>>> Hmm, not good. So we have a situation where it is sometimes better to
>>> include the HCA when it is the only rdma btl. Will have a new version
> up in
>>> a bit that adds an MCA parameter to control the
>>>> behavior. The default will be the same as 1.10.x.
>>>> 
>>>> -Nathan
>>>> 
>>>>> On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>> 
>>>>> Hi, unfortunately it doesn't work well. The previous one was much
>>>>> better ...
>>>>> 
>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2
> -report-bindings
>>>>> osu_bw
>>>>> [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]],
>>> socket
>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
>>> 0]]:
>>>>> [B/B/B/B/B/B][./././././.]
>>>>> [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]],
>>> socket
>>>>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
>>>>> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
>>> 0]]:
>>>>> [B/B/B/B/B/B][./././././.]
>>>>> # OSU MPI Bandwidth Test v3.1.1
>>>>> # Size      Bandwidth (MB/s)
>>>>> 1              2.22
>>>>> 2              4.53
>>>>> 4              9.11
>>>>> 8             18.02
>>>>> 16            35.44
>>>>> 32            70.84
>>>>> 64           113.71
>>>>> 128          176.74
>>>>> 256          311.07
>>>>> 512          529.03
>>>>> 1024         907.83
>>>>> 2048        1597.66
>>>>> 4096         330.14
>>>>> 8192         516.49
>>>>> 16384        780.31
>>>>> 32768       1038.43
>>>>> 65536       1186.36
>>>>> 131072      1268.87
>>>>> 262144      1222.24
>>>>> 524288      1232.30
>>>>> 1048576     1244.62
>>>>> 2097152     1260.25
>>>>> 4194304     1263.47
>>>>> 
>>>>> Tetsuya
>>>>> 
>>>>> 
>>>>> On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>>>>>> Ok, there was a problem with the selection logic when only one rdma
>>>>> capable btl is available. I changed the logic to always use the RDMA
>>> btl
>>>>> over pipelined send/recv. This works better for me on a
>>>>>> Intel Omnipath system. Let me know if this works for you.
>>>>>> 
>>>>>> 
>>>>> 
>>> 
> https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch
> 

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-08 Thread tmishima
I understood. Thanks.

Tetsuya Mishima

On 2016/08/09 11:33:15, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> I will add a control to have the new behavior or using all available RDMA
btls or just the eager ones for the RDMA protocol. The flags will remain as
they are. And, yes, for 2.0.0 you can set the btl
> flags if you do not intend to use MPI RMA.
>
> New patch:
>
>
https://github.com/hjelmn/ompi/commit/43267012e58d78e3fc713b98c6fb9f782de977c7.patch

>
> -Nathan
>
> > On Aug 8, 2016, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > Then, my understanding is that you will restore the default value of
> > btl_openib_flags to the previous one (= 310) and add a new MCA parameter to
> > control HCA inclusion for such a situation. The workaround so far for
> > openmpi-2.0.0 is to set those flags manually. Right?
> >
> > Tetsuya Mishima
> >
> > On 2016/08/09 9:56:29, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> >> Hmm, not good. So we have a situation where it is sometimes better to
> > include the HCA when it is the only rdma btl. Will have a new version
up in
> > a bit that adds an MCA parameter to control the
> >> behavior. The default will be the same as 1.10.x.
> >>
> >> -Nathan
> >>
> >>> On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>
> >>> Hi, unfortunately it doesn't work well. The previous one was much
> >>> better ...
> >>>
> >>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2
-report-bindings
> >>> osu_bw
> >>> [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]],
> > socket
> >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> >>> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
> > 0]]:
> >>> [B/B/B/B/B/B][./././././.]
> >>> [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]],
> > socket
> >>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> >>> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
> > 0]]:
> >>> [B/B/B/B/B/B][./././././.]
> >>> # OSU MPI Bandwidth Test v3.1.1
> >>> # Size      Bandwidth (MB/s)
> >>> 1              2.22
> >>> 2              4.53
> >>> 4              9.11
> >>> 8             18.02
> >>> 16            35.44
> >>> 32            70.84
> >>> 64           113.71
> >>> 128          176.74
> >>> 256          311.07
> >>> 512          529.03
> >>> 1024         907.83
> >>> 2048        1597.66
> >>> 4096         330.14
> >>> 8192         516.49
> >>> 16384        780.31
> >>> 32768       1038.43
> >>> 65536       1186.36
> >>> 131072      1268.87
> >>> 262144      1222.24
> >>> 524288      1232.30
> >>> 1048576     1244.62
> >>> 2097152     1260.25
> >>> 4194304     1263.47
> >>>
> >>> Tetsuya
> >>>
> >>>
> >>> On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> >>>> Ok, there was a problem with the selection logic when only one rdma
> >>> capable btl is available. I changed the logic to always use the RDMA
> > btl
> >>> over pipelined send/recv. This works better for me on a
> >>>> Intel Omnipath system. Let me know if this works for you.
> >>>>
> >>>>
> >>>
> >
https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch

> >
> >>>
> >>>>
> >>>> -Nathan
> >>>>
> >>>> On Aug 07, 2016, at 10:00 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>
> >>>> Hi, here is the gdb output for additional information:
> >>>>
> >>>> (It might be inexact, because I built openmpi-2.0.0 without debug
> > option)
> >>>>
> >>>> Core was generated by `osu_bw'.
> >>>> Program terminated with signal 11, Segmentation fault.
> >>>> #0 0x0031d9008806 i

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-08 Thread tmishima
Then, my understanding is that you will restore the default value of
btl_openib_flags to the previous one (= 310) and add a new MCA parameter to
control HCA inclusion for such a situation. The workaround so far for
openmpi-2.0.0 is to set those flags manually. Right?

Tetsuya Mishima
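
A minimal sketch of the manual override discussed above, assuming the pre-2.0.0 default of 310 for btl_openib_flags (the value and placement here are illustrative only):

    mpirun --mca pml ob1 --mca btl_openib_flags 310 -np 2 osu_bw

The same setting can also go into an MCA parameter file such as $HOME/.openmpi/mca-params.conf as "btl_openib_flags = 310".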

On 2016/08/09 9:56:29, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Hmm, not good. So we have a situation where it is sometimes better to
include the HCA when it is the only rdma btl. Will have a new version up in
a bit that adds an MCA parameter to control the
> behavior. The default will be the same as 1.10.x.
>
> -Nathan
>
> > On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > Hi, unfortunately it doesn't work well. The previous one was much
> > better ...
> >
> > [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings
> > osu_bw
> > [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]],
socket
> > 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> > cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
0]]:
> > [B/B/B/B/B/B][./././././.]
> > [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]],
socket
> > 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> > cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt
0]]:
> > [B/B/B/B/B/B][./././././.]
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              2.22
> > 2              4.53
> > 4              9.11
> > 8             18.02
> > 16            35.44
> > 32            70.84
> > 64           113.71
> > 128          176.74
> > 256          311.07
> > 512          529.03
> > 1024         907.83
> > 2048        1597.66
> > 4096         330.14
> > 8192         516.49
> > 16384        780.31
> > 32768       1038.43
> > 65536       1186.36
> > 131072      1268.87
> > 262144      1222.24
> > 524288      1232.30
> > 1048576     1244.62
> > 2097152     1260.25
> > 4194304     1263.47
> >
> > Tetsuya
> >
> >
> > On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> >> Ok, there was a problem with the selection logic when only one rdma
> > capable btl is available. I changed the logic to always use the RDMA
btl
> > over pipelined send/recv. This works better for me on a
> >> Intel Omnipath system. Let me know if this works for you.
> >>
> >>
> >
https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch

> >
> >>
> >> -Nathan
> >>
> >> On Aug 07, 2016, at 10:00 PM, tmish...@jcity.maeda.co.jp wrote:
> >>
> >> Hi, here is the gdb output for additional information:
> >>
> >> (It might be inexact, because I built openmpi-2.0.0 without debug
option)
> >>
> >> Core was generated by `osu_bw'.
> >> Program terminated with signal 11, Segmentation fault.
> >> #0 0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> >> (gdb) where
> >> #0 0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
> >> #1 0x0031d9008934 in _Unwind_Backtrace ()
from /lib64/libgcc_s.so.1
> >> #2 0x0037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> >> #3 0x2ad882bd4345 in opal_backtrace_print ()
> >> at ./backtrace_execinfo.c:47
> >> #4 0x2ad882bd1180 in show_stackframe () at ./stacktrace.c:331
> >> #5 
> >> #6 mca_pml_ob1_recv_request_schedule_once ()
at ./pml_ob1_recvreq.c:983
> >> #7 0x2aaab412f47a in mca_pml_ob1_recv_request_progress_rndv ()
> >>
> >>
> >
from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
> >> #8 0x2aaab412c645 in mca_pml_ob1_recv_frag_match ()
> >> at ./pml_ob1_recvfrag.c:715
> >> #9 0x2aaab412bba6 in mca_pml_ob1_recv_frag_callback_rndv ()
> >> at ./pml_ob1_recvfrag.c:267
> >> #10 0x2f2748d3 in mca_btl_vader_poll_handle_frag ()
> >> at ./btl_vader_component.c:589
> >> #11 0x2f274b9a in mca_btl_vader_component_progress ()
> >> at ./btl_vader_component.c:231
> >> #12 0x2ad882b916fc in opal_progress () at
runtime/opal_progress.c:224
> >> #13 0x2ad8820a9aa5 in ompi_request_default_wait_all () at

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-08-08 Thread Nathan Hjelm
Hmm, not good. So we have a situation where it is sometimes better to include 
the HCA when it is the only rdma btl. Will have a new version up in a bit that 
adds an MCA parameter to control the behavior. The default will be the same as 
1.10.x.

-Nathan
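
Once a build with that control is installed, the new parameter should be visible to ompi_info and settable on the mpirun command line. A hypothetical check, assuming the control ends up registered under pml/ob1 as "use_all_rdma" (as in the later patch), so that its full name resolves to pml_ob1_use_all_rdma:

    ompi_info --param pml ob1 --level 9 | grep use_all_rdma
    mpirun --mca pml_ob1_use_all_rdma true -np 2 osu_bw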

> On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
> 
> Hi, unfortunately it doesn't work well. The previous one was much
> better ...
> 
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings
> osu_bw
> [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
> [B/B/B/B/B/B][./././././.]
> [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], so
> cket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]:
> [B/B/B/B/B/B][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size      Bandwidth (MB/s)
> 1              2.22
> 2              4.53
> 4              9.11
> 8             18.02
> 16            35.44
> 32            70.84
> 64           113.71
> 128          176.74
> 256          311.07
> 512          529.03
> 1024         907.83
> 2048        1597.66
> 4096         330.14
> 8192         516.49
> 16384        780.31
> 32768       1038.43
> 65536       1186.36
> 131072      1268.87
> 262144      1222.24
> 524288      1232.30
> 1048576     1244.62
> 2097152     1260.25
> 4194304     1263.47
> 
> Tetsuya
> 
> 
> On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>> Ok, there was a problem with the selection logic when only one rdma
> capable btl is available. I changed the logic to always use the RDMA btl
> over pipelined send/recv. This works better for me on a
>> Intel Omnipath system. Let me know if this works for you.
>> 
>> 
> https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch
> 
>> 
>> -Nathan
>> 
>> On Aug 07, 2016, at 10:00 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>> Hi, here is the gdb output for additional information:
>> 
>> (It might be inexact, because I built openmpi-2.0.0 without debug option)
>> 
>> Core was generated by `osu_bw'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
>> (gdb) where
>> #0 0x0031d9008806 in ?? () from /lib64/libgcc_s.so.1
>> #1 0x0031d9008934 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
>> #2 0x0037ab8e5ee8 in backtrace () from /lib64/libc.so.6
>> #3 0x2ad882bd4345 in opal_backtrace_print ()
>> at ./backtrace_execinfo.c:47
>> #4 0x2ad882bd1180 in show_stackframe () at ./stacktrace.c:331
>> #5 
>> #6 mca_pml_ob1_recv_request_schedule_once () at ./pml_ob1_recvreq.c:983
>> #7 0x2aaab412f47a in mca_pml_ob1_recv_request_progress_rndv ()
>> 
>> 
> from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
>> #8 0x2aaab412c645 in mca_pml_ob1_recv_frag_match ()
>> at ./pml_ob1_recvfrag.c:715
>> #9 0x2aaab412bba6 in mca_pml_ob1_recv_frag_callback_rndv ()
>> at ./pml_ob1_recvfrag.c:267
>> #10 0x2f2748d3 in mca_btl_vader_poll_handle_frag ()
>> at ./btl_vader_component.c:589
>> #11 0x2f274b9a in mca_btl_vader_component_progress ()
>> at ./btl_vader_component.c:231
>> #12 0x00002ad882b916fc in opal_progress () at runtime/opal_progress.c:224
>> #13 0x2ad8820a9aa5 in ompi_request_default_wait_all () at
>> request/req_wait.c:77
>> #14 0x2ad8820f10dd in PMPI_Waitall () at ./pwaitall.c:76
>> #15 0x00401108 in main () at ./osu_bw.c:144
>> 
>> Tetsuya
>> 
>> 
>> On 2016/08/08 12:34:57, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>> Hi, it caused segfault as below:
>> [manage.cluster:25436] MCW rank 0 bound to socket 0[core 0[hwt 0]],socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]],
> socket 0[core 4[hwt 0]], socket 0[core 5[hwt
> 0]]:[B/B/B/B/B/B][./././././.][manage.cluster:25436] MCW rank 1 bound to
> socket 0[core
>> 0[hwt 0]],socket
>> 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]],
> socket 0[core 4[hwt 0]], socket 0[core 5[hwt
>

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread tmishima
pen: component tcp open
function successful
[manage.cluster:20012] mca: base: components_open: found loaded component
sm
[manage.cluster:20012] mca: base: components_open: component sm open
function successful
[manage.cluster:20012] mca: base: components_open: found loaded component
openib
[manage.cluster:20012] mca: base: components_open: component openib open
function successful
[manage.cluster:20012] select: initializing btl component self
[manage.cluster:20012] select: init of component self returned success
[manage.cluster:20012] select: initializing btl component vader
[manage.cluster:20012] select: init of component vader returned success
[manage.cluster:20012] select: initializing btl component tcp
[manage.cluster:20012] select: init of component tcp returned success
[manage.cluster:20012] select: initializing btl component sm
[manage.cluster:20012] select: init of component sm returned success
[manage.cluster:20012] select: initializing btl component openib
[manage.cluster:20012] Checking distance from this process to device=mthca0
[manage.cluster:20012] hwloc_distances->nbobjs=2
[manage.cluster:20012] hwloc_distances->latency[0]=1.00
[manage.cluster:20012] hwloc_distances->latency[1]=1.60
[manage.cluster:20012] hwloc_distances->latency[2]=1.60
[manage.cluster:20012] hwloc_distances->latency[3]=1.00
[manage.cluster:20012] ibv_obj->type set to NULL
[manage.cluster:20012] Process is bound: distance to device is 0.00
[manage.cluster:20012] openib BTL: rdmacm CPC unavailable for use on
mthca0:1; skipped
[manage.cluster:20011] openib BTL: rdmacm CPC unavailable for use on
mthca0:1; skipped
[manage.cluster:20012] [rank=1] openib: using port mthca0:1
[manage.cluster:20012] select: init of component openib returned success
[manage.cluster:20011] [rank=0] openib: using port mthca0:1
[manage.cluster:20011] select: init of component openib returned success
[manage.cluster:20012] mca: bml: Using self btl for send to [[16477,1],1]
on node manage
[manage.cluster:20011] mca: bml: Using self btl for send to [[16477,1],0]
on node manage
[manage.cluster:20012] mca: bml: Using vader btl for send to [[16477,1],0]
on node manage
[manage.cluster:20011] mca: bml: Using vader btl for send to [[16477,1],1]
on node manage
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.42
2              3.04
4              6.06
8             12.11
16            24.32
32            47.78
64            85.57
128          139.08
256          240.59
512          415.78
1024         848.47
2048        1234.08
4096         265.53
8192         471.28
16384        740.52
32768       1029.48
65536       1191.29
131072      1271.51
262144      1238.58
524288      1246.67
1048576     1263.01
2097152     1275.67
4194304     1281.87
[manage.cluster:20011] mca: base: close: component self closed
[manage.cluster:20011] mca: base: close: unloading component self
[manage.cluster:20012] mca: base: close: component self closed
[manage.cluster:20012] mca: base: close: unloading component self
[manage.cluster:20012] mca: base: close: component vader closed
[manage.cluster:20012] mca: base: close: unloading component vader
[manage.cluster:20011] mca: base: close: component vader closed
[manage.cluster:20011] mca: base: close: unloading component vader
[manage.cluster:20012] mca: base: close: component tcp closed
[manage.cluster:20012] mca: base: close: unloading component tcp
[manage.cluster:20011] mca: base: close: component tcp closed
[manage.cluster:20011] mca: base: close: unloading component tcp
[manage.cluster:20011] mca: base: close: component sm closed
[manage.cluster:20011] mca: base: close: unloading component sm
[manage.cluster:20012] mca: base: close: component sm closed
[manage.cluster:20012] mca: base: close: unloading component sm
[manage.cluster:20011] mca: base: close: component openib closed
[manage.cluster:20011] mca: base: close: unloading component openib
[manage.cluster:20012] mca: base: close: component openib closed
[manage.cluster:20012] mca: base: close: unloading component openib


On 2016/07/27 9:23:34, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Also, btl/vader has a higher exclusivity than btl/sm, so if you do not
> manually specify any btl, vader should be used.
>
>
> you can run with
>
> --mca btl_base_verbose 10
>
> to confirm which btl is used
>
>
> Cheers,
>
>
> Gilles
>
>
> On 7/27/2016 9:20 AM, Nathan Hjelm wrote:
> > sm is deprecated in 2.0.0 and will likely be removed in favor of vader
in 2.1.0.
> >
> > This issue is probably this known issue:
https://github.com/open-mpi/ompi-release/pull/1250
> >

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread tmishima
Hi,

Thanks. I will try it and report later.

Tetsuya Mishima


On 2016/07/27 9:20:28, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> sm is deprecated in 2.0.0 and will likely be removed in favor of vader in
2.1.0.
>
> This issue is probably this known issue:
https://github.com/open-mpi/ompi-release/pull/1250
>
> Please apply those commits and see if it fixes the issue for you.
>
> -Nathan
>
> > On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > Hi Gilles,
> >
> > Thanks. I ran again with --mca pml ob1 but I've got the same results as
> > below:
> >
> > [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1
-bind-to
> > core -report-bindings osu_bw
> > [manage.cluster:18142] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> > [B/././././.][./././././.]
> > [manage.cluster:18142] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> > [./B/./././.][./././././.]
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              1.48
> > 2              3.07
> > 4              6.26
> > 8             12.53
> > 16            24.33
> > 32            49.03
> > 64            83.46
> > 128          132.60
> > 256          234.96
> > 512          420.86
> > 1024         842.37
> > 2048        1231.65
> > 4096         264.67
> > 8192         472.16
> > 16384        740.42
> > 32768       1030.39
> > 65536       1191.16
> > 131072      1269.45
> > 262144      1238.33
> > 524288      1247.97
> > 1048576     1257.96
> > 2097152     1274.74
> > 4194304     1280.94
> > [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca
btl
> > self,sm -bind-to core -report-bindings osu_b
> > w
> > [manage.cluster:18204] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> > [B/././././.][./././././.]
> > [manage.cluster:18204] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> > [./B/./././.][./././././.]
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              0.52
> > 2              1.05
> > 4              2.08
> > 8              4.18
> > 16             8.21
> > 32            16.65
> > 64            32.60
> > 128           66.70
> > 256          132.45
> > 512          269.27
> > 1024         504.63
> > 2048         819.76
> > 4096         874.54
> > 8192        1447.11
> > 16384       2263.28
> > 32768       3236.85
> > 65536       3567.34
> > 131072      3555.17
> > 262144      3455.76
> > 524288      3441.80
> > 1048576     3505.30
> > 2097152     3534.01
> > 4194304     3546.94
> > [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca
btl
> > self,sm,openib -bind-to core -report-binding
> > s osu_bw
> > [manage.cluster:18218] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> > [B/././././.][./././././.]
> > [manage.cluster:18218] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> > [./B/./././.][./././././.]
> > # OSU MPI Bandwidth Test v3.1.1
> > # Size      Bandwidth (MB/s)
> > 1              0.51
> > 2              1.03
> > 4              2.05
> > 8              4.07
> > 16             8.14
> > 32            16.32
> > 64            32.98
> > 128           63.70
> > 256          126.66
> > 512          252.61
> > 1024         480.22
> > 2048         810.54
> > 4096         290.61
> > 8192         512.49
> > 16384        764.60
> > 32768       1036.81
> > 65536       1182.81
> > 131072      1264.48
> > 262144      1235.82
> > 524288      1246.70
> > 1048576     1254.66
> > 2097152     1274.64
> > 4194304     1280.65
> > [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca
btl
> &

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread Gilles Gouaillardet
1281.05


On 2016/07/27 9:02:49, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":

Hi,


can you please run again with

--mca pml ob1


if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
instead of pml/ob1 and btl/openib


Cheers,


Gilles


On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:

Hi folks,

I saw a performance degradation of openmpi-2.0.0 when I ran our application
on a node (12 cores). So I did 4 tests using osu_bw as below:

1: mpirun -np 2 osu_bw                          bad (30% of test 2)
2: mpirun -np 2 -mca btl self,sm osu_bw         good (same as openmpi-1.10.3)
3: mpirun -np 2 -mca btl self,sm,openib osu_bw  bad (30% of test 2)
4: mpirun -np 2 -mca btl self,openib osu_bw     bad (30% of test 2)

I guess the openib BTL was used in tests 1 and 3, because these results are
almost the same as in test 4. I believe that the sm BTL should be used even in
tests 1 and 3, because its priority is higher than openib's. Unfortunately, at
the moment I couldn't figure out the root cause, so could someone please
take care of it?

Regards,
Tetsuya Mishima

P.S. Here I attached these test results.

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -bind-to core -report-bindings osu_bw
[manage.cluster:13389] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[manage.cluster:13389] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.49
2              3.04
4              6.13
8             12.23
16            25.01
32            49.96
64            87.07
128          138.87
256          245.97
512          423.30
1024         865.85
2048        1279.63
4096         264.79
8192         473.92
16384        739.27
32768       1030.49
65536       1190.21
131072      1270.77
262144      1238.74
524288      1245.97
1048576     1260.09
2097152     1274.53
4194304     1285.07
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm -bind-to core -report-bindings osu_bw
[manage.cluster:13448] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[manage.cluster:13448] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.51
2              1.01
4              2.03
8              4.08
16             7.92
32            16.16
64            32.53
128           64.30
256          128.19
512          256.48
1024         468.62
2048         785.29
4096         854.78
8192        1404.51
16384       2249.20
32768       3136.40
65536       3495.84
131072      3436.69
262144      3392.11
524288      3400.07
1048576     3460.60
2097152     3488.09
4194304     3498.45
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm,openib -bind-to core -report-bindings osu_bw
[manage.cluster:13462] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[manage.cluster:13462] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.09
4              2.18
8              4.37
16             8.75
32            17.37
64            34.67
128           66.66
256          132.55
512          261.52
1024         489.51
2048         818.38
4096         290.48
8192         511.64
16384        765.24
32768       1043.28
65536       1180.48
131072      1261.41
262144      1232.86
524288      1245.70
1048576     1245.69
2097152     1268.67
4194304     1281.33
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib -bind-to core -report-bindings osu_bw
[manage.cluster:13521] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[manage.cluster:13521] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1 0.54
2 1.08
4 2.16
8  

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread Nathan Hjelm
.86
> 2048         553.44
> 4096         707.14
> 8192         879.60
> 16384        763.02
> 32768       1042.89
> 65536       1185.45
> 131072      1267.56
> 262144      1227.41
> 524288      1244.61
> 1048576     1255.66
> 2097152     1273.55
> 4194304     1281.05
> 
> 
> On 2016/07/27 9:02:49, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>> Hi,
>> 
>> 
>> can you please run again with
>> 
>> --mca pml ob1
>> 
>> 
>> if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
>> instead of pml/ob1 and btl/openib
>> 
>> 
>> Cheers,
>> 
>> 
>> Gilles
>> 
>> 
>> On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
>>> Hi folks,
>>> 
>>> I saw a performance degradation of openmpi-2.0.0 when I ran our
> application
>>> on a node (12cores). So I did 4 tests using osu_bw as below:
>>> 
>>> 1: mpirun –np 2 osu_bw  bad(30% of test2)
>>> 2: mpirun –np 2 –mca btl self,sm osu_bw good(same as
> openmpi1.10.3)
>>> 3: mpirun –np 2 –mca btl self,sm,openib osu_bw  bad(30% of test2)
>>> 4: mpirun –np 2 –mca btl self,openib osu_bw bad(30% of test2)
>>> 
>>> I  guess openib BTL was used in the test 1 and 3, because these results
> are
>>> almost  same  as  test  4. I believe that sm BTL should be used even in
> the
>>> test 1 and 3, because its priority is higher than openib.
> Unfortunately, at
>>> the  moment,  I couldn’t figure out the root cause. So please someone
> would
>>> take care of it.
>>> 
>>> Regards,
>>> Tetsuya Mishima
>>> 
>>> P.S. Here I attached these test results.
>>> 
>>> [mishima@manage   OMB-3.1.1-openmpi2.0.0]$   mpirun  -np  2  -bind-to
> core
>>> -report-bindings osu_bw
>>> [manage.cluster:13389]  MCW  rank  0  bound  to  socket  0[core  0[hwt
> 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13389]  MCW  rank  1  bound  to  socket  0[core  1[hwt
> 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size      Bandwidth (MB/s)
>>> 1              1.49
>>> 2              3.04
>>> 4              6.13
>>> 8             12.23
>>> 16            25.01
>>> 32            49.96
>>> 64            87.07
>>> 128          138.87
>>> 256          245.97
>>> 512          423.30
>>> 1024         865.85
>>> 2048        1279.63
>>> 4096         264.79
>>> 8192         473.92
>>> 16384        739.27
>>> 32768       1030.49
>>> 65536       1190.21
>>> 131072      1270.77
>>> 262144      1238.74
>>> 524288      1245.97
>>> 1048576     1260.09
>>> 2097152     1274.53
>>> 4194304     1285.07
>>> [mishima@manage  OMB-3.1.1-openmpi2.0.0]$  mpirun  -np  2  -mca btl
> self,sm
>>> -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13448]  MCW  rank  0  bound  to  socket  0[core  0[hwt
> 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13448]  MCW  rank  1  bound  to  socket  0[core  1[hwt
> 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size      Bandwidth (MB/s)
>>> 1              0.51
>>> 2              1.01
>>> 4              2.03
>>> 8              4.08
>>> 16             7.92
>>> 32            16.16
>>> 64            32.53
>>> 128           64.30
>>> 256          128.19
>>> 512          256.48
>>> 1024         468.62
>>> 2048         785.29
>>> 4096         854.78
>>> 8192        1404.51
>>> 16384       2249.20
>>> 32768       3136.40
>>> 65536       3495.84
>>> 131072      3436.69
>>> 262144      3392.11
>>> 524288      3400.07
>>> 1048576  

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread tmishima
Hi Gilles,

Thanks. I ran again with --mca pml ob1 but I've got the same results as
below:

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -bind-to
core -report-bindings osu_bw
[manage.cluster:18142] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:18142] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.48
2              3.07
4              6.26
8             12.53
16            24.33
32            49.03
64            83.46
128          132.60
256          234.96
512          420.86
1024         842.37
2048        1231.65
4096         264.67
8192         472.16
16384        740.42
32768       1030.39
65536       1191.16
131072      1269.45
262144      1238.33
524288      1247.97
1048576     1257.96
2097152     1274.74
4194304     1280.94
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
self,sm -bind-to core -report-bindings osu_b
w
[manage.cluster:18204] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:18204] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.52
2              1.05
4              2.08
8              4.18
16             8.21
32            16.65
64            32.60
128           66.70
256          132.45
512          269.27
1024         504.63
2048         819.76
4096         874.54
8192        1447.11
16384       2263.28
32768       3236.85
65536       3567.34
131072      3555.17
262144      3455.76
524288      3441.80
1048576     3505.30
2097152     3534.01
4194304     3546.94
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
self,sm,openib -bind-to core -report-binding
s osu_bw
[manage.cluster:18218] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:18218] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.51
2              1.03
4              2.05
8              4.07
16             8.14
32            16.32
64            32.98
128           63.70
256          126.66
512          252.61
1024         480.22
2048         810.54
4096         290.61
8192         512.49
16384        764.60
32768       1036.81
65536       1182.81
131072      1264.48
262144      1235.82
524288      1246.70
1048576     1254.66
2097152     1274.64
4194304     1280.65
[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
self,openib -bind-to core -report-bindings o
su_bw
[manage.cluster:18276] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:18276] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.08
4              2.18
8              4.33
16             8.69
32            17.39
64            34.34
128           66.28
256          130.36
512          241.81
1024         429.86
2048         553.44
4096         707.14
8192         879.60
16384        763.02
32768       1042.89
65536       1185.45
131072      1267.56
262144      1227.41
524288      1244.61
1048576     1255.66
2097152     1273.55
4194304     1281.05


On 2016/07/27 9:02:49, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Hi,
>
>
> can you please run again with
>
> --mca pml ob1
>
>
> if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
> instead of pml/ob1 and btl/openib
>
>
> Cheers,
>
>
> Gilles
>
>
> On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
> > Hi folks,
> >
> > I saw a performance degradation of openmpi-2.0.0 when I ran our
application
> > 

Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread Gilles Gouaillardet

Hi,


can you please run again with

--mca pml ob1


if Open MPI was built with mxm support, pml/cm and mtl/mxm are used 
instead of pml/ob1 and btl/openib
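
For example, the runs being compared here would then look like this (the btl list is just the one used elsewhere in this thread):

    mpirun -np 2 --mca pml ob1 osu_bw
    mpirun -np 2 --mca pml ob1 --mca btl self,sm osu_bw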



Cheers,


Gilles


On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:

Hi folks,

I saw a performance degradation of openmpi-2.0.0 when I ran our application
on a node (12 cores). So I did 4 tests using osu_bw as below:

1: mpirun -np 2 osu_bw                          bad (30% of test 2)
2: mpirun -np 2 -mca btl self,sm osu_bw         good (same as openmpi-1.10.3)
3: mpirun -np 2 -mca btl self,sm,openib osu_bw  bad (30% of test 2)
4: mpirun -np 2 -mca btl self,openib osu_bw     bad (30% of test 2)

I guess the openib BTL was used in tests 1 and 3, because these results are
almost the same as in test 4. I believe that the sm BTL should be used even in
tests 1 and 3, because its priority is higher than openib's. Unfortunately, at
the moment I couldn't figure out the root cause, so could someone please
take care of it?

Regards,
Tetsuya Mishima

P.S. Here I attached these test results.

[mishima@manage   OMB-3.1.1-openmpi2.0.0]$   mpirun  -np  2  -bind-to  core
-report-bindings osu_bw
[manage.cluster:13389]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13389]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.49
2              3.04
4              6.13
8             12.23
16            25.01
32            49.96
64            87.07
128          138.87
256          245.97
512          423.30
1024         865.85
2048        1279.63
4096         264.79
8192         473.92
16384        739.27
32768       1030.49
65536       1190.21
131072      1270.77
262144      1238.74
524288      1245.97
1048576     1260.09
2097152     1274.53
4194304     1285.07
[mishima@manage  OMB-3.1.1-openmpi2.0.0]$  mpirun  -np  2  -mca btl self,sm
-bind-to core -report-bindings osu_bw
[manage.cluster:13448]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13448]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.51
2              1.01
4              2.03
8              4.08
16             7.92
32            16.16
64            32.53
128           64.30
256          128.19
512          256.48
1024         468.62
2048         785.29
4096         854.78
8192        1404.51
16384       2249.20
32768       3136.40
65536       3495.84
131072      3436.69
262144      3392.11
524288      3400.07
1048576     3460.60
2097152     3488.09
4194304     3498.45
[mishima@manageOMB-3.1.1-openmpi2.0.0]$   mpirun   -np   2   -mca   btl
self,sm,openib -bind-to core -report-bindings osu_bw
[manage.cluster:13462]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13462]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.09
4              2.18
8              4.37
16             8.75
32            17.37
64            34.67
128           66.66
256          132.55
512          261.52
1024         489.51
2048         818.38
4096         290.48
8192         511.64
16384        765.24
32768       1043.28
65536       1180.48
131072      1261.41
262144      1232.86
524288      1245.70
1048576     1245.69
2097152     1268.67
4194304     1281.33
[mishima@manage  OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib
-bind-to core -report-bindings osu_bw
[manage.cluster:13521]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13521]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.08
4              2.16
8              4.34
16             8.64
32            17.25
64            34.30
128           66.13
256 

[OMPI devel] sm BTL performance of the openmpi-2.0.0

2016-07-26 Thread tmishima

Hi folks,

I saw a performance degradation of openmpi-2.0.0 when I ran our application
on a node (12 cores). So I did 4 tests using osu_bw as below:

1: mpirun -np 2 osu_bw                          bad (30% of test 2)
2: mpirun -np 2 -mca btl self,sm osu_bw         good (same as openmpi-1.10.3)
3: mpirun -np 2 -mca btl self,sm,openib osu_bw  bad (30% of test 2)
4: mpirun -np 2 -mca btl self,openib osu_bw     bad (30% of test 2)

I guess the openib BTL was used in tests 1 and 3, because these results are
almost the same as in test 4. I believe that the sm BTL should be used even in
tests 1 and 3, because its priority is higher than openib's. Unfortunately, at
the moment I couldn't figure out the root cause, so could someone please
take care of it?
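
One way to confirm which BTL actually carries the traffic (as suggested elsewhere in this thread) is to raise the BTL verbosity, for example:

    mpirun -np 2 --mca btl_base_verbose 10 osu_bw 2>&1 | grep "Using .* btl for send"

which prints a "mca: bml: Using <btl> btl for send to ..." line for each peer.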

Regards,
Tetsuya Mishima

P.S. Here I attached these test results.

[mishima@manage   OMB-3.1.1-openmpi2.0.0]$   mpirun  -np  2  -bind-to  core
-report-bindings osu_bw
[manage.cluster:13389]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13389]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              1.49
2              3.04
4              6.13
8             12.23
16            25.01
32            49.96
64            87.07
128          138.87
256          245.97
512          423.30
1024         865.85
2048        1279.63
4096         264.79
8192         473.92
16384        739.27
32768       1030.49
65536       1190.21
131072      1270.77
262144      1238.74
524288      1245.97
1048576     1260.09
2097152     1274.53
4194304     1285.07
[mishima@manage  OMB-3.1.1-openmpi2.0.0]$  mpirun  -np  2  -mca btl self,sm
-bind-to core -report-bindings osu_bw
[manage.cluster:13448]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13448]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.51
2              1.01
4              2.03
8              4.08
16             7.92
32            16.16
64            32.53
128           64.30
256          128.19
512          256.48
1024         468.62
2048         785.29
4096         854.78
8192        1404.51
16384       2249.20
32768       3136.40
65536       3495.84
131072      3436.69
262144      3392.11
524288      3400.07
1048576     3460.60
2097152     3488.09
4194304     3498.45
[mishima@manageOMB-3.1.1-openmpi2.0.0]$   mpirun   -np   2   -mca   btl
self,sm,openib -bind-to core -report-bindings osu_bw
[manage.cluster:13462]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13462]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.09
4              2.18
8              4.37
16             8.75
32            17.37
64            34.67
128           66.66
256          132.55
512          261.52
1024         489.51
2048         818.38
4096         290.48
8192         511.64
16384        765.24
32768       1043.28
65536       1180.48
131072      1261.41
262144      1232.86
524288      1245.70
1048576     1245.69
2097152     1268.67
4194304     1281.33
[mishima@manage  OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib
-bind-to core -report-bindings osu_bw
[manage.cluster:13521]  MCW  rank  0  bound  to  socket  0[core  0[hwt 0]]:
[B/././././.][./././././.]
[manage.cluster:13521]  MCW  rank  1  bound  to  socket  0[core  1[hwt 0]]:
[./B/./././.][./././././.]
# OSU MPI Bandwidth Test v3.1.1
# Size      Bandwidth (MB/s)
1              0.54
2              1.08
4              2.16
8              4.34
16             8.64
32            17.25
64            34.30
128           66.13
256          129.99
512          242.26
1024         429.24
2048         556.00
4096         706.80
8192         874.35
16384        762.60
32768       1039.61
65536