The sm BTL is deprecated in 2.0.0 and will likely be removed in favor of vader in 2.1.0.
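
In the meantime, it may be worth rerunning the benchmark with vader selected explicitly; this is just a quick check on your existing setup, using the same flags you have been using:

  mpirun -np 2 -mca pml ob1 -mca btl self,vader -bind-to core -report-bindings osu_bw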

This issue is probably this known issue: 
https://github.com/open-mpi/ompi-release/pull/1250

Please apply those commits and see if it fixes the issue for you.
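
If you want to double-check which BTL actually ends up carrying the traffic, bumping the BTL framework's verbosity should show the selection (the output format varies between releases, so this is only a rough sketch):

  mpirun -np 2 -mca pml ob1 -mca btl_base_verbose 100 -bind-to core osu_bw 2>&1 | grep -i btl

To pick up the commits, one option is to apply GitHub's generated patch for that pull request on top of your 2.0.0 source tree and rebuild (adjust the -p level to your tree layout):

  wget https://github.com/open-mpi/ompi-release/pull/1250.patch
  patch -p1 < 1250.patch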

-Nathan

> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:
> 
> Hi Gilles,
> 
> Thanks. I ran again with --mca pml ob1 but I've got the same results as
> below:
> 
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -bind-to
> core -report-bindings osu_bw
> [manage.cluster:18142] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././.][./././././.]
> [manage.cluster:18142] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         1.48
> 2                         3.07
> 4                         6.26
> 8                        12.53
> 16                       24.33
> 32                       49.03
> 64                       83.46
> 128                     132.60
> 256                     234.96
> 512                     420.86
> 1024                    842.37
> 2048                   1231.65
> 4096                    264.67
> 8192                    472.16
> 16384                   740.42
> 32768                  1030.39
> 65536                  1191.16
> 131072                 1269.45
> 262144                 1238.33
> 524288                 1247.97
> 1048576                1257.96
> 2097152                1274.74
> 4194304                1280.94
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
> self,sm -bind-to core -report-bindings osu_bw
> [manage.cluster:18204] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././.][./././././.]
> [manage.cluster:18204] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         0.52
> 2                         1.05
> 4                         2.08
> 8                         4.18
> 16                        8.21
> 32                       16.65
> 64                       32.60
> 128                      66.70
> 256                     132.45
> 512                     269.27
> 1024                    504.63
> 2048                    819.76
> 4096                    874.54
> 8192                   1447.11
> 16384                  2263.28
> 32768                  3236.85
> 65536                  3567.34
> 131072                 3555.17
> 262144                 3455.76
> 524288                 3441.80
> 1048576                3505.30
> 2097152                3534.01
> 4194304                3546.94
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
> self,sm,openib -bind-to core -report-bindings osu_bw
> [manage.cluster:18218] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././.][./././././.]
> [manage.cluster:18218] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         0.51
> 2                         1.03
> 4                         2.05
> 8                         4.07
> 16                        8.14
> 32                       16.32
> 64                       32.98
> 128                      63.70
> 256                     126.66
> 512                     252.61
> 1024                    480.22
> 2048                    810.54
> 4096                    290.61
> 8192                    512.49
> 16384                   764.60
> 32768                  1036.81
> 65536                  1182.81
> 131072                 1264.48
> 262144                 1235.82
> 524288                 1246.70
> 1048576                1254.66
> 2097152                1274.64
> 4194304                1280.65
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl
> self,openib -bind-to core -report-bindings osu_bw
> [manage.cluster:18276] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
> [B/././././.][./././././.]
> [manage.cluster:18276] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
> [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         0.54
> 2                         1.08
> 4                         2.18
> 8                         4.33
> 16                        8.69
> 32                       17.39
> 64                       34.34
> 128                      66.28
> 256                     130.36
> 512                     241.81
> 1024                    429.86
> 2048                    553.44
> 4096                    707.14
> 8192                    879.60
> 16384                   763.02
> 32768                  1042.89
> 65536                  1185.45
> 131072                 1267.56
> 262144                 1227.41
> 524288                 1244.61
> 1048576                1255.66
> 2097152                1273.55
> 4194304                1281.05
> 
> 
> On 2016/07/27 9:02:49, "devel" wrote in
> "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
>> Hi,
>> 
>> 
>> can you please run again with
>> 
>> --mca pml ob1
>> 
>> 
>> if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
>> instead of pml/ob1 and btl/openib
>> 
>> 
>> Cheers,
>> 
>> 
>> Gilles
>> 
>> 
>> On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
>>> Hi folks,
>>> 
>>> I saw a performance degradation of openmpi-2.0.0 when I ran our application
>>> on a node (12 cores). So I did 4 tests using osu_bw as below:
>>> 
>>> 1: mpirun -np 2 osu_bw                              bad (30% of test 2)
>>> 2: mpirun -np 2 -mca btl self,sm osu_bw             good (same as openmpi-1.10.3)
>>> 3: mpirun -np 2 -mca btl self,sm,openib osu_bw      bad (30% of test 2)
>>> 4: mpirun -np 2 -mca btl self,openib osu_bw         bad (30% of test 2)
>>> 
>>> I guess the openib BTL was used in tests 1 and 3, because these results are
>>> almost the same as test 4. I believe the sm BTL should be used even in
>>> tests 1 and 3, because its priority is higher than openib's. Unfortunately,
>>> at the moment, I couldn't figure out the root cause, so I'd appreciate it
>>> if someone could take care of it.
>>> 
>>> Regards,
>>> Tetsuya Mishima
>>> 
>>> P.S. Here I attached these test results.
>>> 
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -bind-to core
>>> -report-bindings osu_bw
>>> [manage.cluster:13389] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13389] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size        Bandwidth (MB/s)
>>> 1                         1.49
>>> 2                         3.04
>>> 4                         6.13
>>> 8                        12.23
>>> 16                       25.01
>>> 32                       49.96
>>> 64                       87.07
>>> 128                     138.87
>>> 256                     245.97
>>> 512                     423.30
>>> 1024                    865.85
>>> 2048                   1279.63
>>> 4096                    264.79
>>> 8192                    473.92
>>> 16384                   739.27
>>> 32768                  1030.49
>>> 65536                  1190.21
>>> 131072                 1270.77
>>> 262144                 1238.74
>>> 524288                 1245.97
>>> 1048576                1260.09
>>> 2097152                1274.53
>>> 4194304                1285.07
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm
>>> -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13448] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13448] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size        Bandwidth (MB/s)
>>> 1                         0.51
>>> 2                         1.01
>>> 4                         2.03
>>> 8                         4.08
>>> 16                        7.92
>>> 32                       16.16
>>> 64                       32.53
>>> 128                      64.30
>>> 256                     128.19
>>> 512                     256.48
>>> 1024                    468.62
>>> 2048                    785.29
>>> 4096                    854.78
>>> 8192                   1404.51
>>> 16384                  2249.20
>>> 32768                  3136.40
>>> 65536                  3495.84
>>> 131072                 3436.69
>>> 262144                 3392.11
>>> 524288                 3400.07
>>> 1048576                3460.60
>>> 2097152                3488.09
>>> 4194304                3498.45
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl
>>> self,sm,openib -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13462] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13462] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size        Bandwidth (MB/s)
>>> 1                         0.54
>>> 2                         1.09
>>> 4                         2.18
>>> 8                         4.37
>>> 16                        8.75
>>> 32                       17.37
>>> 64                       34.67
>>> 128                      66.66
>>> 256                     132.55
>>> 512                     261.52
>>> 1024                    489.51
>>> 2048                    818.38
>>> 4096                    290.48
>>> 8192                    511.64
>>> 16384                   765.24
>>> 32768                  1043.28
>>> 65536                  1180.48
>>> 131072                 1261.41
>>> 262144                 1232.86
>>> 524288                 1245.70
>>> 1048576                1245.69
>>> 2097152                1268.67
>>> 4194304                1281.33
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib
>>> -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13521] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>>> [B/././././.][./././././.]
>>> [manage.cluster:13521] MCW rank 1 bound to socket 0[core 1[hwt 0]]:
>>> [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size        Bandwidth (MB/s)
>>> 1                         0.54
>>> 2                         1.08
>>> 4                         2.16
>>> 8                         4.34
>>> 16                        8.64
>>> 32                       17.25
>>> 64                       34.30
>>> 128                      66.13
>>> 256                     129.99
>>> 512                     242.26
>>> 1024                    429.24
>>> 2048                    556.00
>>> 4096                    706.80
>>> 8192                    874.35
>>> 16384                   762.60
>>> 32768                  1039.61
>>> 65536                  1184.03
>>> 131072                 1267.09
>>> 262144                 1230.76
>>> 524288                 1246.92
>>> 1048576                1255.88
>>> 2097152                1274.54
>>> 4194304                1281.63
