sm is deprecated in 2.0.0 and will likely be removed in favor of vader in 2.1.0.
This issue is probably this known issue:
https://github.com/open-mpi/ompi-release/pull/1250

Please apply those commits and see if it fixes the issue for you.

-Nathan

> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:
>
> Hi Gilles,
>
> Thanks. I ran again with --mca pml ob1 but I've got the same results as
> below:
>
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -bind-to core -report-bindings osu_bw
> [manage.cluster:18142] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> [manage.cluster:18142] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size    Bandwidth (MB/s)
> 1         1.48
> 2         3.07
> 4         6.26
> 8         12.53
> 16        24.33
> 32        49.03
> 64        83.46
> 128       132.60
> 256       234.96
> 512       420.86
> 1024      842.37
> 2048      1231.65
> 4096      264.67
> 8192      472.16
> 16384     740.42
> 32768     1030.39
> 65536     1191.16
> 131072    1269.45
> 262144    1238.33
> 524288    1247.97
> 1048576   1257.96
> 2097152   1274.74
> 4194304   1280.94
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm -bind-to core -report-bindings osu_bw
> [manage.cluster:18204] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> [manage.cluster:18204] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size    Bandwidth (MB/s)
> 1         0.52
> 2         1.05
> 4         2.08
> 8         4.18
> 16        8.21
> 32        16.65
> 64        32.60
> 128       66.70
> 256       132.45
> 512       269.27
> 1024      504.63
> 2048      819.76
> 4096      874.54
> 8192      1447.11
> 16384     2263.28
> 32768     3236.85
> 65536     3567.34
> 131072    3555.17
> 262144    3455.76
> 524288    3441.80
> 1048576   3505.30
> 2097152   3534.01
> 4194304   3546.94
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm,openib -bind-to core -report-bindings osu_bw
> [manage.cluster:18218] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> [manage.cluster:18218] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size    Bandwidth (MB/s)
> 1         0.51
> 2         1.03
> 4         2.05
> 8         4.07
> 16        8.14
> 32        16.32
> 64        32.98
> 128       63.70
> 256       126.66
> 512       252.61
> 1024      480.22
> 2048      810.54
> 4096      290.61
> 8192      512.49
> 16384     764.60
> 32768     1036.81
> 65536     1182.81
> 131072    1264.48
> 262144    1235.82
> 524288    1246.70
> 1048576   1254.66
> 2097152   1274.64
> 4194304   1280.65
> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,openib -bind-to core -report-bindings osu_bw
> [manage.cluster:18276] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> [manage.cluster:18276] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> # OSU MPI Bandwidth Test v3.1.1
> # Size    Bandwidth (MB/s)
> 1         0.54
> 2         1.08
> 4         2.18
> 8         4.33
> 16        8.69
> 32        17.39
> 64        34.34
> 128       66.28
> 256       130.36
> 512       241.81
> 1024      429.86
> 2048      553.44
> 4096      707.14
> 8192      879.60
> 16384     763.02
> 32768     1042.89
> 65536     1185.45
> 131072    1267.56
> 262144    1227.41
> 524288    1244.61
> 1048576   1255.66
> 2097152   1273.55
> 4194304   1281.05
>
> On 2016/07/27 9:02:49, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>> Hi,
>>
>> can you please run again with
>>
>> --mca pml ob1
>>
>> if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
>> instead of pml/ob1 and btl/openib
>>
>> Cheers,
>>
>> Gilles
>>
>> On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
>>> Hi folks,
>>>
>>> I saw a performance degradation of openmpi-2.0.0 when I ran our application
>>> on a node (12cores).
>>> So I did 4 tests using osu_bw as below:
>>>
>>> 1: mpirun -np 2 osu_bw                          bad (30% of test2)
>>> 2: mpirun -np 2 -mca btl self,sm osu_bw         good (same as openmpi1.10.3)
>>> 3: mpirun -np 2 -mca btl self,sm,openib osu_bw  bad (30% of test2)
>>> 4: mpirun -np 2 -mca btl self,openib osu_bw     bad (30% of test2)
>>>
>>> I guess openib BTL was used in the test 1 and 3, because these results are
>>> almost same as test 4. I believe that sm BTL should be used even in the
>>> test 1 and 3, because its priority is higher than openib. Unfortunately, at
>>> the moment, I couldn’t figure out the root cause. So please someone would
>>> take care of it.
>>>
>>> Regards,
>>> Tetsuya Mishima
>>>
>>> P.S. Here I attached these test results.
>>>
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13389] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
>>> [manage.cluster:13389] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size    Bandwidth (MB/s)
>>> 1         1.49
>>> 2         3.04
>>> 4         6.13
>>> 8         12.23
>>> 16        25.01
>>> 32        49.96
>>> 64        87.07
>>> 128       138.87
>>> 256       245.97
>>> 512       423.30
>>> 1024      865.85
>>> 2048      1279.63
>>> 4096      264.79
>>> 8192      473.92
>>> 16384     739.27
>>> 32768     1030.49
>>> 65536     1190.21
>>> 131072    1270.77
>>> 262144    1238.74
>>> 524288    1245.97
>>> 1048576   1260.09
>>> 2097152   1274.53
>>> 4194304   1285.07
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13448] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
>>> [manage.cluster:13448] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size    Bandwidth (MB/s)
>>> 1         0.51
>>> 2         1.01
>>> 4         2.03
>>> 8         4.08
>>> 16        7.92
>>> 32        16.16
>>> 64        32.53
>>> 128       64.30
>>> 256       128.19
>>> 512       256.48
>>> 1024      468.62
>>> 2048      785.29
>>> 4096      854.78
>>> 8192      1404.51
>>> 16384     2249.20
>>> 32768     3136.40
>>> 65536     3495.84
>>> 131072    3436.69
>>> 262144    3392.11
>>> 524288    3400.07
>>> 1048576   3460.60
>>> 2097152   3488.09
>>> 4194304   3498.45
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm,openib -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13462] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
>>> [manage.cluster:13462] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size    Bandwidth (MB/s)
>>> 1         0.54
>>> 2         1.09
>>> 4         2.18
>>> 8         4.37
>>> 16        8.75
>>> 32        17.37
>>> 64        34.67
>>> 128       66.66
>>> 256       132.55
>>> 512       261.52
>>> 1024      489.51
>>> 2048      818.38
>>> 4096      290.48
>>> 8192      511.64
>>> 16384     765.24
>>> 32768     1043.28
>>> 65536     1180.48
>>> 131072    1261.41
>>> 262144    1232.86
>>> 524288    1245.70
>>> 1048576   1245.69
>>> 2097152   1268.67
>>> 4194304   1281.33
>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib -bind-to core -report-bindings osu_bw
>>> [manage.cluster:13521] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
>>> [manage.cluster:13521] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
>>> # OSU MPI Bandwidth Test v3.1.1
>>> # Size    Bandwidth (MB/s)
>>> 1         0.54
>>> 2         1.08
>>> 4         2.16
>>> 8         4.34
>>> 16        8.64
>>> 32        17.25
>>> 64        34.30
>>> 128       66.13
>>> 256       129.99
>>> 512       242.26
>>> 1024      429.24
>>> 2048      556.00
>>> 4096      706.80
>>> 8192      874.35
>>> 16384     762.60
>>> 32768     1039.61
>>> 65536     1184.03
>>> 131072    1267.09
>>> 262144    1230.76
>>> 524288    1246.92
>>> 1048576   1255.88
>>> 2097152   1274.54
>>> 4194304   1281.63
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> Link to this post: http://www.open-mpi.org/community/lists/devel/2016/07/19288.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: http://www.open-mpi.org/community/lists/devel/2016/07/19289.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2016/07/19290.php
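
For anyone who wants to check the "30% of test2" figure against the posted numbers, here is a minimal Python sketch. It is not part of the original thread: the helper names parse_osu_bw and ratios are hypothetical, and the sample rows are a few lines copied from the first report's test 1 (default BTL selection) and test 2 (-mca btl self,sm) outputs.

```python
def parse_osu_bw(text):
    """Return {message_size_bytes: bandwidth_MBps} parsed from osu_bw output.

    Hypothetical helper: only lines made of exactly two numeric fields are
    treated as data rows; '#' headers and mpirun log lines are skipped.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            size, bandwidth = int(fields[0]), float(fields[1])
        except ValueError:
            continue
        table[size] = bandwidth
    return table


def ratios(run_a, run_b):
    """Bandwidth of run_a divided by run_b, for sizes present in both runs."""
    return {size: run_a[size] / run_b[size]
            for size in sorted(run_a.keys() & run_b.keys())}


# Rows copied from the thread: test 1 (no btl option given).
default_run = """
# OSU MPI Bandwidth Test v3.1.1
# Size    Bandwidth (MB/s)
2048      1279.63
4096      264.79
1048576   1260.09
4194304   1285.07
"""

# Rows copied from the thread: test 2 (-mca btl self,sm).
sm_run = """
# OSU MPI Bandwidth Test v3.1.1
# Size    Bandwidth (MB/s)
2048      785.29
4096      854.78
1048576   3460.60
4194304   3498.45
"""

r = ratios(parse_osu_bw(default_run), parse_osu_bw(sm_run))
for size, ratio in r.items():
    print(f"{size:>8}  default/sm = {ratio:.2f}")
```

For the rows included here, the default-to-sm ratio is above 1 at 2048 bytes and drops to roughly a third from 4096 bytes upward, which is consistent with the "30% of test2" description for large messages.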