Hi Gilles,

I confirmed that vader is used when I don't specify any BTL, just as you pointed out!
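By the way, the exclusivity values themselves can also be checked directly with ompi_info. This is a minimal check, assuming the ompi_info from this 2.0.0 build is on the PATH and that exclusivity is exposed as an MCA parameter (my assumption here):

  # list the exclusivity of every BTL; vader should report a higher value than sm
  ompi_info --param btl all --level 9 | grep -i exclusivity

The verbose run below confirms the same thing at runtime.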
Regards,
Tetsuya Mishima

[mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 --mca btl_base_verbose 10 -bind-to core -report-bindings osu_bw
[manage.cluster:20006] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[manage.cluster:20006] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
[manage.cluster:20011] mca: base: components_register: registering framework btl components
[manage.cluster:20011] mca: base: components_register: found loaded component self
[manage.cluster:20011] mca: base: components_register: component self register function successful
[manage.cluster:20011] mca: base: components_register: found loaded component vader
[manage.cluster:20011] mca: base: components_register: component vader register function successful
[manage.cluster:20011] mca: base: components_register: found loaded component tcp
[manage.cluster:20011] mca: base: components_register: component tcp register function successful
[manage.cluster:20011] mca: base: components_register: found loaded component sm
[manage.cluster:20011] mca: base: components_register: component sm register function successful
[manage.cluster:20011] mca: base: components_register: found loaded component openib
[manage.cluster:20011] mca: base: components_register: component openib register function successful
[manage.cluster:20011] mca: base: components_open: opening btl components
[manage.cluster:20011] mca: base: components_open: found loaded component self
[manage.cluster:20011] mca: base: components_open: component self open function successful
[manage.cluster:20011] mca: base: components_open: found loaded component vader
[manage.cluster:20011] mca: base: components_open: component vader open function successful
[manage.cluster:20011] mca: base: components_open: found loaded component tcp
[manage.cluster:20011] mca: base: components_open: component tcp open function successful
[manage.cluster:20011] mca: base: components_open: found loaded component sm
[manage.cluster:20011] mca: base: components_open: component sm open function successful
[manage.cluster:20011] mca: base: components_open: found loaded component openib
[manage.cluster:20011] mca: base: components_open: component openib open function successful
[manage.cluster:20011] select: initializing btl component self
[manage.cluster:20011] select: init of component self returned success
[manage.cluster:20011] select: initializing btl component vader
[manage.cluster:20011] select: init of component vader returned success
[manage.cluster:20011] select: initializing btl component tcp
[manage.cluster:20011] select: init of component tcp returned success
[manage.cluster:20011] select: initializing btl component sm
[manage.cluster:20011] select: init of component sm returned success
[manage.cluster:20011] select: initializing btl component openib
[manage.cluster:20011] Checking distance from this process to device=mthca0
[manage.cluster:20011] hwloc_distances->nbobjs=2
[manage.cluster:20011] hwloc_distances->latency[0]=1.000000
[manage.cluster:20011] hwloc_distances->latency[1]=1.600000
[manage.cluster:20011] hwloc_distances->latency[2]=1.600000
[manage.cluster:20011] hwloc_distances->latency[3]=1.000000
[manage.cluster:20011] ibv_obj->type set to NULL
[manage.cluster:20011] Process is bound: distance to device is 0.000000
[manage.cluster:20012] mca: base: components_register: registering framework btl components
[manage.cluster:20012] mca: base: components_register: found loaded component self
[manage.cluster:20012] mca: base: components_register: component self register function successful
[manage.cluster:20012] mca: base: components_register: found loaded component vader
[manage.cluster:20012] mca: base: components_register: component vader register function successful
[manage.cluster:20012] mca: base: components_register: found loaded component tcp
[manage.cluster:20012] mca: base: components_register: component tcp register function successful
[manage.cluster:20012] mca: base: components_register: found loaded component sm
[manage.cluster:20012] mca: base: components_register: component sm register function successful
[manage.cluster:20012] mca: base: components_register: found loaded component openib
[manage.cluster:20012] mca: base: components_register: component openib register function successful
[manage.cluster:20012] mca: base: components_open: opening btl components
[manage.cluster:20012] mca: base: components_open: found loaded component self
[manage.cluster:20012] mca: base: components_open: component self open function successful
[manage.cluster:20012] mca: base: components_open: found loaded component vader
[manage.cluster:20012] mca: base: components_open: component vader open function successful
[manage.cluster:20012] mca: base: components_open: found loaded component tcp
[manage.cluster:20012] mca: base: components_open: component tcp open function successful
[manage.cluster:20012] mca: base: components_open: found loaded component sm
[manage.cluster:20012] mca: base: components_open: component sm open function successful
[manage.cluster:20012] mca: base: components_open: found loaded component openib
[manage.cluster:20012] mca: base: components_open: component openib open function successful
[manage.cluster:20012] select: initializing btl component self
[manage.cluster:20012] select: init of component self returned success
[manage.cluster:20012] select: initializing btl component vader
[manage.cluster:20012] select: init of component vader returned success
[manage.cluster:20012] select: initializing btl component tcp
[manage.cluster:20012] select: init of component tcp returned success
[manage.cluster:20012] select: initializing btl component sm
[manage.cluster:20012] select: init of component sm returned success
[manage.cluster:20012] select: initializing btl component openib
[manage.cluster:20012] Checking distance from this process to device=mthca0
[manage.cluster:20012] hwloc_distances->nbobjs=2
[manage.cluster:20012] hwloc_distances->latency[0]=1.000000
[manage.cluster:20012] hwloc_distances->latency[1]=1.600000
[manage.cluster:20012] hwloc_distances->latency[2]=1.600000
[manage.cluster:20012] hwloc_distances->latency[3]=1.000000
[manage.cluster:20012] ibv_obj->type set to NULL
[manage.cluster:20012] Process is bound: distance to device is 0.000000
[manage.cluster:20012] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
[manage.cluster:20011] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
[manage.cluster:20012] [rank=1] openib: using port mthca0:1
[manage.cluster:20012] select: init of component openib returned success
[manage.cluster:20011] [rank=0] openib: using port mthca0:1
[manage.cluster:20011] select: init of component openib returned success
[manage.cluster:20012] mca: bml: Using self btl for send to [[16477,1],1] on node manage
[manage.cluster:20011] mca: bml: Using self btl for send to [[16477,1],0] on node manage
[manage.cluster:20012] mca: bml: Using vader btl for send to [[16477,1],0] on node manage
[manage.cluster:20011] mca: bml: Using vader btl for send to [[16477,1],1] on node manage
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
1             1.42
2             3.04
4             6.06
8             12.11
16            24.32
32            47.78
64            85.57
128           139.08
256           240.59
512           415.78
1024          848.47
2048          1234.08
4096          265.53
8192          471.28
16384         740.52
32768         1029.48
65536         1191.29
131072        1271.51
262144        1238.58
524288        1246.67
1048576       1263.01
2097152       1275.67
4194304       1281.87
[manage.cluster:20011] mca: base: close: component self closed
[manage.cluster:20011] mca: base: close: unloading component self
[manage.cluster:20012] mca: base: close: component self closed
[manage.cluster:20012] mca: base: close: unloading component self
[manage.cluster:20012] mca: base: close: component vader closed
[manage.cluster:20012] mca: base: close: unloading component vader
[manage.cluster:20011] mca: base: close: component vader closed
[manage.cluster:20011] mca: base: close: unloading component vader
[manage.cluster:20012] mca: base: close: component tcp closed
[manage.cluster:20012] mca: base: close: unloading component tcp
[manage.cluster:20011] mca: base: close: component tcp closed
[manage.cluster:20011] mca: base: close: unloading component tcp
[manage.cluster:20011] mca: base: close: component sm closed
[manage.cluster:20011] mca: base: close: unloading component sm
[manage.cluster:20012] mca: base: close: component sm closed
[manage.cluster:20012] mca: base: close: unloading component sm
[manage.cluster:20011] mca: base: close: component openib closed
[manage.cluster:20011] mca: base: close: unloading component openib
[manage.cluster:20012] mca: base: close: component openib closed
[manage.cluster:20012] mca: base: close: unloading component openib

On 2016/07/27 9:23:34, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> Also, btl/vader has a higher exclusivity than btl/sm, so if you do not
> manually specify any btl, vader should be used.
>
> you can run with
>
> --mca btl_base_verbose 10
>
> to confirm which btl is used
>
> Cheers,
>
> Gilles
>
> On 7/27/2016 9:20 AM, Nathan Hjelm wrote:
> > sm is deprecated in 2.0.0 and will likely be removed in favor of vader in 2.1.0.
> >
> > This issue is probably this known issue: https://github.com/open-mpi/ompi-release/pull/1250
> >
> > Please apply those commits and see if it fixes the issue for you.
> >
> > -Nathan
> >
> >> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:
> >>
> >> Hi Gilles,
> >>
> >> Thanks. I ran again with --mca pml ob1 but I've got the same results as
> >> below:
> >>
> >> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -bind-to core -report-bindings osu_bw
> >> [manage.cluster:18142] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >> [manage.cluster:18142] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >> # OSU MPI Bandwidth Test v3.1.1
> >> # Size        Bandwidth (MB/s)
> >> 1             1.48
> >> 2             3.07
> >> 4             6.26
> >> 8             12.53
> >> 16            24.33
> >> 32            49.03
> >> 64            83.46
> >> 128           132.60
> >> 256           234.96
> >> 512           420.86
> >> 1024          842.37
> >> 2048          1231.65
> >> 4096          264.67
> >> 8192          472.16
> >> 16384         740.42
> >> 32768         1030.39
> >> 65536         1191.16
> >> 131072        1269.45
> >> 262144        1238.33
> >> 524288        1247.97
> >> 1048576       1257.96
> >> 2097152       1274.74
> >> 4194304       1280.94
> >> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm -bind-to core -report-bindings osu_bw
> >> [manage.cluster:18204] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >> [manage.cluster:18204] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >> # OSU MPI Bandwidth Test v3.1.1
> >> # Size        Bandwidth (MB/s)
> >> 1             0.52
> >> 2             1.05
> >> 4             2.08
> >> 8             4.18
> >> 16            8.21
> >> 32            16.65
> >> 64            32.60
> >> 128           66.70
> >> 256           132.45
> >> 512           269.27
> >> 1024          504.63
> >> 2048          819.76
> >> 4096          874.54
> >> 8192          1447.11
> >> 16384         2263.28
> >> 32768         3236.85
> >> 65536         3567.34
> >> 131072        3555.17
> >> 262144        3455.76
> >> 524288        3441.80
> >> 1048576       3505.30
> >> 2097152       3534.01
> >> 4194304       3546.94
> >> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,sm,openib -bind-to core -report-bindings osu_bw
> >> [manage.cluster:18218] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >> [manage.cluster:18218] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >> # OSU MPI Bandwidth Test v3.1.1
> >> # Size        Bandwidth (MB/s)
> >> 1             0.51
> >> 2             1.03
> >> 4             2.05
> >> 8             4.07
> >> 16            8.14
> >> 32            16.32
> >> 64            32.98
> >> 128           63.70
> >> 256           126.66
> >> 512           252.61
> >> 1024          480.22
> >> 2048          810.54
> >> 4096          290.61
> >> 8192          512.49
> >> 16384         764.60
> >> 32768         1036.81
> >> 65536         1182.81
> >> 131072        1264.48
> >> 262144        1235.82
> >> 524288        1246.70
> >> 1048576       1254.66
> >> 2097152       1274.64
> >> 4194304       1280.65
> >> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca pml ob1 -mca btl self,openib -bind-to core -report-bindings osu_bw
> >> [manage.cluster:18276] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >> [manage.cluster:18276] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >> # OSU MPI Bandwidth Test v3.1.1
> >> # Size        Bandwidth (MB/s)
> >> 1             0.54
> >> 2             1.08
> >> 4             2.18
> >> 8             4.33
> >> 16            8.69
> >> 32            17.39
> >> 64            34.34
> >> 128           66.28
> >> 256           130.36
> >> 512           241.81
> >> 1024          429.86
> >> 2048          553.44
> >> 4096          707.14
> >> 8192          879.60
> >> 16384         763.02
> >> 32768         1042.89
> >> 65536         1185.45
> >> 131072        1267.56
> >> 262144        1227.41
> >> 524288        1244.61
> >> 1048576       1255.66
> >> 2097152       1273.55
> >> 4194304       1281.05
> >>
> >> On 2016/07/27 9:02:49, "devel" wrote in "Re: [OMPI devel] sm BTL performance of the openmpi-2.0.0":
> >>> Hi,
> >>>
> >>> can you please run again with
> >>>
> >>> --mca pml ob1
> >>>
> >>> if Open MPI was built with mxm support, pml/cm and mtl/mxm are used
> >>> instead of pml/ob1 and btl/openib
> >>>
> >>> Cheers,
> >>>
> >>> Gilles
> >>>
> >>> On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote:
> >>>> Hi folks,
> >>>>
> >>>> I saw a performance degradation of openmpi-2.0.0 when I ran our application
> >>>> on a node (12 cores), so I did 4 tests using osu_bw as below:
> >>>>
> >>>> 1: mpirun -np 2 osu_bw                          bad (30% of test 2)
> >>>> 2: mpirun -np 2 -mca btl self,sm osu_bw         good (same as openmpi-1.10.3)
> >>>> 3: mpirun -np 2 -mca btl self,sm,openib osu_bw  bad (30% of test 2)
> >>>> 4: mpirun -np 2 -mca btl self,openib osu_bw     bad (30% of test 2)
> >>>>
> >>>> I guess the openib BTL was used in tests 1 and 3, because these results are
> >>>> almost the same as in test 4. I believe the sm BTL should be used even in
> >>>> tests 1 and 3, because its priority is higher than openib. Unfortunately, at
> >>>> the moment, I couldn't figure out the root cause, so please could someone
> >>>> take care of it.
> >>>>
> >>>> Regards,
> >>>> Tetsuya Mishima
> >>>>
> >>>> P.S. Here I attached these test results.
> >>>>
> >>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -bind-to core -report-bindings osu_bw
> >>>> [manage.cluster:13389] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >>>> [manage.cluster:13389] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >>>> # OSU MPI Bandwidth Test v3.1.1
> >>>> # Size        Bandwidth (MB/s)
> >>>> 1             1.49
> >>>> 2             3.04
> >>>> 4             6.13
> >>>> 8             12.23
> >>>> 16            25.01
> >>>> 32            49.96
> >>>> 64            87.07
> >>>> 128           138.87
> >>>> 256           245.97
> >>>> 512           423.30
> >>>> 1024          865.85
> >>>> 2048          1279.63
> >>>> 4096          264.79
> >>>> 8192          473.92
> >>>> 16384         739.27
> >>>> 32768         1030.49
> >>>> 65536         1190.21
> >>>> 131072        1270.77
> >>>> 262144        1238.74
> >>>> 524288        1245.97
> >>>> 1048576       1260.09
> >>>> 2097152       1274.53
> >>>> 4194304       1285.07
> >>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm -bind-to core -report-bindings osu_bw
> >>>> [manage.cluster:13448] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >>>> [manage.cluster:13448] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >>>> # OSU MPI Bandwidth Test v3.1.1
> >>>> # Size        Bandwidth (MB/s)
> >>>> 1             0.51
> >>>> 2             1.01
> >>>> 4             2.03
> >>>> 8             4.08
> >>>> 16            7.92
> >>>> 32            16.16
> >>>> 64            32.53
> >>>> 128           64.30
> >>>> 256           128.19
> >>>> 512           256.48
> >>>> 1024          468.62
> >>>> 2048          785.29
> >>>> 4096          854.78
> >>>> 8192          1404.51
> >>>> 16384         2249.20
> >>>> 32768         3136.40
> >>>> 65536         3495.84
> >>>> 131072        3436.69
> >>>> 262144        3392.11
> >>>> 524288        3400.07
> >>>> 1048576       3460.60
> >>>> 2097152       3488.09
> >>>> 4194304       3498.45
> >>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,sm,openib -bind-to core -report-bindings osu_bw
> >>>> [manage.cluster:13462] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >>>> [manage.cluster:13462] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >>>> # OSU MPI Bandwidth Test v3.1.1
> >>>> # Size        Bandwidth (MB/s)
> >>>> 1             0.54
> >>>> 2             1.09
> >>>> 4             2.18
> >>>> 8             4.37
> >>>> 16            8.75
> >>>> 32            17.37
> >>>> 64            34.67
> >>>> 128           66.66
> >>>> 256           132.55
> >>>> 512           261.52
> >>>> 1024          489.51
> >>>> 2048          818.38
> >>>> 4096          290.48
> >>>> 8192          511.64
> >>>> 16384         765.24
> >>>> 32768         1043.28
> >>>> 65536         1180.48
> >>>> 131072        1261.41
> >>>> 262144        1232.86
> >>>> 524288        1245.70
> >>>> 1048576       1245.69
> >>>> 2097152       1268.67
> >>>> 4194304       1281.33
> >>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl self,openib -bind-to core -report-bindings osu_bw
> >>>> [manage.cluster:13521] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >>>> [manage.cluster:13521] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >>>> # OSU MPI Bandwidth Test v3.1.1
> >>>> # Size        Bandwidth (MB/s)
> >>>> 1             0.54
> >>>> 2             1.08
> >>>> 4             2.16
> >>>> 8             4.34
> >>>> 16            8.64
> >>>> 32            17.25
> >>>> 64            34.30
> >>>> 128           66.13
> >>>> 256           129.99
> >>>> 512           242.26
> >>>> 1024          429.24
> >>>> 2048          556.00
> >>>> 4096          706.80
> >>>> 8192          874.35
> >>>> 16384         762.60
> >>>> 32768         1039.61
> >>>> 65536         1184.03
> >>>> 131072        1267.09
> >>>> 262144        1230.76
> >>>> 524288        1246.92
> >>>> 1048576       1255.88
> >>>> 2097152       1274.54
> >>>> 4194304       1281.63
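P.S. Once the commits from the pull request Nathan pointed to are applied, a simple way to verify the fix (a sketch only, reusing the same commands as above) should be to re-run the default case against the forced shared-memory case and compare the large-message numbers:

  mpirun -np 2 -bind-to core osu_bw
  mpirun -np 2 -mca btl self,sm -bind-to core osu_bw

If the fix works, the default run should no longer drop to roughly 30% of the self,sm figures at 4 KB and above.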