**Benchmark data**
The benchmark data was collected on Linux and macOS and compares builds with 
and without MKLDNN. Because computation on a build without MKLDNN is very slow, 
only performance data for selected CNN models is listed. The benchmarking 
script is based on example/image-classification/benchmark_score.py.
**On CentOS 7.4**, MXNet is installed with pip: `pip install mxnet==1.3.0` 
vs. `pip install mxnet-mkl==1.3.0`. 
**_(Benchmarking was executed on a 1-socket Xeon SKX-8180 with 28 cores and 
192 GB of DDR4-2166 memory)_**  
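
For readers who want to see what kind of number the tables hold, here is a minimal sketch of a throughput measurement. It assumes the values are throughput in images per second, measured by timing a fixed number of batches after a short warm-up. The names `measure_throughput` and `dummy_forward` are hypothetical and are not taken from benchmark_score.py; the dummy workload merely stands in for a model's forward pass.

```python
import time


def measure_throughput(forward, batch_size, num_batches=10, warmup=2):
    """Return images/sec for a callable that processes one batch per call.

    `forward` is a hypothetical stand-in for a model's forward pass.
    """
    # Warm-up runs are excluded from timing so one-off setup costs
    # do not distort the result.
    for _ in range(warmup):
        forward(batch_size)

    start = time.perf_counter()
    for _ in range(num_batches):
        forward(batch_size)
    elapsed = time.perf_counter() - start

    # Total images processed divided by wall-clock seconds.
    return batch_size * num_batches / elapsed


def dummy_forward(batch_size):
    # Placeholder workload (NOT a real network): cost grows with batch size.
    return sum(i * i for i in range(1000 * batch_size))


if __name__ == "__main__":
    for bs in (1, 16, 32):
        speed = measure_throughput(dummy_forward, bs)
        print("batch size %2d, images/sec: %f" % (bs, speed))
```

Swapping `dummy_forward` for a real inference call would give numbers comparable in kind, though not in value, to the ones listed below.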

VGG16

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 63.972961 | 2.776588 | 2304.01%
16 | 90.132777 | 3.27203 | 2754.64%
32 | 90.533301 | 3.271969 | 2766.94%
64 | 90.547993 | 3.332716 | 2716.94%
128 | 90.130061 | 3.303833 | 2728.05%
256 | 89.474756 | 3.333387 | 2684.20%
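
The boost-up column in these tables appears to be the MKLDNN-enabled value divided by the value without MKLDNN, expressed as a percentage, so 2304.01% means roughly 23 times the baseline rather than a 2304% increase on top of it. The sketch below recomputes the column for the VGG16 rows above; the function name `boost_up_percent` is hypothetical.

```python
def boost_up_percent(with_mkldnn, without_mkldnn):
    """Ratio of the two measurements, expressed as a percentage."""
    return with_mkldnn / without_mkldnn * 100.0


# (batch size, MKLDNN-enabled, w.o. MKLDNN) copied from the VGG16 table above.
VGG16_CENTOS = [
    (1, 63.972961, 2.776588),
    (16, 90.132777, 3.27203),
    (32, 90.533301, 3.271969),
    (64, 90.547993, 3.332716),
    (128, 90.130061, 3.303833),
    (256, 89.474756, 3.333387),
]

if __name__ == "__main__":
    for batch_size, enabled, baseline in VGG16_CENTOS:
        pct = boost_up_percent(enabled, baseline)
        print("%3d | %.2f%%" % (batch_size, pct))
```

The same function applies unchanged to every other table in this comment.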

Inception-v3

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 58.965411 | 6.244512 | 944.28%
16 | 168.280915 | 6.566202 | 2562.83%
32 | 167.823787 | 6.421525 | 2613.46%
64 | 168.746333 | 6.585618 | 2562.35%
128 | 166.841938 | 6.453535 | 2585.28%
256 | 162.761511 | 6.484705 | 2509.93%

Inception-v4

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 32.362458 | 3.310546 | 977.56%
16 | 84.847819 | 3.393066 | 2500.62%
32 | 85.549374 | 3.379569 | 2531.37%
64 | 86.123905 | 3.335553 | 2582.00%
128 | 85.134901 | 3.334666 | 2553.03%
256 | 83.655486 | 3.330463 | 2511.83%

ResNet-50

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 83.434864 | 11.020557 | 757.08%
16 | 194.102224 | 11.092527 | 1749.85%
32 | 197.600266 | 10.904773 | 1812.05%
64 | 199.251137 | 10.746266 | 1854.14%
128 | 198.108861 | 10.732905 | 1845.81%
256 | 196.444539 | 10.638787 | 1846.49%

MobileNet

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 263.504341 | 27.284977 | 965.75%
16 | 607.443174 | 27.705262 | 2192.52%
32 | 614.830145 | 26.904616 | 2285.22%
64 | 644.903928 | 26.844882 | 2402.33%
128 | 621.659484 | 26.381861 | 2356.39%
256 | 605.399741 | 26.354961 | 2297.10%

**On macOS**, the default build configuration disables OpenMP. The tables 
below list performance data for a build with MKLDNN (OpenMP enabled) and a 
build without MKLDNN.
**_(Hardware is an iMac Pro with one 8-core Xeon W socket and 32 GB of DDR4 memory)_**

VGG16

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 20.913986 | 7.821254 | 267.40%
16 | 24.273071 | 8.438211 | 287.66%
32 | 24.704907 | 8.480799 | 291.30%
64 | 24.94608 | 8.524874 | 292.63%
128 | 25.074148 | 8.53283 | 293.86%
256 | 25.2629 | 8.535707 | 295.97%

Inception-v3

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 41.431404 | 10.323434 | 401.33%
16 | 54.312317 | 10.665803 | 509.22%
32 | 54.604119 | 10.621378 | 514.10%
64 | 54.39568 | 10.605843 | 512.88%
128 | 54.410785 | 10.62466 | 512.12%
256 | 54.614424 | 10.616772 | 514.42%

Inception-v4

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 20.715221 | 5.655873 | 366.26%
16 | 26.249734 | 5.779357 | 454.20%
32 | 26.197659 | 5.761883 | 454.67%
64 | 26.16153 | 5.771389 | 453.30%
128 | 26.247461 | 5.778834 | 454.20%
256 | 26.313875 | 5.77839 | 455.38%

ResNet-50

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 41.70109 | 19.246681 | 216.67%
16 | 43.132788 | 20.854712 | 206.83%
32 | 41.613291 | 20.570733 | 202.29%
64 | 38.13329 | 20.652445 | 184.64%
128 | 38.839577 | 20.685878 | 187.76%
256 | 38.853521 | 20.68953 | 187.79%

MobileNet

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 200.91608 | 36.047475 | 557.37%
16 | 287.614019 | 37.224849 | 772.64%
32 | 277.838051 | 36.914548 | 752.65%
64 | 274.474078 | 36.939298 | 743.04%
128 | 273.622323 | 37.04172 | 738.69%
256 | 273.445636 | 36.947783 | 740.09%



[ Full content available at: 
https://github.com/apache/incubator-mxnet/pull/12591 ]