**Benchmark data**

The benchmark data was collected on Linux and macOS and compares builds with and without MKLDNN. Because computation on a build without MKLDNN is very slow, performance data is listed only for selected CNN models. The benchmarking script is based on example/image-classification/benchmark_score.py.

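
As a reading aid for the `boost-up` column in the tables that follow, here is a minimal sketch of how that figure appears to be derived. It assumes the column is the MKLDNN-enabled score divided by the score without MKLDNN, expressed as a percentage, which is consistent with the listed rows. The function name `boost_up` is hypothetical and not part of the benchmark script.

```python
def boost_up(with_mkldnn: float, without_mkldnn: float) -> float:
    """Return the MKLDNN-enabled score as a percentage of the baseline score.

    Assumption: this mirrors how the `boost-up` column was computed;
    the original post does not state the formula explicitly.
    """
    if without_mkldnn <= 0:
        raise ValueError("baseline score must be positive")
    return with_mkldnn / without_mkldnn * 100.0


# VGG16, batch size 1, CentOS 7.4 row from the tables
vgg16_bs1 = boost_up(63.972961, 2.776588)
print(f"{vgg16_bs1:.2f}%")  # prints 2304.01%, matching the listed value
```

Under this reading, a value such as 2304.01% means the MKLDNN build scores roughly 23 times the baseline, not that it is 2304% faster on top of the baseline.
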
**On CentOS 7.4**, MXNet is installed with pip, that is, `pip install mxnet==1.3.0` vs. `pip install mxnet-mkl==1.3.0`.
**_(Benchmarking is executed on a 1-socket Xeon SKX-8180, 28 cores, with 192 GB DDR4-2666 memory)_**

VGG16

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 63.972961 | 2.776588 | 2304.01%
16 | 90.132777 | 3.27203 | 2754.64%
32 | 90.533301 | 3.271969 | 2766.94%
64 | 90.547993 | 3.332716 | 2716.94%
128 | 90.130061 | 3.303833 | 2728.05%
256 | 89.474756 | 3.333387 | 2684.20%

Inception-v3

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 58.965411 | 6.244512 | 944.28%
16 | 168.280915 | 6.566202 | 2562.83%
32 | 167.823787 | 6.421525 | 2613.46%
64 | 168.746333 | 6.585618 | 2562.35%
128 | 166.841938 | 6.453535 | 2585.28%
256 | 162.761511 | 6.484705 | 2509.93%

Inception-v4

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 32.362458 | 3.310546 | 977.56%
16 | 84.847819 | 3.393066 | 2500.62%
32 | 85.549374 | 3.379569 | 2531.37%
64 | 86.123905 | 3.335553 | 2582.00%
128 | 85.134901 | 3.334666 | 2553.03%
256 | 83.655486 | 3.330463 | 2511.83%

ResNet-50

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 83.434864 | 11.020557 | 757.08%
16 | 194.102224 | 11.092527 | 1749.85%
32 | 197.600266 | 10.904773 | 1812.05%
64 | 199.251137 | 10.746266 | 1854.14%
128 | 198.108861 | 10.732905 | 1845.81%
256 | 196.444539 | 10.638787 | 1846.49%

MobileNet

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 263.504341 | 27.284977 | 965.75%
16 | 607.443174 | 27.705262 | 2192.52%
32 | 614.830145 | 26.904616 | 2285.22%
64 | 644.903928 | 26.844882 | 2402.33%
128 | 621.659484 | 26.381861 | 2356.39%
256 | 605.399741 | 26.354961 | 2297.10%

**On macOS,** the default build configuration disables OpenMP. The tables below list the performance data for a build with MKLDNN (OpenMP enabled) and a build without MKLDNN.
**_(HW is an iMac Pro with one socket, 8-core Xeon-W, and 32 GB DDR4 memory)_**

VGG16

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 20.913986 | 7.821254 | 267.40%
16 | 24.273071 | 8.438211 | 287.66%
32 | 24.704907 | 8.480799 | 291.30%
64 | 24.94608 | 8.524874 | 292.63%
128 | 25.074148 | 8.53283 | 293.86%
256 | 25.2629 | 8.535707 | 295.97%

Inception-v3

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 41.431404 | 10.323434 | 401.33%
16 | 54.312317 | 10.665803 | 509.22%
32 | 54.604119 | 10.621378 | 514.10%
64 | 54.39568 | 10.605843 | 512.88%
128 | 54.410785 | 10.62466 | 512.12%
256 | 54.614424 | 10.616772 | 514.42%

Inception-v4

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 20.715221 | 5.655873 | 366.26%
16 | 26.249734 | 5.779357 | 454.20%
32 | 26.197659 | 5.761883 | 454.67%
64 | 26.16153 | 5.771389 | 453.30%
128 | 26.247461 | 5.778834 | 454.20%
256 | 26.313875 | 5.77839 | 455.38%

ResNet-50

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 41.70109 | 19.246681 | 216.67%
16 | 43.132788 | 20.854712 | 206.83%
32 | 41.613291 | 20.570733 | 202.29%
64 | 38.13329 | 20.652445 | 184.64%
128 | 38.839577 | 20.685878 | 187.76%
256 | 38.853521 | 20.68953 | 187.79%

MobileNet

batch size | MKLDNN-enabled | w.o. MKLDNN | boost-up
-- | -- | -- | --
1 | 200.91608 | 36.047475 | 557.37%
16 | 287.614019 | 37.224849 | 772.64%
32 | 277.838051 | 36.914548 | 752.65%
64 | 274.474078 | 36.939298 | 743.04%
128 | 273.622323 | 37.04172 | 738.69%
256 | 273.445636 | 36.947783 | 740.09%

[ Full content available at: https://github.com/apache/incubator-mxnet/pull/12591 ]
