juliusshufan edited a comment on issue #12591: USE_MKLDNN=1 is default in make build (mkldnn must be explicitly turned off)
URL: https://github.com/apache/incubator-mxnet/pull/12591#issuecomment-424968127

**RNN related data, including both accuracy and performance/benchmarking.**

**Accuracy**

1. **_A GNMT model_** implemented by gluon-nlp (`scripts\nmt\train_gnmt.py`), trained on the IWSLT2015 dataset for en-vi translation. The encoder-decoder is a 2-layer LSTM. Because the model implementation uses gluon.rnncell, which is an unfused kernel, this test covers the MKLDNN FC operator. The figure below shows the perplexity (ppl) trends collected on both GPU and CPU with the same hyper-parameters; the two curves align very well.

2. **_A simple RNN model_** provided by the official MXNET repo (/example/rnn/bucketing), implemented with the RNN symbol API. The training tests use a 3-layer LSTM/GRU RNN model with the fused RNN kernel on CPU and GPU, and compare the training curves.

**Benchmarking**

Thanks to the new Gluon RNN API features released in MXNET 1.3.0, dummy-data benchmarks were run with the fused and unfused Gluon RNN APIs respectively, using MXNET with MKLDNN as the backend.
The benchmarking uses a series of predefined input shapes with a 1-layer LSTM.

Input Shape (N, T, C, Input Size) | Fused | Unfused | Boost
-- | -- | -- | --
[64, 15, 500, 500] | 2917.237852 | 1667.527 | 174.94%
[64, 20, 500, 500] | 3661.45311 | 1196.497 | 306.01%
[64, 25, 500, 500] | 3288.546223 | 855.2861 | 384.50%
[64, 30, 500, 500] | 2913.375177 | 660.5786 | 441.03%
[64, 35, 500, 500] | 2581.44028 | 519.6848 | 496.73%
[64, 40, 500, 500] | 2479.42023 | 714.7851 | 346.88%
[64, 45, 500, 500] | 2300.442591 | 625.1124 | 368.00%
[64, 50, 500, 500] | 2160.407494 | 549.2164 | 393.36%
[16, 25, 512, 512] | 1067.593284 | 332.028 | 321.54%
[32, 25, 512, 512] | 1830.461068 | 649.8168 | 281.69%
[64, 25, 512, 512] | 2827.429465 | 1187.243 | 238.15%
[128, 25, 512, 512] | 3938.397784 | 1547.932 | 254.43%
[16, 25, 1024, 1024] | 231.900727 | 154.7335 | 149.87%
[32, 25, 1024, 1024] | 429.570455 | 298.2182 | 144.05%
[64, 25, 1024, 1024] | 744.384772 | 480.4162 | 154.95%
[128, 25, 1024, 1024] | 1204.706856 | 696.3014 | 173.02%
[16, 25, 2048, 2048] | 52.323166 | 40.81776 | 128.19%
[32, 25, 2048, 2048] | 101.108405 | 78.72398 | 128.43%
[64, 25, 2048, 2048] | 181.117374 | 131.4923 | 137.74%
[128, 25, 2048, 2048] | 315.360515 | 223.4272 | 141.15%
[16, 25, 4096, 4096] | 12.326611 | 9.575337 | 128.73%
[32, 25, 4096, 4096] | 24.255487 | 18.75816 | 129.31%
[64, 25, 4096, 4096] | 44.229753 | 34.00344 | 130.07%
[128, 25, 4096, 4096] | 78.146907 | 64.36427 | 121.41%
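
The comment does not define the Boost column, but the reported values are consistent with the Fused figure divided by the Unfused figure, expressed as a percentage. The following is a minimal Python sketch under that assumption, recomputing Boost for three rows copied from the table. The helper name `boost_percent` is illustrative and is not part of the original benchmark script; the unit of the Fused and Unfused columns is not stated in the source and does not affect the ratio.

```python
def boost_percent(fused: float, unfused: float) -> float:
    """Return fused / unfused as a percentage (assumed definition of Boost)."""
    if unfused <= 0:
        raise ValueError("unfused must be positive")
    return fused / unfused * 100.0


# (input shape, fused, unfused, reported boost in %) copied from the table.
rows = [
    ("[64, 15, 500, 500]", 2917.237852, 1667.527, 174.94),
    ("[64, 50, 500, 500]", 2160.407494, 549.2164, 393.36),
    ("[128, 25, 4096, 4096]", 78.146907, 64.36427, 121.41),
]

for shape, fused, unfused, reported in rows:
    computed = boost_percent(fused, unfused)
    # Print both values side by side so any mismatch is easy to spot.
    print(f"{shape}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Read this way, a Boost of 100% would mean no difference between the fused and unfused paths, so every row in the table shows the fused path ahead.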
