ElaineBao opened a new pull request #15664: add int8 bn mkldnn implementation and test
URL: https://github.com/apache/incubator-mxnet/pull/15664

## Description ##
Add a new operator, int8 batch norm, with an MKL-DNN implementation and test.

## Details ##

### Usage ###
1. Follow the doc at https://github.com/apache/incubator-mxnet/tree/master/example/quantization/README.md to quantize models and run inference.
2. To use standalone int8 batch norm instead of fused fp32 batch norm, `export MXNET_DISABLE_MKLDNN_FUSE_CONV_BN=1` before using `imagenet_gen_qsym_mkldnn.py` to quantize the model.
3. We suggest keeping batch norm fused whenever it can be fused, since the fused path is faster and more accurate; otherwise, use standalone int8 bn.

### Limitation ###
1. Currently int8 bn only supports s8 input, since mkldnn batch norm only supports s8 input.
2. Currently int8 bn does not support `calib_mode=none`: when the thresholds are calculated on the fly with s8 input, the errors are large. Running with `calib_mode=naive/entropy` should give accuracy similar to the fp32 model.

### Performance ###
I tested several models on Skylake, which can be used for reference.

Models | FP32 Acc | INT8 (fuse fp32 bn) Acc | INT8 (standalone int8 bn) Acc
-|-|-|-
Resnet50 V1 | 0.765 | 0.760 | 0.751
Mobilenet1.0 | 0.722 | 0.720 | 0.677
Inception V3 | 0.782 | 0.782 | 0.772
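The usage steps above can be sketched as a shell session. This is a minimal sketch: the environment variable name comes from this PR, while the quantization command (shown as a comment) and its `--model`/`--calib-mode` flags follow the quantization README and should be treated as illustrative.

```shell
# Step 2 above: pick standalone int8 BN instead of fused fp32 BN by
# disabling the conv+BN fusion pass before quantizing.
export MXNET_DISABLE_MKLDNN_FUSE_CONV_BN=1

# Step 1/limitation 2: quantize with a calibrated mode, since
# calib_mode=none is not supported for int8 BN. Illustrative command,
# per example/quantization/README.md:
#   python imagenet_gen_qsym_mkldnn.py --model=resnet50_v1 --calib-mode=naive

# Confirm the toggle is set for the quantization run.
echo "$MXNET_DISABLE_MKLDNN_FUSE_CONV_BN"
```

Unsetting the variable (or leaving it at its default) restores the fused fp32 batch-norm path, which the PR recommends whenever fusion applies.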
