ElaineBao opened a new pull request #15664: add int8 bn mkldnn implementation and test
URL: https://github.com/apache/incubator-mxnet/pull/15664
 
 
   ## Description ##
   Add a new operator: int8 batch norm, with an MKL-DNN implementation and tests.
   
   ## Details ##
   ### Usage ###
   1. Check the doc at https://github.com/apache/incubator-mxnet/tree/master/example/quantization/README.md to quantize models and run inference.
   2. To use standalone int8 batch norm instead of fused fp32 batch norm, run `export MXNET_DISABLE_MKLDNN_FUSE_CONV_BN=1` before using `imagenet_gen_qsym_mkldnn.py` to quantize the model.
   3. We suggest using fused BN whenever fusion is possible, since it is faster and more accurate; otherwise, use standalone int8 BN.
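The steps above can be sketched as a small driver script. This is only a sketch: the environment variable is the one named above, but the script flags are illustrative and may differ across MXNet versions (see the quantization README for the real ones).

```python
import os

# Disable conv+BN fusion so the standalone int8 batch norm path is used.
# This must be set before the quantization script is launched.
os.environ["MXNET_DISABLE_MKLDNN_FUSE_CONV_BN"] = "1"

# Hypothetical invocation; the flag names follow
# example/quantization/README.md and may differ by version.
cmd = [
    "python", "imagenet_gen_qsym_mkldnn.py",
    "--model", "resnet50_v1",
    "--calib-mode", "naive",
]
# import subprocess; subprocess.run(cmd, check=True)  # uncomment to run
print(os.environ["MXNET_DISABLE_MKLDNN_FUSE_CONV_BN"])
```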
   
   ### Limitation ###
   1. Currently int8 BN only supports s8 input, since the MKL-DNN batch norm primitive only supports s8 input.
   2. Currently int8 BN does not support `calib_mode=none`, because computing the thresholds on the fly with s8 input produces large errors. Run with `calib_mode=naive` or `calib_mode=entropy` instead, which should give accuracy similar to the fp32 model.
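To illustrate why a calibrated threshold matters, here is a minimal NumPy sketch of symmetric s8 quantization with a "naive" (max-abs) calibration threshold. This is illustrative only, not the MKL-DNN kernel; the function name and data are made up.

```python
import numpy as np

def naive_s8_quantize(x, threshold):
    """Symmetric s8 quantization: map [-threshold, threshold] to [-127, 127]."""
    scale = 127.0 / threshold
    q = np.clip(np.round(x * scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
x = rng.normal(size=10000).astype(np.float32)

# "naive" calibration: threshold = max(|x|) observed on calibration data
threshold = float(np.max(np.abs(x)))
q, scale = naive_s8_quantize(x, threshold)

# Dequantize and measure the mean absolute quantization error
dequant = q.astype(np.float32) / scale
err = float(np.mean(np.abs(dequant - x)))
print(err)
```

With a well-chosen threshold the round-trip error stays small; a threshold estimated on the fly from already-quantized s8 input compounds this error, which is the limitation described above.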
   
   ### Performance ###
   I tested several models on Skylake; the results below can be used for reference.
   
   Models | FP32 Acc |  INT8 (fuse fp32 bn)  Acc | INT8 (standalone int8 bn) Acc
   -|-|-|-
   Resnet50 V1 | 0.765 | 0.760 | 0.751
   Mobilenet1.0 | 0.722 | 0.720 | 0.677
   Inception V3 | 0.782 | 0.782 | 0.772
   
