ZhennanQin commented on a change in pull request #13697: [MKLDNN] Enable signed
int8 support for convolution.
URL: https://github.com/apache/incubator-mxnet/pull/13697#discussion_r244628443
##########
File path: example/quantization/imagenet_gen_qsym_mkldnn.py
##########
@@ -140,8 +140,8 @@ def save_params(fname, arg_params, aux_params,
logger=None):
' thresholds. This mode is expected to produce
the best inference accuracy of all three'
' kinds of quantized models if the calibration
dataset is representative enough of the'
' inference dataset.')
- parser.add_argument('--quantized-dtype', type=str, default='uint8',
- choices=['int8', 'uint8'],
+ parser.add_argument('--quantized-dtype', type=str, default='auto',
Review comment:
> Do we do this for the first negative input into subgraphs handled by
mkldnn? If so is there an advantage to doing the reorder into a signed type as
opposed to unsigned?
In MXNet, the transformation from fp32 to uint8/int8 is done by the quantize
operator, which is backed by an MKL-DNN reorder. The user can decide which
out_type the `_contrib_quantize_v2` operator uses via the quantize script
option `--quantized-dtype`; the valid choices are 'uint8', 'int8' and 'auto'.
For 'uint8', every `_contrib_quantize_v2` operator will try to transform data
from fp32 to uint8. It's best to use this mode only when there are no negative
values in the network, e.g. resnet excluding the first convolution.
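As a rough illustration (not the actual MKL-DNN reorder code), the uint8 mode maps a calibrated non-negative fp32 range onto [0, 255]; the function name and calibration parameter below are hypothetical:

```python
import numpy as np

def quantize_uint8(data, max_calib):
    # Hypothetical sketch: map non-negative fp32 values in
    # [0, max_calib] onto the full uint8 range [0, 255].
    scale = 255.0 / max_calib
    q = np.clip(np.round(data * scale), 0, 255).astype(np.uint8)
    return q, scale

q, scale = quantize_uint8(np.array([0.0, 0.5, 1.0], dtype=np.float32),
                          max_calib=1.0)
# q == [0, 128, 255]
```

Any negative input would be clipped to 0 here, which is why this mode only suits non-negative activations.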
For 'int8', every `_contrib_quantize_v2` operator will try to transform data
from fp32 to int8. This mode is the most adaptable, since it can handle both
positive and negative values, but the disadvantage is that for non-negative
data only 7 bits are used, causing accuracy loss, and the performance is a bit
slower.
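The 7-bit effect can be sketched the same way (again a hypothetical helper, not the library implementation): symmetric int8 quantization maps [-abs_max, abs_max] onto [-127, 127], so non-negative inputs only ever reach [0, 127].

```python
import numpy as np

def quantize_int8(data, abs_max_calib):
    # Hypothetical sketch: symmetric int8 quantization maps
    # [-abs_max, abs_max] onto [-127, 127]; non-negative inputs
    # therefore land in [0, 127], i.e. only 7 of the 8 bits.
    scale = 127.0 / abs_max_calib
    q = np.clip(np.round(data * scale), -127, 127).astype(np.int8)
    return q, scale

q, _ = quantize_int8(np.array([0.0, 1.0], dtype=np.float32),
                     abs_max_calib=1.0)
# q == [0, 127] -- half the resolution of the uint8 case
```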
For 'auto', each `_contrib_quantize_v2` operator will transform data from fp32
to either uint8 or int8 on demand: if the calibration result shows that a
layer's output has negative values, its out_type will be int8; otherwise it
will be uint8. int8 is then used only for data with negative values, so we get
the best accuracy and performance by mixing uint8 and int8 across layers.
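The 'auto' rule described above boils down to a per-layer check on the calibrated minimum; a minimal sketch (function name is hypothetical):

```python
def choose_out_type(calib_min):
    # Sketch of the 'auto' rule: a negative calibrated minimum
    # forces int8, otherwise uint8 keeps full 8-bit resolution.
    return 'int8' if calib_min < 0 else 'uint8'

print(choose_out_type(-0.3))  # int8
print(choose_out_type(0.0))   # uint8
```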
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services