ZhennanQin commented on a change in pull request #13697: [MKLDNN] Enable signed 
int8 support for convolution.
URL: https://github.com/apache/incubator-mxnet/pull/13697#discussion_r244628443
 
 

 ##########
 File path: example/quantization/imagenet_gen_qsym_mkldnn.py
 ##########
 @@ -140,8 +140,8 @@ def save_params(fname, arg_params, aux_params, 
logger=None):
                              ' thresholds. This mode is expected to produce 
the best inference accuracy of all three'
                              ' kinds of quantized models if the calibration 
dataset is representative enough of the'
                              ' inference dataset.')
-    parser.add_argument('--quantized-dtype', type=str, default='uint8',
-                        choices=['int8', 'uint8'],
+    parser.add_argument('--quantized-dtype', type=str, default='auto',
 
 Review comment:
   > Do we do this for the first negative input into subgraphs handled by 
mkldnn? If so is there an advantage to doing the reorder into a signed type as 
opposed to unsigned?
   
   In MXNet, the transformation from fp32 to uint8/int8 is done by the `quantize` operator, which is backed by an MKL-DNN reorder. Users choose the out_type of the `quantize` operator with the quantization script option `--quantized-dtype`; the valid choices are 'uint8', 'int8' and 'auto'.
   For 'uint8', every `quantize` operator transforms data from fp32 to uint8. This mode is best used only when there are no negative values in the network, e.g. ResNet excluding the first convolution.
   For 'int8', every `quantize` operator transforms data from fp32 to int8. This is the most general choice, since it can handle both positive and negative values. The disadvantage is that for non-negative data only 7 bits are used, which costs accuracy, and performance is also a bit slower.
   For 'auto', each `quantize` operator transforms data from fp32 to either uint8 or int8 as needed. If the calibration result shows that a layer's output contains negative values, its out_type will be int8; otherwise it will be uint8. int8 is then used only where negative values actually occur, so we get the best accuracy and performance by mixing uint8 and int8 across layers.
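   The per-layer decision described above can be sketched roughly as follows. This is a minimal illustration, not the actual MXNet implementation; the function name `choose_out_type` and its parameters are hypothetical.

```python
def choose_out_type(calib_min, quantized_dtype='auto'):
    """Pick the quantization out_type for one layer (hypothetical sketch).

    calib_min       -- minimum activation value observed during calibration
    quantized_dtype -- 'uint8', 'int8', or 'auto' (mirrors --quantized-dtype)
    """
    if quantized_dtype in ('uint8', 'int8'):
        # Explicit choice: apply the same out_type to every layer.
        return quantized_dtype
    # 'auto': fall back to int8 only when calibration saw negative values;
    # otherwise keep the full 8 bits of precision that uint8 offers.
    return 'int8' if calib_min < 0 else 'uint8'
```

   With this policy, a ReLU-activated layer (calibrated minimum of 0) quantizes to uint8, while a layer whose calibrated minimum is negative quantizes to int8.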
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services