rajeshii commented on a change in pull request #14094: Enhance gpu quantization
URL: https://github.com/apache/incubator-mxnet/pull/14094#discussion_r259658256
 
 

 ##########
 File path: python/mxnet/contrib/quantization.py
 ##########
 @@ -499,6 +499,9 @@ def quantize_model(sym, arg_params, aux_params,
     if quantized_dtype not in ('int8', 'uint8'):
         raise ValueError('unknown quantized_dtype %s received,'
                          ' expected `int8` or `uint8`' % quantized_dtype)
+    if quantized_dtype == 'uint8' and ctx != cpu():
+        raise ValueError('currently, uint8 quantization is only supported by CPU,'
+                         ' please switch to the context of CPU or int8 data type for GPU')
 
 Review comment:
   I don't think this modification can work, since an **infer type** error (`mxnet.base.MXNetError: [02:07:55] /home/ubuntu/experimentals/1.4_release/src/operator/quantization/../tensor/matrix_op-inl.h:250: Check failed: src.type_flag_ == ret.type_flag_ (3 vs. 5)`) will occur before `QuantizeCompute` is reached, and we cannot get the ctx information during the `infer` stage. So I think it's good to interrupt this action during the calibration stage.
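
   For illustration, here is a minimal sketch of how the guard proposed in the diff above would fail fast for a user at the calibration stage, instead of surfacing the infer-type error later. The checkpoint name and epoch are hypothetical placeholders, and the other `quantize_model` arguments are left at their defaults:

```python
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Hypothetical checkpoint; substitute any symbol/params of your own.
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet50', 0)

try:
    qsym, qarg_params, qaux_params = quantize_model(
        sym, arg_params, aux_params,
        ctx=mx.gpu(0),            # uint8 + GPU is the rejected combination
        quantized_dtype='uint8')
except ValueError as e:
    # Expected with this patch: 'currently, uint8 quantization is only
    # supported by CPU, ...' raised before any graph execution begins.
    print(e)
```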
