wyc-ruiker opened a new pull request #8243:
URL: https://github.com/apache/tvm/pull/8243


   This PR fixes a bug reported in [8319](https://discuss.tvm.apache.org/t/error-when-quantizing-mobilenetv2-model-from-mxnet/8319) and [6346](https://discuss.tvm.apache.org/t/quantization-mobilenetv2-quantization-failed/6346).
   
   In MobileNetV2, a conv2d layer follows global_avg_pool2d. With the current quantization strategy, however, stop_quantize() is called as soon as global_avg_pool2d is encountered:
   
https://github.com/apache/tvm/blob/d69011dea6a09960b38d36f679888a6e29c24240/python/tvm/relay/quantize/_annotate.py#L395-L410
   
   During the realize step, the conv2d behind global_avg_pool2d still enters the Conv2dRealize function, which causes the quantization of MobileNetV2 to fail:
   
https://github.com/apache/tvm/blob/d69011dea6a09960b38d36f679888a6e29c24240/src/relay/quantize/realize.cc#L206-L215
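   To illustrate the failure mode, here is a minimal sketch (not a test from this PR; the shapes, variable names, and qconfig settings are made up) that mimics MobileNetV2's classifier head, i.e. a conv2d consuming the output of global_avg_pool2d:

```python
# Minimal sketch only: a toy graph with conv2d -> global_avg_pool2d -> conv2d,
# quantized with a hypothetical global-scale qconfig.
import numpy as np
import tvm
from tvm import relay

data = relay.var("data", shape=(1, 16, 8, 8), dtype="float32")
w0 = relay.var("w0", shape=(16, 16, 3, 3), dtype="float32")
w1 = relay.var("w1", shape=(10, 16, 1, 1), dtype="float32")

conv0 = relay.nn.conv2d(data, w0, kernel_size=(3, 3), padding=(1, 1), channels=16)
pool = relay.nn.global_avg_pool2d(conv0)
# conv2d after global_avg_pool2d, like MobileNetV2's classifier head
conv1 = relay.nn.conv2d(pool, w1, kernel_size=(1, 1), channels=10)

mod = tvm.IRModule.from_expr(relay.Function(relay.analysis.free_vars(conv1), conv1))
params = {
    "w0": np.random.uniform(-1, 1, (16, 16, 3, 3)).astype("float32"),
    "w1": np.random.uniform(-1, 1, (10, 16, 1, 1)).astype("float32"),
}

with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
    # Annotation stops at global_avg_pool2d, so conv1 is left unannotated and
    # its realization runs into the check failure shown above.
    qmod = relay.quantize.quantize(mod, params)
```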
   
   The same bug also occurs when skip_conv_layers is used to skip layers other than layer 0, as sketched below.
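   A similar hypothetical sketch for that trigger (again, the graph and settings are made up for illustration): two stacked conv2d layers, with quantization told to skip layer 1 instead of the default layer 0:

```python
# Minimal sketch only: two stacked conv2d layers, skipping layer 1.
import numpy as np
import tvm
from tvm import relay

data = relay.var("data", shape=(1, 8, 8, 8), dtype="float32")
w0 = relay.var("w0", shape=(8, 8, 3, 3), dtype="float32")
w1 = relay.var("w1", shape=(8, 8, 3, 3), dtype="float32")

c0 = relay.nn.conv2d(data, w0, kernel_size=(3, 3), padding=(1, 1), channels=8)
c1 = relay.nn.conv2d(c0, w1, kernel_size=(3, 3), padding=(1, 1), channels=8)

mod = tvm.IRModule.from_expr(relay.Function(relay.analysis.free_vars(c1), c1))
params = {k: np.random.uniform(-1, 1, (8, 8, 3, 3)).astype("float32")
          for k in ("w0", "w1")}

with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0,
                            skip_conv_layers=[1]):
    # The skipped conv2d (layer 1) is never annotated, so its realization
    # runs into the same failed check.
    qmod = relay.quantize.quantize(mod, params)
```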
   
   Here is a simple modification modeled on MulRealize and AddRealize. With it, all the networks in [quantization](https://github.com/apache/tvm/tree/main/tests/python/nightly/quantization) currently run without errors. However, the networks in [test_quantization_accuracy.py](https://github.com/apache/tvm/blob/main/tests/python/nightly/quantization/test_quantization_accuracy.py) do not yet reach the previous accuracy results, and because there is no AutoTVM log they currently run very slowly.
   
   Could you help review this PR? @masahi @ZihengJiang @tqchen

