wyc-ruiker opened a new pull request #8243: URL: https://github.com/apache/tvm/pull/8243
This is a bug mentioned in [8319](https://discuss.tvm.apache.org/t/error-when-quantizing-mobilenetv2-model-from-mxnet/8319) and [6346](https://discuss.tvm.apache.org/t/quantization-mobilenetv2-quantization-failed/6346). In MobileNetV2 there is a conv2d layer after global_avg_pool2d, but under the current quantization strategy `stop_quantize()` is called as soon as global_avg_pool2d is encountered:

https://github.com/apache/tvm/blob/d69011dea6a09960b38d36f679888a6e29c24240/python/tvm/relay/quantize/_annotate.py#L395-L410

During the Realize pass, the conv2d that follows global_avg_pool2d still enters the Conv2dRealize function, which causes the quantization of MobileNetV2 to fail:

https://github.com/apache/tvm/blob/d69011dea6a09960b38d36f679888a6e29c24240/src/relay/quantize/realize.cc#L206-L215

The same bug also occurs when `skip_conv_layers` is used to skip layers other than layer 0.

This PR applies a simple modification modeled on MulRealize and AddRealize. With it, all the networks in [quantization](https://github.com/apache/tvm/tree/main/tests/python/nightly/quantization) now run without error. However, the accuracy of the networks in [test_quantization_accuracy.py](https://github.com/apache/tvm/blob/main/tests/python/nightly/quantization/test_quantization_accuracy.py) does not yet match the previous results, and because there is no AutoTVM log they currently run very slowly.

Could you help review this PR? @masahi @ZihengJiang @tqchen
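For readers unfamiliar with the realize pass, here is a minimal, hypothetical sketch (not TVM's actual classes) of the dispatch pattern the fix borrows from MulRealize/AddRealize: when no input is a realized quantized expression, the realize function returns nothing and leaves the original float op untouched, instead of assuming quantized inputs and failing:

```python
from dataclasses import dataclass

# Hypothetical stand-in for TVM's QRealizeIntExpr: a value that has
# already been moved into the quantized (integer) domain, carrying a
# domain scale. Plain Python floats model unquantized float tensors.
@dataclass
class QRealizeIntExpr:
    data: float
    dom_scale: float

def conv2d_realize(lhs, rhs):
    """Sketch of Conv2dRealize after the fix.

    Before the fix, Conv2dRealize assumed its inputs were always
    realized, so a conv2d sitting after global_avg_pool2d (where
    quantization has stopped) made the pass fail. Following the
    MulRealize/AddRealize pattern, unrealized inputs now fall through.
    """
    lhs_q = isinstance(lhs, QRealizeIntExpr)
    rhs_q = isinstance(rhs, QRealizeIntExpr)
    if lhs_q and rhs_q:
        # Both inputs are in the quantized domain: compute on the
        # integer-domain data and combine the domain scales.
        # (A real conv2d is modeled here as a single multiply.)
        return QRealizeIntExpr(lhs.data * rhs.data,
                               lhs.dom_scale * rhs.dom_scale)
    if not lhs_q and not rhs_q:
        # Neither input is quantized (e.g. downstream of
        # global_avg_pool2d): return None, meaning "keep the
        # original float op unchanged".
        return None
    raise ValueError("mixed realized/unrealized inputs are not handled")
```

The point of the pattern is that "stopped quantization" propagates naturally: once an op forwards unrealized values, every realize function downstream takes the same fall-through branch.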
