masahi commented on issue #13545: URL: https://github.com/apache/tvm/issues/13545#issuecomment-1340813002
I made some progress on debugging this issue. There is a small problem in the way the model is quantized: the scale and zero point (zp) of the weights and bias are tensors of shape `(1,)`, so Relay interprets this as per-channel quantization. But the output channel size and the size of the scale don't match (32 vs 1), which leads to a huge number of type errors from your model. If you are doing per-tensor quantization, can you update your model so that the scale and zp of the weights and bias are stored as scalars, like the activations?

<img width="497" alt="スクリーンショット 2022-12-07 9 11 30" src="https://user-images.githubusercontent.com/1776403/206163467-52da89d9-d4a7-42f1-ab73-35e4c31554e9.png">
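As a minimal numpy sketch of the distinction (the names `ambiguous_scale`, `fixed_scale`, and the channel count used here are illustrative, not taken from the model in question): a rank-1 tensor of shape `(1,)` signals per-channel quantization, so its length must equal the number of output channels, while a rank-0 scalar signals per-tensor quantization.

```python
import numpy as np

# Shape-(1,) scale: read as per-channel quantization with one channel.
ambiguous_scale = np.array([0.05], dtype=np.float32)

# With 32 output channels, per-channel quantization would require 32
# scales, so the shape-(1,) tensor fails the type check (32 vs 1).
num_output_channels = 32
assert ambiguous_scale.shape != (num_output_channels,)

# Reshaping to rank 0 makes it a scalar, i.e. unambiguously per-tensor.
fixed_scale = ambiguous_scale.reshape(())  # or np.squeeze(ambiguous_scale)
assert fixed_scale.ndim == 0
```

The same reshaping applies to the zero-point tensors.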
