masahi commented on issue #13545: URL: https://github.com/apache/tvm/issues/13545#issuecomment-1340813002
I made some progress on debugging this issue. There is a small problem in the way the model is quantized: the scale and zero point (zp) of the weights and bias are tensors of shape `(1,)`, so Relay interprets this as per-channel quantization. But the output channel size and the size of the scale don't match (32 vs 1), which leads to a huge number of type errors from your model. If you are doing per-tensor quantization, can you update your model so that the scale and zp of the weights and bias are stored as scalars, like the activations?

<img width="497" alt="スクリーンショット 2022-12-07 9 11 30" src="https://user-images.githubusercontent.com/1776403/206163467-52da89d9-d4a7-42f1-ab73-35e4c31554e9.png">
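As a minimal numpy sketch of the distinction (the names `ambiguous_scale`, `fixed_scale`, and the channel count used here are illustrative, not taken from the model in question): a rank-1 tensor of shape `(1,)` signals per-channel quantization, so its length must equal the number of output channels, while a rank-0 scalar signals per-tensor quantization.

```python
import numpy as np

# Shape-(1,) scale: read as per-channel quantization with one channel.
ambiguous_scale = np.array([0.05], dtype=np.float32)

# With 32 output channels, per-channel quantization would require 32
# scales, so the shape-(1,) tensor fails the type check (32 vs 1).
num_output_channels = 32
assert ambiguous_scale.shape != (num_output_channels,)

# Reshaping to rank 0 makes it a scalar, i.e. unambiguously per-tensor.
fixed_scale = ambiguous_scale.reshape(())  # or np.squeeze(ambiguous_scale)
assert fixed_scale.ndim == 0
```

The same reshaping applies to the zero-point tensors.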
