guberti commented on issue #13364: URL: https://github.com/apache/tvm/issues/13364#issuecomment-1313903539
# How does quantization work in general?

Consider a **fused, quantized conv2d** operator, like those in the MLPerf Tiny (or any quantized MobileNetV1 model). In addition to having a kernel and bias, our operator has "three" extra parameters related to quantization (I'm simplifying slightly):

- An **input zero point** (an `int32`)
- One **requantization scale multiplier** per channel (a `float32`)
- An **output zero point** (an `int32` that equals the input zero point for the next layer)
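To make the role of the three quantization parameters concrete, here is a minimal pure-Python sketch of how they typically enter the computation of one output pixel. It is not taken from TVM: the function name `requantize_pixel`, the argument names, the `int8` output range, and the use of Python's built-in `round` (half-to-even, on 64-bit floats rather than `float32`) are all assumptions chosen for illustration.

```python
def requantize_pixel(window, kernels, biases,
                     input_zero_point, scales, output_zero_point):
    """Compute every output channel of one quantized conv2d output pixel.

    window:  flattened int8 input values covered by the kernel
    kernels: one flattened int8 weight list per output channel
    biases:  one int32 bias per output channel
    scales:  one requantization scale multiplier per output channel
    """
    outputs = []
    for weights, bias, scale in zip(kernels, biases, scales):
        # Integer accumulation: remove the input zero point before multiplying.
        acc = bias + sum((x - input_zero_point) * w
                         for x, w in zip(window, weights))
        # Requantize: rescale the accumulator, then shift by the output zero point.
        value = round(acc * scale) + output_zero_point
        # Saturate to int8 (assumed output dtype).
        outputs.append(max(-128, min(127, value)))
    return outputs


# Hypothetical values: a 2x2 window and two output channels.
result = requantize_pixel(
    window=[-128, -100, 27, 127],
    kernels=[[1, 2, -1, 3], [4, 0, 0, -4]],
    biases=[10, -20],
    input_zero_point=-128,
    scales=[0.05, 0.5],
    output_zero_point=-128,
)
print(result)  # [-94, -128]
```

The second channel saturates at `-128`, which shows why the clip step matters. Because the output zero point of this layer is the input zero point of the next, `result` could be passed to a following layer with `input_zero_point=-128`.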
