guberti commented on issue #13364:
URL: https://github.com/apache/tvm/issues/13364#issuecomment-1313903539

   # How does quantization work in general?
   
   Consider a **fused, quantized conv2d** operator, like those in MLPerf 
Tiny (or any quantized MobileNetV1 model). In addition to a kernel and 
bias, our operator has three extra parameters related to quantization (I'm 
simplifying slightly):
   
   - An **input zero point** (an `int32`)
   - One **requantization scale multiplier** per channel (a `float32`)
   - An **output zero point** (an `int32` that equals the input zero point for 
the next layer)
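
To make the role of these three parameters concrete, here is a minimal sketch in plain Python. It is not taken from TVM. It assumes the common scheme where the input zero point is subtracted before the multiply-accumulate, the `int32` accumulator is multiplied by the per-output-channel scale, and the output zero point is added before saturating to `int8`. The function names, the 1x1 single-pixel shape, and the `int8` output range are all assumptions made for illustration.

```python
def requantize(acc, scale, output_zero_point, qmin=-128, qmax=127):
    """Map an int32 accumulator to the (assumed) int8 output range."""
    # Python's round() uses round-half-to-even; real kernels may round differently.
    q = round(acc * scale) + output_zero_point
    return max(qmin, min(qmax, q))


def quantized_conv2d_1x1(pixel, kernel, bias, input_zero_point,
                         scales, output_zero_point):
    """Hypothetical 1x1 quantized conv2d applied to a single pixel.

    pixel:  int8 input values, one per input channel
    kernel: kernel[out_channel][in_channel], int8 weights
    bias:   one int32 bias per output channel
    scales: one float requantization multiplier per output channel
    """
    outputs = []
    for oc, weights in enumerate(kernel):
        acc = bias[oc]  # int32 accumulator starts at the bias
        for x, w in zip(pixel, weights):
            # Remove the input zero point before multiplying by the weight.
            acc += (x - input_zero_point) * w
        outputs.append(requantize(acc, scales[oc], output_zero_point))
    return outputs


result = quantized_conv2d_1x1(
    pixel=[10, -3],
    kernel=[[1, 2], [-1, 3]],
    bias=[100, -50],
    input_zero_point=-128,
    scales=[0.01, 0.02],
    output_zero_point=-128,
)
print(result)  # [-123, -124]
```

Because the output zero point of one layer equals the input zero point of the next, the `-128` added at the end here would be the value subtracted at the start of the following layer.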


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.