giuseros commented on a change in pull request #6445:
URL: https://github.com/apache/incubator-tvm/pull/6445#discussion_r500281070
##########
File path: python/tvm/topi/arm_cpu/tensor_intrin.py
##########
@@ -445,7 +443,7 @@ def gemm_quantized(M, N, K, unroll, interleave, in_type,
out_type):
)
c_buffer = tvm.tir.decl_buffer(
- C.shape, dtype=out_type, name="c_buffer", offset_factor=1,
strides=[te.var("sc"), 1]
+ C.shape, dtype="int32", name="c_buffer", offset_factor=1,
strides=[te.var("sc"), 1]
Review comment:
Hi @FrozenGene , the problem is the following: in quantized `conv2d`, we
do `conv2d` and then `requantization` (these are two different Relay
operators). `conv2d` goes from `int8->int32`, and requantization goes from
`int32->int8`. So in theory this would work with `out_type`.
However, in some tests (pre-existing to my changes, which I ran internally) I
noticed that they set the (`conv2d`) `out_type` to `int8` (or `uint8`). In this
case the intrinsic still needs to produce an `int32` value, and the cast to
`int8` (or `uint8`) needs to happen at a later stage.
This change is basically saying: no matter the `out_type`, the intrinsic will
produce an `int32` result. If we want the output to be `int8` (which would be
wrong, but some tests do it to simplify the testing) the conversion needs to
happen later.
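
To illustrate why the accumulator has to be `int32` regardless of `out_type`, here is a minimal plain-Python sketch (no TVM, just simulated two's-complement `int8` wrap-around): even a single product of two `int8` values can exceed the `int8` range, so accumulating in `int8` silently wraps, while an `int32` accumulator holds the correct value that requantization later narrows back down.

```python
def int8_wrap(x):
    """Simulate int8 two's-complement wrap-around."""
    return (x + 128) % 256 - 128

# Hypothetical int8 data and weights for one dot product.
a = [127, 127, 127]
b = [127, 127, 127]

# int32 accumulation: each product (127 * 127 = 16129) and the
# running sum fit comfortably in int32.
acc32 = sum(x * y for x, y in zip(a, b))

# Naive int8 accumulation: every product already overflows int8
# and wraps, so the result is garbage.
acc8 = 0
for x, y in zip(a, b):
    acc8 = int8_wrap(acc8 + int8_wrap(x * y))

print(acc32)  # 48387 -- the correct value
print(acc8)   # 3 -- wrapped, demonstrating the overflow
```

This is why the narrowing cast to `int8`/`uint8` must happen after the intrinsic, e.g. in the requantization step.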
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]