giuseros commented on a change in pull request #6445:
URL: https://github.com/apache/incubator-tvm/pull/6445#discussion_r500281070
##########
File path: python/tvm/topi/arm_cpu/tensor_intrin.py
##########
@@ -445,7 +443,7 @@ def gemm_quantized(M, N, K, unroll, interleave, in_type,
out_type):
)
c_buffer = tvm.tir.decl_buffer(
- C.shape, dtype=out_type, name="c_buffer", offset_factor=1,
strides=[te.var("sc"), 1]
+ C.shape, dtype="int32", name="c_buffer", offset_factor=1,
strides=[te.var("sc"), 1]
Review comment:
Hi @FrozenGene , the problem is the following: in quantized `conv2d`, we
do `conv2d` and then `requantization` (these are two different Relay
operators). `conv2d` goes from `int8->int32`, and requantization goes from
`int32->int8`. So in theory this would work with `out_type`.
However, in some tests (pre-existing to my changes, which I ran internally) I
noticed that they set the (`conv2d`) `out_type` to `int8` (or `uint8`). In this
case the intrinsic still needs to produce an `int32` value, and the cast to
`int8` (or `uint8`) needs to happen at a later stage.
This change is basically saying: no matter the `out_type`, the intrinsic will
produce an `int32` result. If we want the output to be `int8` (which would be
wrong, but some tests do it to simplify the testing) the conversion needs to
happen later.
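
To illustrate why the accumulator has to be `int32` regardless of `out_type`, here is a minimal plain-Python sketch (no TVM, just simulated two's-complement `int8` wrap-around): even a single product of two `int8` values can exceed the `int8` range, so accumulating in `int8` silently wraps, while an `int32` accumulator holds the correct value that requantization later narrows back down.

```python
def int8_wrap(x):
    """Simulate int8 two's-complement wrap-around."""
    return (x + 128) % 256 - 128

# Hypothetical int8 data and weights for one dot product.
a = [127, 127, 127]
b = [127, 127, 127]

# int32 accumulation: each product (127 * 127 = 16129) and the
# running sum fit comfortably in int32.
acc32 = sum(x * y for x, y in zip(a, b))

# Naive int8 accumulation: every product already overflows int8
# and wraps, so the result is garbage.
acc8 = 0
for x, y in zip(a, b):
    acc8 = int8_wrap(acc8 + int8_wrap(x * y))

print(acc32)  # 48387 -- the correct value
print(acc8)   # 3 -- wrapped, demonstrating the overflow
```

This is why the narrowing cast to `int8`/`uint8` must happen after the intrinsic, e.g. in the requantization step.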
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]