[GitHub] [tvm] guberti commented on a diff in pull request #13242: [microTVM] [WIP] Modernize Arm Cortex-M convolution schedules

GitBox Wed, 23 Nov 2022 07:05:52 -0800


guberti commented on code in PR #13242:
URL: https://github.com/apache/tvm/pull/13242#discussion_r1027633465



##########
src/relay/qnn/op/requantize.cc:
##########
@@ -498,8 +498,8 @@ bool RequantizeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
     axis_shape = Integer(1);
   }
   // Check and assign types for scale and zero points.
-  AssignType(types[1], DataType::Float(32), axis_shape, reporter);  // input_scale
-  AssignType(types[2], DataType::Int(32), axis_shape, reporter);    // input_zero_pt
+  // AssignType(types[1], DataType::Float(32), axis_shape, reporter);  // input_scale

Review Comment:
   Fixed - this PR should not change `requantize.cc`.
   
   The underlying issue is a bit tricky, though. In `qnn_alter_op.py`, I want to manually choose an `int32` requantize scale to improve performance, but Relay's `requantize` op only allows the output scale to be a `float32`.
   
   I get around this by storing the scale data as a `float32` array with the correct bytes, and reading it back as an `int32` array. I've added a comment to `qnn_alter_op.py` to better explain what happens here. This is pretty gross.
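
For illustration, the byte-reinterpretation workaround described above can be sketched with NumPy. This is a minimal sketch and not the code in `qnn_alter_op.py`: the scale values and the names `int_scales`, `as_float`, and `recovered` are made up for the example.

```python
import numpy as np

# Hypothetical hand-picked fixed-point scales (illustrative values only).
int_scales = np.array([1073741824, 1518500250, -7], dtype=np.int32)

# Reinterpret the same 4-byte words as float32 so the buffer fits a
# float32-typed slot. No numeric conversion happens; the float values
# themselves are meaningless (some may even be NaN).
as_float = int_scales.view(np.float32)

# Reading the buffer back as int32 recovers the original integers bit for bit.
recovered = as_float.view(np.int32)

assert as_float.dtype == np.float32
assert np.array_equal(recovered, int_scales)
```

The key point is that `view` changes only the dtype label on the buffer, so nothing is rounded, which is also why the intermediate `float32` values should never be used arithmetically.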
   
   Longer term, I'd love to add a new Relay op `IntegerRequantize` that takes `int32` scale and shift arguments, which would let us solve this problem in a nice way. Would love your thoughts on the right way to address this!
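
As a rough sketch of the arithmetic such an op might perform: `IntegerRequantize` does not exist in Relay, and the function name, the round-half-up behavior, and the 64-bit intermediate below are all assumptions made for illustration, not a proposed specification.

```python
import numpy as np

def integer_requantize(acc, scale, shift):
    """Hypothetical helper: computes (acc * scale) >> shift with rounding.

    acc   -- int32 accumulator array
    scale -- integer multiplier (assumed to fit in int32)
    shift -- non-negative right-shift amount
    """
    # Widen to int64 so the int32 x int32 product cannot overflow.
    acc64 = acc.astype(np.int64)
    # Add half of the divisor before shifting to round half up (an assumption).
    rounding = (1 << (shift - 1)) if shift > 0 else 0
    return ((acc64 * scale + rounding) >> shift).astype(np.int32)

# Example: a scale of 2**30 with a shift of 31 multiplies by 0.5.
result = integer_requantize(np.array([100, -100, 3], dtype=np.int32), 1 << 30, 31)
```

Because everything stays in integers, no `float32` scale is needed, which is the property the workaround above is trying to emulate.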



