guberti commented on code in PR #13242:
URL: https://github.com/apache/tvm/pull/13242#discussion_r1027633465
##########
src/relay/qnn/op/requantize.cc:
##########
@@ -498,8 +498,8 @@ bool RequantizeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
 axis_shape = Integer(1);
 }
 // Check and assign types for scale and zero points.
- AssignType(types[1], DataType::Float(32), axis_shape, reporter); // input_scale
- AssignType(types[2], DataType::Int(32), axis_shape, reporter); // input_zero_pt
+ // AssignType(types[1], DataType::Float(32), axis_shape, reporter); // input_scale
Review Comment:
Fixed - this PR should not change `requantize.cc`.
The underlying issue is a bit tricky, though. In `qnn_alter_op.py`, I want to
manually choose the `int32` requantize scale to improve performance, but
Relay's `requantize` op only allows the output scale to be a `float32`.
I get around this by storing the scale data as a `float32` array with the
correct bytes, and reading it back as an `int32` array. I've added a comment to
`qnn_alter_op.py` to better explain what happens here. This is pretty gross.
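
For illustration, the byte-reinterpretation workaround described above can be sketched with Python's standard `array` module. This is only a minimal sketch of the general technique, not the code in `qnn_alter_op.py`: the helper names `int32_bits_as_float32` and `float32_bits_as_int32` and the sample scale values are hypothetical, and it assumes a platform where a C `int` is 4 bytes.

```python
from array import array


def int32_bits_as_float32(values):
    """Return a float32 array whose raw bytes are the given int32 values.

    Hypothetical helper: the bytes are copied verbatim, so no numeric
    conversion (and no rounding) takes place.
    """
    ints = array("i", values)
    # Assumption: C int is 4 bytes, so it lines up with float32.
    assert ints.itemsize == 4, "this sketch assumes a 4-byte C int"
    floats = array("f")
    floats.frombytes(ints.tobytes())
    return floats


def float32_bits_as_int32(floats):
    """Recover the int32 values by reinterpreting the float32 bytes."""
    ints = array("i")
    ints.frombytes(floats.tobytes())
    return list(ints)


# Made-up example scales; the real values would be chosen by the pass.
scales = [1073741824, 1340029796, -7]
smuggled = int32_bits_as_float32(scales)    # looks like float32 data
recovered = float32_bits_as_int32(smuggled)  # original int32 values
```

The round trip is lossless only because the bytes are never converted numerically. Reading an element of the `float32` array as a number yields an unrelated value (or NaN), which is why the approach is fragile.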
Longer term, I'd love to add a new Relay op `IntegerRequantize` that takes
`int32` scale and shift arguments, which will let us solve this problem in a
nice way. Would love your thoughts on the right way to address this!