mbrookhart commented on issue #12707:
URL: https://github.com/apache/tvm/issues/12707#issuecomment-1239618078
Where did you add the code?
Something is going wrong with the pass because your scales and zero points
are expressions rather than constants. If I run FoldConstant before
FakeQuantizationToInteger (FQ2I), it works:
```
import tvm
from tvm import relay
mod = tvm.parser.parse_expr(
'''
fn (%x1: Tensor[(1, 3, 224, 224), float32], %x2: Tensor[(1, 3, 224, 224), float32]){
%0 = power(2f, 1f);
%1 = divide(1f, %0);
%2 = qnn.quantize(%x1, %1, 0, out_dtype="int8");
%3 = clip(%2, a_min=-127f, a_max=127f);
%4 = power(2f, 1f);
%5 = divide(1f, %4);
%6 = qnn.quantize(%x2, %5, 0, out_dtype="int8");
%7 = clip(%6, a_min=-127f, a_max=127f);
%8 = qnn.dequantize(%3, %1, 0);
%9 = qnn.dequantize(%7, %5, 0);
%10 = (%8, %9);
%11 = power(2f, 1f);
%12 = concatenate(%10, axis=1);
%13 = divide(1f, %11);
%14 = qnn.quantize(%12, %13, 0, out_dtype="int8");
%15 = clip(%14, a_min=-127f, a_max=127f);
qnn.dequantize(%15, %13, 0)
}
'''
)
mod = tvm.IRModule.from_expr(mod)
mod = relay.transform.InferType()(mod)
mod = relay.transform.FoldConstant()(mod)
mod = tvm.relay.transform.FakeQuantizationToInteger(False, True)(mod)
print(mod)
```
```
def @main(%x1: Tensor[(1, 3, 224, 224), float32] /* ty=Tensor[(1, 3, 224, 224), float32] span=string:5:19 */, %x2: Tensor[(1, 3, 224, 224), float32] /* ty=Tensor[(1, 3, 224, 224), float32] span=string:9:19 */) -> Tensor[(1, 6, 224, 224), float32] {
  %0 = qnn.quantize(%x1, 0.5f /* ty=float32 */, 0 /* ty=int32 span=string:5:29 */, out_dtype="int8") /* ty=Tensor[(1, 3, 224, 224), int8] span=string:6:11 */;
  %1 = qnn.quantize(%x2, 0.5f /* ty=float32 */, 0 /* ty=int32 span=string:9:29 */, out_dtype="int8") /* ty=Tensor[(1, 3, 224, 224), int8] span=string:10:11 */;
  %2 = clip(%0, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 3, 224, 224), int8] span=string:11:21 */;
  %3 = clip(%1, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 3, 224, 224), int8] span=string:12:21 */;
  %4 = (%2, %3) /* ty=(Tensor[(1, 3, 224, 224), int8], Tensor[(1, 3, 224, 224), int8]) span=string:15:19 */;
  %5 = (0.5f /* ty=float32 */, 0.5f /* ty=float32 */) /* ty=(float32, float32) */;
  %6 = (0 /* ty=int32 span=string:11:30 */, 0 /* ty=int32 span=string:12:30 */) /* ty=(int32, int32) */;
  %7 = qnn.concatenate(%4, %5, %6, 0.5f /* ty=float32 */, 0 /* ty=int32 span=string:17:31 */, axis=1) /* ty=Tensor[(1, 6, 224, 224), int8] */;
  %8 = clip(%7, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 6, 224, 224), int8] span=string:19:16 */;
  qnn.dequantize(%8, 0.5f /* ty=float32 */, 0 /* ty=int32 span=string:19:27 */) /* ty=Tensor[(1, 6, 224, 224), float32] span=string:3:1 */
}
```
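For reference, here is a rough plain-Python sketch of what one quantize/clip/dequantize chain in the output above computes per element with scale 0.5 and zero point 0. It does not call TVM. The helper names (`round_half_away`, `fake_quantize`) are made up for illustration, and the rounding mode (half away from zero) and int8 saturation are assumptions about how `qnn.quantize` lowers, not something checked against the TVM source here.

```
import math


def round_half_away(v):
    # Assumed rounding mode: ties go away from zero (unlike Python's round()).
    return math.copysign(math.floor(abs(v) + 0.5), v)


def fake_quantize(x, scale=0.5, zero_point=0,
                  qmin=-128, qmax=127, clip_min=-127, clip_max=127):
    """Return (int8 value, dequantized float) for a single element x."""
    # qnn.quantize: scale, round, shift by the zero point, saturate to int8.
    q = int(round_half_away(x / scale)) + zero_point
    q = max(qmin, min(qmax, q))
    # The explicit clip(a_min=-127, a_max=127) from the example graph.
    q = max(clip_min, min(clip_max, q))
    # qnn.dequantize: undo the shift and the scale.
    return q, (q - zero_point) * scale


print(fake_quantize(0.3))     # small value, representable
print(fake_quantize(100.0))   # saturates at the upper bound
print(fake_quantize(-100.0))  # saturates at -127 because of the clip
```

With a scale of 0.5 the representable range after the clip is roughly -63.5 to 63.5, so inputs outside that range saturate.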
I'll try to figure out why the pass fails when the scale and zero point are
Exprs rather than constants.
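To illustrate why folding first helps, here is a toy constant folder over a tiny expression tree. This is only a sketch of the idea, not TVM's FoldConstant: the `Const`, `Var`, `Call` classes and the `fold` function are hypothetical stand-ins for Relay nodes, and only `power` and `divide` are treated as foldable.

```
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Const, Var, Call]

# Ops this toy folder knows how to evaluate at compile time.
FOLDABLE = {
    "power": lambda a, b: a ** b,
    "divide": lambda a, b: a / b,
}


def fold(expr):
    """Fold bottom-up: a call whose arguments are all constants becomes a constant."""
    if isinstance(expr, Call):
        args = tuple(fold(a) for a in expr.args)
        if expr.op in FOLDABLE and all(isinstance(a, Const) for a in args):
            return Const(FOLDABLE[expr.op](*(a.value for a in args)))
        return Call(expr.op, args)
    return expr


# Mirrors %0..%2 of the example: scale = divide(1f, power(2f, 1f)).
scale = Call("divide", (Const(1.0), Call("power", (Const(2.0), Const(1.0)))))
quant = Call("qnn.quantize", (Var("x1"), scale, Const(0.0)))

folded = fold(quant)
print(folded)
```

After folding, the scale argument of the quantize call is a plain constant 0.5, which matches the `0.5f` scales in the printed module above.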