AriMIR commented on issue #8717:
URL: https://github.com/apache/tvm/issues/8717#issuecomment-907596350
Hello @Mousius,
Sorry for the large number of questions. I have only just started working with
the internals of TVM, so some of these questions may be quite basic.
I cut a depthwise Conv2D with output shape (1, 56, 56, 128) out of the
quantized MobileNet V1 frozen graph (mobilenet_v1_1.0_224_quant_frozen.pb),
converted it to TFLite, and inspected the Relay and the TIR.
Here is my Relay for the Conv2D:
def @main(%MobilenetV1/MobilenetV1/Conv2d_2_pointwise/act_quant/FakeQuantWithMinMaxVars: Tensor[(1, 56, 56, 128), float32], %v_param_2: Tensor[(1, 3, 3, 128), float32], %v_param_3: Tensor[(128), float32]) {
  %0 = qnn.quantize(%v_param_2, 0.0605373f, 160, out_dtype="uint8");
  %1 = qnn.dequantize(%0, 0.0605373f, 160);
  %2 = reshape(%1, newshape=[3, 3, 128, 1]);
  %3 = nn.conv2d(%MobilenetV1/MobilenetV1/Conv2d_2_pointwise/act_quant/FakeQuantWithMinMaxVars, %2, padding=[1, 1, 1, 1], groups=128, channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI");
  %4 = nn.bias_add(%3, %v_param_3, axis=3);
  %5 = clip(%4, a_min=0f, a_max=6f);
  %6 = qnn.quantize(%5, 0.0235285f, 0, out_dtype="uint8");
  qnn.dequantize(%6, 0.0235285f, 0)
}
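For my own understanding I worked through the affine quantization that the
qnn.quantize / qnn.dequantize ops above perform. A minimal sketch, using the
scale and zero point from %0 / %1 (the exact rounding mode in TFLite and TVM
may differ from Python's):

```python
# Affine uint8 quantization as in qnn.quantize(%v_param_2, 0.0605373f, 160)
# and qnn.dequantize(%0, 0.0605373f, 160) in the Relay above.
SCALE, ZERO_POINT = 0.0605373, 160

def quantize(x: float) -> int:
    # q = clamp(round(x / scale) + zero_point, 0, 255)
    # Note: Python's round() is round-half-to-even; TFLite's rounding may differ.
    q = round(x / SCALE) + ZERO_POINT
    return max(0, min(255, q))

def dequantize(q: int) -> float:
    return (q - ZERO_POINT) * SCALE

q = quantize(-1.5)   # a hypothetical weight value
x = dequantize(q)    # the value the float conv actually sees
print(q, x)
```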
However, it differs from the Relay in your bug report.
I then checked the Relay for the whole MobileNet (generated both from
mobilenet_v1_1.0_224_quant_frozen.pb and from
mobilenet_v1_1.0_224_quant.tflite), but I could not find Relay like yours
anywhere in it (I searched for "int16" and "fixed_point_multiply").
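For context, as far as I understand, fixed_point_multiply is the
gemmlowp-style requantization step: an int32 accumulator is multiplied by a
Q31 fixed-point multiplier and then shifted. A rough pure-Python sketch of
the idea (the rounding details in TVM's actual implementation may differ):

```python
def fixed_point_multiply(x: int, multiplier: int, shift: int) -> int:
    # Approximates x * (multiplier / 2**31) * 2**shift with round-to-nearest.
    # multiplier is a Q31 fixed-point number, typically in [2**30, 2**31).
    total_shift = 31 - shift
    rounding = 1 << (total_shift - 1)
    return (x * multiplier + rounding) >> total_shift

# Requantize an int32 accumulator by a real-valued scale of ~0.7.
multiplier = round(0.7 * (1 << 31))  # Q31 representation of 0.7
print(fixed_point_multiply(1000, multiplier, 0))
```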
Another difference is that we are using the TFLite quantizer ("qnn.quantize"
and "qnn.dequantize" appear in the Relay on our side), whereas there are no
"qnn" ops in your Relay, so I suspect you used a different quantizer.
My questions about the above are:
1. It seems that you have a different version of TVM than mine; could you
indicate at which commit you got this error?
2. Which quantizer did you use?
3. Why are the Relays different?
The next part of my question concerns the large allocations:
1. Where in the code do you inspect the TIR? (The TIR PrimFunc on my side
also differs, not entirely, but in some parts.)
2. Could you please give an example of when operator fusion is correct?
Thank you!
Best regards,
Arina Naumova,
Software developer, Grovety
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]