AriMIR commented on issue #8717:
URL: https://github.com/apache/tvm/issues/8717#issuecomment-907596350


   Hello @Mousius,
   
   Sorry for the large number of questions. I have only just started working
   with the internals of TVM, so some of them may be quite basic.
   
   I cut a depthwise Conv2D with output shape (1, 56, 56, 128) out of the
   quantized mobilenet_v1 frozen graph (mobilenet_v1_1.0_224_quant_frozen.pb),
   converted it to TFLite, and inspected the resulting Relay and TIR.
   
   Here is my Relay for the Conv2D:
   def @main(%MobilenetV1/MobilenetV1/Conv2d_2_pointwise/act_quant/FakeQuantWithMinMaxVars: Tensor[(1, 56, 56, 128), float32], %v_param_2: Tensor[(1, 3, 3, 128), float32], %v_param_3: Tensor[(128), float32]) {
     %0 = qnn.quantize(%v_param_2, 0.0605373f, 160, out_dtype="uint8");
     %1 = qnn.dequantize(%0, 0.0605373f, 160);
     %2 = reshape(%1, newshape=[3, 3, 128, 1]);
     %3 = nn.conv2d(%MobilenetV1/MobilenetV1/Conv2d_2_pointwise/act_quant/FakeQuantWithMinMaxVars, %2, padding=[1, 1, 1, 1], groups=128, channels=128, kernel_size=[3, 3], data_layout="NHWC", kernel_layout="HWOI");
     %4 = nn.bias_add(%3, %v_param_3, axis=3);
     %5 = clip(%4, a_min=0f, a_max=6f);
     %6 = qnn.quantize(%5, 0.0235285f, 0, out_dtype="uint8");
     qnn.dequantize(%6, 0.0235285f, 0)
   }
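
   For reference, each qnn.quantize / qnn.dequantize pair in the listing above
   is a quantize-then-dequantize round trip. A minimal pure-Python sketch of
   that arithmetic follows, assuming the usual affine uint8 scheme and
   round-half-away-from-zero rounding. The helper names (round_half_away,
   quantize, dequantize, fake_quantize) are hypothetical and are not TVM APIs.
   Only the scale and zero point come from the listing.

```python
import math

SCALE = 0.0605373   # scale from the qnn.quantize call on %v_param_2
ZERO_POINT = 160    # zero point from the same call


def round_half_away(value: float) -> int:
    """Round to the nearest integer, ties away from zero (assumed rounding mode)."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantize(x: float, scale: float, zero_point: int) -> int:
    """Affine quantization to uint8: clamp(round(x / scale) + zero_point, 0, 255)."""
    q = round_half_away(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Inverse mapping back to float: scale * (q - zero_point)."""
    return scale * (q - zero_point)


def fake_quantize(x: float, scale: float, zero_point: int) -> float:
    """Quantize and immediately dequantize, as %0 and %1 do for the weights."""
    return dequantize(quantize(x, scale, zero_point), scale, zero_point)


if __name__ == "__main__":
    for value in (0.0, 1.0, -100.0, 100.0):
        print(value, quantize(value, SCALE, ZERO_POINT),
              fake_quantize(value, SCALE, ZERO_POINT))
```

   In-range values come back within half a quantization step of the input,
   while out-of-range values saturate at 0 or 255 before being dequantized.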
   
   But it differs from the Relay in your bug report.
   I then checked the Relay for the whole MobileNet (imported from
   mobilenet_v1_1.0_224_quant_frozen.pb and also from
   mobilenet_v1_1.0_224_quant.tflite), but I could not find anything like
   your Relay in it (I searched for the strings "int16" and
   "fixed_point_multiply").
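
   The search described above can be scripted. Here is a minimal sketch,
   assuming the Relay has already been dumped to a plain string (how that
   string is obtained is left out). find_markers is a hypothetical helper
   name, and the "qnn." marker is added only to show which quantizer ops
   are present.

```python
def find_markers(relay_text, markers=("int16", "fixed_point_multiply", "qnn.")):
    """Return {marker: [1-based line numbers]} for every marker in relay_text."""
    hits = {marker: [] for marker in markers}
    for line_no, line in enumerate(relay_text.splitlines(), start=1):
        for marker in markers:
            if marker in line:
                hits[marker].append(line_no)
    return hits


# Two lines copied from the Relay listing in this comment.
sample = (
    '%0 = qnn.quantize(%v_param_2, 0.0605373f, 160, out_dtype="uint8");\n'
    "%5 = clip(%4, a_min=0f, a_max=6f);\n"
)

if __name__ == "__main__":
    print(find_markers(sample))
```

   An empty list for a marker means that the string does not occur anywhere
   in the dumped text.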
   
   Another difference is that we are using the TFLite quantizer (hence the
   "qnn.quantize" and "qnn.dequantize" ops in the Relay on our side).
   I assume you are using a different quantizer, because there are no "qnn"
   lines in your Relay.
   
   My questions on the above are:
   1. It seems that you have a different version of TVM than mine. At which
      commit did you get this error?
   2. Which quantizer did you use?
   3. Why do our Relay outputs differ?
   
   The next part of my questions concerns the large allocations:
   1. Where in the code do you inspect the TIR? (The TIR primfn on my side
      also differs, not entirely, but in some parts.)
   2. Could you please give me an example where the op fusion is correct?
   
   Thank you!
   
   Best regards,
   Arina Naumova,
   Software developer, Grovety
   

