FrozenGene edited a comment on issue #4828: [QNN][TFLite] TFLite rounding mode support
URL: https://github.com/apache/incubator-tvm/pull/4828#issuecomment-583691425

> just modified add/mul/concat requantize rounding mode and tested, no luck. will change the default rounding behavior for a later test.
>
> update: I forced FixedPointMultiply(PerChannel) rounding mode to be TFLITE, but still unable to get bit-exact results.
>
> one more thing, setting tflite default rounding mode to TFLITE seems to break GPU test of mobilenet_v2, maybe you guys have any ideas/suggestions?
>
> @FrozenGene @anijain2305

Let us break the model down layer by layer and compare each layer with TFLite. I will describe my development workflow; maybe it can help you. For example, for the quantized MobileNetV2 model, you can get both the quantized TensorFlow model and the TFLite model. Then you can call `tflite_convert` (feeding it the quantized TensorFlow model) and set the output layer (for example, just the first convolution layer). This gives you a quantized TFLite model that contains only the first convolution layer of MobileNetV2. After you verify that it is correct, you can move on to the next layer, and continue until the whole model is correct end to end.

Command example:

`tflite_convert --input_format=TENSORFLOW_GRAPHDEF --graph_def_file="xx.pb" --output_file=xx.tflite --output_format=TFLITE --input_arrays=input --input_shapes=1,224,224,3 --std_dev_values=127 --mean_values=127 --inference_type=QUANTIZED_UINT8 --inference_input_type=QUANTIZED_UINT8 --default_ranges_min=0 --default_ranges_max=6 --output_arrays=xx`

I think that when you verify, you should first run on CPU locally to find the issue, and only then consider the GPU CI issue.
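For the comparison step itself, a small helper along these lines may be useful. This is only a sketch: it assumes you have already run each truncated model through both TFLite and TVM and collected the outputs as NumPy arrays. The names `compare_quantized_outputs` and `first_mismatching_layer` are made up for illustration and are not part of TVM or TFLite.

```python
import numpy as np


def compare_quantized_outputs(reference, candidate):
    """Summarize how far two quantized tensors are from being bit-exact."""
    ref = np.asarray(reference)
    cand = np.asarray(candidate)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    # Widen to int32 first so that uint8 subtraction cannot wrap around.
    diff = np.abs(ref.astype(np.int32) - cand.astype(np.int32))
    mismatches = int(np.count_nonzero(diff))
    return {
        "bit_exact": mismatches == 0,
        "mismatches": mismatches,
        "total": int(diff.size),
        "max_abs_diff": int(diff.max()) if diff.size else 0,
    }


def first_mismatching_layer(outputs_by_layer):
    """Return (name, report) for the first layer that is not bit-exact.

    outputs_by_layer is an iterable of (layer_name, reference, candidate)
    tuples in model order (hypothetical layout). Returns None if every
    layer matches.
    """
    for name, ref, cand in outputs_by_layer:
        report = compare_quantized_outputs(ref, cand)
        if not report["bit_exact"]:
            return name, report
    return None
```

Feeding it the per-layer outputs in model order points at the first layer where the two runtimes diverge, which is where a rounding difference would most likely show up as off-by-one values.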
