u99127 commented on a change in pull request #5848:
URL: https://github.com/apache/incubator-tvm/pull/5848#discussion_r445098409

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -2566,17 +2613,27 @@ def convert_quantize(self, op):
         input_tensors = self.get_input_tensors(op)
         assert len(input_tensors) == 1, "input tensors length should be 1"
         input_tensor = input_tensors[0]
+        input_tensor_type_str = self.get_tensor_type_str(input_tensor.tensor.Type())
         in_expr = self.get_expr(input_tensor.tensor_idx)
         output_tensors = self.get_output_tensors(op)
         assert len(output_tensors) == 1, "output tensors length should be 1"
         output_tensor = output_tensors[0]
+        output_tensor_type_str = self.get_tensor_type_str(output_tensor.tensor.Type())
         # The output must be quantized
         assert output_tensor.qnn_params
-        # Quantize the input
-        out = self.quantize(in_expr, output_tensor)
+        # TFLite Quantize op can also act as Requantize op
+        if input_tensor_type_str == "float32":
+            out = self.quantize(in_expr, output_tensor)
+        else:
+            out = _qnn.op.requantize(in_expr,
+                                     input_scale=input_tensor.qnn_params['scale'],
+                                     input_zero_point=input_tensor.qnn_params['zero_point'],
+                                     output_scale=output_tensor.qnn_params['scale'],
+                                     output_zero_point=output_tensor.qnn_params['zero_point'],
+                                     out_dtype=output_tensor_type_str)
         return out

Review comment:
To me this looks like it could go in, in its own right, as a separate PR, but it needs a unit test change in tflite/test_forward.py.

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and zero
+                # points are scalar. For per-axis, scale and zero points are tensors. But as per
+                # TFLite quantization spec, the restrictions on ops suggest that for per-axis, even
+                # if zero point is a tensor - all the zero points are identical. More infomration
+                # here - https://www.tensorflow.org/lite/performance/quantization_spec
+
+                tflite_scale = tflite_qnn_params.ScaleAsNumpy()
+                tflite_zero_point = tflite_qnn_params.ZeroPointAsNumpy()
+                is_qnn_params_valid = True
+
+                # Handle Per-axis and per-tensor cases
+                if isinstance(tflite_scale, np.ndarray):
+                    assert isinstance(tflite_zero_point, np.ndarray)
+
+                    # Tensor - Per-axis quantization
+                    if tflite_scale.shape != (1,) and tflite_zero_point.shape != (1,):
+                        scale = tflite_scale
+                        # Ensure that all zero points are identical
+                        zero_point = tflite_zero_point
+                        assert all(x == zero_point[0] for x in zero_point)

Review comment:
Minor nit: can we raise an error here instead of using an assert, so that it shows us clearly what has changed? It also means we can provide a sensible diagnostic.

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and zero
+                # points are scalar. For per-axis, scale and zero points are tensors. But as per
+                # TFLite quantization spec, the restrictions on ops suggest that for per-axis, even
+                # if zero point is a tensor - all the zero points are identical. More infomration
+                # here - https://www.tensorflow.org/lite/performance/quantization_spec

Review comment:
To be clear, we are interpreting this from the fact that Conv2d and Depthwise_conv2d have a zero_point of 0 listed in their restrictions even though they have per-axis quantization. I would make the comment more explicit. For per-axis or per-channel quantization, the scale and zero points for the weights are tensors (?)

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -262,21 +298,25 @@ def get_tensor_value(self, tensor_wrapper):
         except ImportError:
             raise ImportError("The tflite package must be installed")
+        data = tensor_wrapper.buffer.DataAsNumpy()
+        shape = tensor_wrapper.tensor.ShapeAsNumpy()
+
+        # Set shape to 1 if the data is a scalar type
+        if data.shape == (1,) and isinstance(shape, int) and shape == 0:

Review comment:
I'm scratching my head at this condition on shape. Can you elaborate on why we need it?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
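
For reference on the first comment (the Quantize op acting as a Requantize op), a unit test would need expected values for the non-float32 branch. The following is a minimal floating-point sketch of requantization arithmetic in NumPy. The function name `requantize_reference` and the sample scales and zero points are hypothetical, and this is not how `_qnn.op.requantize` is implemented in TVM; it only illustrates the mapping the op is expected to approximate.

```python
import numpy as np


def requantize_reference(q_in, input_scale, input_zero_point,
                         output_scale, output_zero_point, out_dtype="int8"):
    """Floating-point reference for requantization (hypothetical helper)."""
    info = np.iinfo(out_dtype)
    # Recover the real values represented by the quantized input.
    real = (q_in.astype(np.float64) - input_zero_point) * input_scale
    # Re-express those real values with the output scale and zero point.
    q_out = np.round(real / output_scale) + output_zero_point
    # Saturate to the representable range of the output dtype.
    return np.clip(q_out, info.min, info.max).astype(out_dtype)


# A uint8 tensor with zero point 128, re-expressed as int8 with zero point 0.
q_uint8 = np.array([0, 128, 255], dtype="uint8")
q_int8 = requantize_reference(q_uint8, 0.5, 128, 0.5, 0, out_dtype="int8")
print(q_int8)
```

With equal input and output scales, the example reduces to shifting values by the difference in zero points, which makes it easy to check by hand.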
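
For the nit about replacing the assert on per-axis zero points, one possible shape for the check is sketched below. The helper name `uniform_zero_point` is hypothetical, `ValueError` stands in for whichever error class the frontend prefers, and returning a Python int is an assumption about what the surrounding code wants; none of this is taken from the PR.

```python
import numpy as np


def uniform_zero_point(tflite_zero_point):
    """Return the zero point shared by all channels, or raise with a diagnostic.

    Hypothetical helper: the name and the use of ValueError are assumptions.
    """
    zero_point = np.asarray(tflite_zero_point).reshape(-1)
    if zero_point.size == 0:
        raise ValueError("Quantization parameters contain no zero points")
    if not np.all(zero_point == zero_point[0]):
        # Unlike a bare assert, the message reports the offending values.
        raise ValueError(
            "Per-axis quantization with differing zero points is not "
            "supported; got zero points {}".format(zero_point.tolist()))
    return int(zero_point[0])


print(uniform_zero_point(np.array([0, 0, 0])))
```

An explicit exception also survives `python -O`, which strips assert statements.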