[GitHub] [incubator-tvm] u99127 commented on a change in pull request #5848: [TFLite] QNN support for TFLite 2.1.0 quantized models

GitBox Wed, 24 Jun 2020 12:31:25 -0700


u99127 commented on a change in pull request #5848:
URL: https://github.com/apache/incubator-tvm/pull/5848#discussion_r445098409




##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -2566,17 +2613,27 @@ def convert_quantize(self, op):
         input_tensors = self.get_input_tensors(op)
         assert len(input_tensors) == 1, "input tensors length should be 1"
         input_tensor = input_tensors[0]
+        input_tensor_type_str = self.get_tensor_type_str(input_tensor.tensor.Type())
         in_expr = self.get_expr(input_tensor.tensor_idx)
 
         output_tensors = self.get_output_tensors(op)
         assert len(output_tensors) == 1, "output tensors length should be 1"
         output_tensor = output_tensors[0]
+        output_tensor_type_str = self.get_tensor_type_str(output_tensor.tensor.Type())
 
         # The output must be quantized
         assert output_tensor.qnn_params
-        # Quantize the input
-        out = self.quantize(in_expr, output_tensor)
 
+        # TFLite Quantize op can also act as Requantize op
+        if input_tensor_type_str == "float32":
+            out = self.quantize(in_expr, output_tensor)
+        else:
+            out = _qnn.op.requantize(in_expr,
+                                     input_scale=input_tensor.qnn_params['scale'],
+                                     input_zero_point=input_tensor.qnn_params['zero_point'],
+                                     output_scale=output_tensor.qnn_params['scale'],
+                                     output_zero_point=output_tensor.qnn_params['zero_point'],
+                                     out_dtype=output_tensor_type_str)
         return out

Review comment:
       This looks to me like it could go in on its own as a separate PR, 
but it needs a unit test change in tflite/test_forward.py.
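
For reference when writing such a test, the arithmetic that a requantize step is expected to perform can be sketched in plain Python. This is only an illustration of the standard affine quantization formula, real = scale * (q - zero_point); it is not TVM's `_qnn.op.requantize`, which may round differently. The function name `requantize_reference` and the default int8 clipping range are assumptions made for this sketch.

```python
def requantize_reference(q_in, in_scale, in_zero_point,
                         out_scale, out_zero_point,
                         qmin=-128, qmax=127):
    """Map integers from one affine quantization to another.

    Hypothetical reference helper: dequantize with the input params,
    quantize again with the output params, then clip to [qmin, qmax].
    Python's round() rounds halves to even, which can differ from the
    rounding used by a real kernel.
    """
    out = []
    for q in q_in:
        real = in_scale * (q - in_zero_point)            # dequantize
        q_new = round(real / out_scale) + out_zero_point  # quantize
        out.append(max(qmin, min(qmax, q_new)))           # saturate
    return out


# uint8 values with zero point 128 mapped onto int8 with zero point 0
print(requantize_reference([0, 128, 255], 0.5, 128, 0.5, 0))
```

A test along these lines could compare the converted model's output against such a reference, allowing an off-by-one tolerance for rounding differences.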

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and zero
+                # points are scalar. For per-axis, scale and zero points are tensors. But as per
+                # TFLite quantization spec, the restrictions on ops suggest that for per-axis, even
+                # if zero point is a tensor - all the zero points are identical.  More information
+                # here - https://www.tensorflow.org/lite/performance/quantization_spec
+
+                tflite_scale = tflite_qnn_params.ScaleAsNumpy()
+                tflite_zero_point = tflite_qnn_params.ZeroPointAsNumpy()
+                is_qnn_params_valid = True
+
+                # Handle Per-axis and per-tensor cases
+                if isinstance(tflite_scale, np.ndarray):
+                    assert isinstance(tflite_zero_point, np.ndarray)
+
+                    # Tensor - Per-axis quantization
+                    if tflite_scale.shape != (1,) and tflite_zero_point.shape != (1,):
+                        scale = tflite_scale
+                        # Ensure that all zero points are identical
+                        zero_point = tflite_zero_point
+                        assert all(x == zero_point[0] for x in zero_point)

Review comment:
       Minor nit: can we raise an error here instead of using an assert, so 
that it is clear what has changed? That would also let us provide a sensible 
diagnostic.
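
A minimal sketch of what such a check could look like, assuming a standalone helper. The name `check_uniform_zero_point`, its `tensor_name` parameter, and the use of `ValueError` are illustrative choices only; the frontend would more likely raise one of the `tvm.error` exception classes.

```python
def check_uniform_zero_point(zero_points, tensor_name="<unknown>"):
    """Return the single shared zero point, or raise with a diagnostic.

    Hypothetical helper: accepts any iterable of integer-like values
    (a Python list or a 1-D NumPy array both work).
    """
    values = [int(z) for z in zero_points]
    if not values:
        raise ValueError(
            "Tensor {!r} has an empty zero point array".format(tensor_name))
    distinct = sorted(set(values))
    if len(distinct) != 1:
        raise ValueError(
            "Per-axis quantized tensor {!r} has differing zero points {}; "
            "only identical zero points are supported".format(
                tensor_name, distinct))
    return distinct[0]


print(check_uniform_zero_point([0, 0, 0], "conv_weights"))
```

Unlike the bare `assert`, this still fires when Python runs with `-O`, and the message names the offending tensor and values.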

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and zero
+                # points are scalar. For per-axis, scale and zero points are tensors. But as per
+                # TFLite quantization spec, the restrictions on ops suggest that for per-axis, even
+                # if zero point is a tensor - all the zero points are identical.  More information
+                # here - https://www.tensorflow.org/lite/performance/quantization_spec

Review comment:
       To be clear, we are inferring this from the fact that Conv2d and 
Depthwise_conv2d list a zero_point of 0 in their restrictions even though 
they use per-axis quantization.
   
   I would make the comment more explicit.
   
   For per-axis (per-channel) quantization, the scale and zero points for the 
weights are tensors (?)
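
To make the per-axis case concrete, here is a small sketch of dequantizing weights that carry one scale per output channel and a single shared zero point. It assumes the quantized dimension is the first axis and represents the weights as nested Python lists; the function name `dequantize_per_axis` is hypothetical and nothing here is taken from the PR.

```python
def dequantize_per_axis(weights, scales, zero_point=0):
    """Dequantize weights with one scale per channel on the first axis.

    Hypothetical helper: `weights` is a list of rows (one row per
    channel), `scales` holds one float per row, and `zero_point` is the
    single value shared by every channel.
    """
    if len(weights) != len(scales):
        raise ValueError(
            "expected one scale per channel, got {} scales for {} channels"
            .format(len(scales), len(weights)))
    return [[s * (q - zero_point) for q in row]
            for row, s in zip(weights, scales)]


# Two channels holding the same integers but different scales
print(dequantize_per_axis([[2, -4], [2, -4]], [0.5, 0.25]))
```

With identical zero points, only the scale needs to be handled as a tensor, which is the simplification the code comment relies on.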
   
   
   
   

##########
File path: python/tvm/relay/frontend/tflite.py
##########
@@ -262,21 +298,25 @@ def get_tensor_value(self, tensor_wrapper):
         except ImportError:
             raise ImportError("The tflite package must be installed")
 
+        data = tensor_wrapper.buffer.DataAsNumpy()
+        shape = tensor_wrapper.tensor.ShapeAsNumpy()
+
+        # Set shape to 1 if the data is a scalar type
+        if data.shape == (1,) and isinstance(shape, int) and shape == 0:

Review comment:
       I'm scratching my head at this condition on shape. Can you elaborate 
on why we need it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
