[GitHub] [incubator-tvm] u99127 commented on a change in pull request #5848: [TFLite] QNN support for TFLite 2.1.0 quantized models

2020-07-01 Thread GitBox


u99127 commented on a change in pull request #5848:
URL: https://github.com/apache/incubator-tvm/pull/5848#discussion_r448548518



##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -651,12 +701,43 @@ def convert_shape(self, op):
 
     def convert_relu(self, op):
         """Convert TFLite ReLU"""
+        try:
+            from tflite.ActivationFunctionType import ActivationFunctionType
+        except ImportError:
+            raise ImportError("The tflite package must be installed")
+

Review comment:
   I think this is unnecessary given the import of ActivationFunctionType 
in the constructor 
[here](https://github.com/apache/incubator-tvm/blob/b979bf6a7630ada055fff2d65e1cd0f8d55bb6a0/python/tvm/relay/frontend/tflite.py#L54)
   

##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -692,6 +773,11 @@ def _hard_swish(data):
 
     def convert_relu6(self, op):
         """Convert TFLite ReLU6"""
+        try:
+            from tflite.ActivationFunctionType import ActivationFunctionType

Review comment:
   Same as relu: I think this is unnecessary given the import of 
ActivationFunctionType in the constructor 
[here](https://github.com/apache/incubator-tvm/blob/b979bf6a7630ada055fff2d65e1cd0f8d55bb6a0/python/tvm/relay/frontend/tflite.py#L54).

##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -263,21 +305,29 @@ def get_tensor_value(self, tensor_wrapper):
         except ImportError:
             raise ImportError("The tflite package must be installed")
 
+        # Read the data from the buffer. Also extract the shape.
+        # The shape is used later to reshape the data.
+        data = tensor_wrapper.buffer.DataAsNumpy()
+        shape = tensor_wrapper.tensor.ShapeAsNumpy()
+
+        # When the TFLite buffer is of size 1 (scalar), the TFLite tensor shape is set to 0.
+        # Therefore, set the shape to (1,) so that the numpy reshape works for scalar data.
+        if data.size == 1 and isinstance(shape, int) and shape == 0:
+            shape = (1,)
+
+        if tensor_wrapper.tensor.Type() == TensorType.INT8:

Review comment:
   Minor nit, and this should really be credited to Dmitriy Smirnov 
(https://github.com/d-smirnov).
   
   The condition here could well be pulled out into a helper function that uses 
a dictionary to map from TensorType to numpy type. That would make the code 
much cleaner and reduce duplication, i.e. something like:
   
       def get_tensor_type_as_numpy(self, tensor_wrapper):
           """Returns np.dtype out of TensorType"""
           assert isinstance(tensor_wrapper, TensorWrapper)
   
           try:
               from tflite.TensorType import TensorType
               return {TensorType.UINT8:   np.uint8,
                       TensorType.INT8:    np.int8,
                       TensorType.FLOAT32: np.float32,
                       TensorType.INT32:   np.int32,
                       TensorType.INT64:   np.int64,
                       TensorType.BOOL:    np.bool_}[tensor_wrapper.tensor.Type()]
           except ImportError:
               raise ImportError("The tflite package must be installed")
           except KeyError:
               raise NotImplementedError("Tensor type '{}' currently not supported"
                                         .format(tensor_wrapper.tensor.Type()))
   
       def get_tensor_value(self, tensor_wrapper):
           """Get tensor buffer value from given tensor wrapper"""
           assert isinstance(tensor_wrapper, TensorWrapper)
   
           value_type = self.get_tensor_type_as_numpy(tensor_wrapper)
           return np.frombuffer(tensor_wrapper.buffer.DataAsNumpy(), dtype=value_type).reshape(
               tensor_wrapper.tensor.ShapeAsNumpy()
               if 0 != tensor_wrapper.tensor.ShapeLength()
               else [])

##
File path: tests/python/frontend/tflite/test_forward.py
##
@@ -787,14 +897,30 @@ def test_forward_convolution():
         _test_convolution([4, 17, 17, 124], [1, 1, 124, 19], [1, 1], [1, 1], 'SAME', 'NHWC', quantized=quantized)
         _test_convolution([4, 17, 17, 12], [3, 3, 12, 32], [1, 1], [2, 2], 'VALID', 'NHWC', quantized=quantized)
 
-    # depthwise convolution
-    _test_convolution([4, 8, 8, 176], [1, 1, 176, 1], [1, 1], [1, 1], 'SAME', 'NHWC', True)
-    _test_convolution([4, 17, 17, 19], [3, 3, 19, 1], [1, 1], [2, 2], 'VALID', 'NHWC', True)
-    _test_convolution([4, 17, 17, 124], [1, 1, 124, 1], [1, 1], [1, 1], 'SAME', 'NHWC', True)
-    _test_convolution([4, 17, 17, 12], [3, 3, 12, 1], [1, 1], [2, 2], 'VALID', 'NHWC', True)
-    _test_convolution([4, 17, 17, 12], [3, 3, 12, 2], [1, 1], [2, 2], 'VALID', 'NHWC', True)
-    # depthwise convolution with single input channel
-    _test_convolution([1, 76, 64, 1], [9, 5, 1, 96], [1,

[GitHub] [incubator-tvm] u99127 commented on a change in pull request #5848: [TFLite] QNN support for TFLite 2.1.0 quantized models

2020-06-24 Thread GitBox


u99127 commented on a change in pull request #5848:
URL: https://github.com/apache/incubator-tvm/pull/5848#discussion_r445098409



##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -2566,17 +2613,27 @@ def convert_quantize(self, op):
         input_tensors = self.get_input_tensors(op)
         assert len(input_tensors) == 1, "input tensors length should be 1"
         input_tensor = input_tensors[0]
+        input_tensor_type_str = self.get_tensor_type_str(input_tensor.tensor.Type())
         in_expr = self.get_expr(input_tensor.tensor_idx)
 
         output_tensors = self.get_output_tensors(op)
         assert len(output_tensors) == 1, "output tensors length should be 1"
         output_tensor = output_tensors[0]
+        output_tensor_type_str = self.get_tensor_type_str(output_tensor.tensor.Type())
 
         # The output must be quantized
         assert output_tensor.qnn_params
-        # Quantize the input
-        out = self.quantize(in_expr, output_tensor)
 
+        # TFLite Quantize op can also act as Requantize op
+        if input_tensor_type_str == "float32":
+            out = self.quantize(in_expr, output_tensor)
+        else:
+            out = _qnn.op.requantize(in_expr,
+                                     input_scale=input_tensor.qnn_params['scale'],
+                                     input_zero_point=input_tensor.qnn_params['zero_point'],
+                                     output_scale=output_tensor.qnn_params['scale'],
+                                     output_zero_point=output_tensor.qnn_params['zero_point'],
+                                     out_dtype=output_tensor_type_str)
         return out

Review comment:
   This looks to me like it could go in as a separate PR in its own right, but 
it needs a unit test change in tflite/test_forward.py.
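   
   Something along these lines might work for the structural part of that test 
(a minimal sketch only; the helper name and structure are illustrative, not the 
existing test_forward.py API):
   
       from tvm import relay
   
       def _assert_quantize_becomes_requantize(tflite_model_buf, shape_dict, dtype_dict):
           # Hypothetical helper, sketched for illustration: check that a QUANTIZE op
           # with a quantized (non-float32) input is lowered to qnn.requantize rather
           # than qnn.quantize.
           try:
               import tflite.Model
               tflite_model = tflite.Model.Model.GetRootAsModel(tflite_model_buf, 0)
           except ImportError:
               raise ImportError("The tflite package must be installed")
   
           mod, _ = relay.frontend.from_tflite(tflite_model,
                                               shape_dict=shape_dict,
                                               dtype_dict=dtype_dict)
           # The requantize path should show up in the printed module.
           assert "qnn.requantize" in mod.astext(show_meta_data=False)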

##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and
+                # zero points are scalar. For per-axis, scale and zero points are tensors. But
+                # as per the TFLite quantization spec, the restrictions on ops suggest that for
+                # per-axis, even if the zero point is a tensor, all the zero points are
+                # identical. More information here -
+                # https://www.tensorflow.org/lite/performance/quantization_spec
+
+                tflite_scale = tflite_qnn_params.ScaleAsNumpy()
+                tflite_zero_point = tflite_qnn_params.ZeroPointAsNumpy()
+                is_qnn_params_valid = True
+
+                # Handle per-axis and per-tensor cases
+                if isinstance(tflite_scale, np.ndarray):
+                    assert isinstance(tflite_zero_point, np.ndarray)
+
+                    # Tensor - Per-axis quantization
+                    if tflite_scale.shape != (1,) and tflite_zero_point.shape != (1,):
+                        scale = tflite_scale
+                        # Ensure that all zero points are identical
+                        zero_point = tflite_zero_point
+                        assert all(x == zero_point[0] for x in zero_point)

Review comment:
   Minor nit: can we use an error here instead of an assert, so the failure 
clearly shows what has happened? It also means we can provide a sensible 
diagnostic.
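   
   For example, something like this (an illustrative sketch only; the exception 
class and message wording are just suggestions, not the required change):
   
       # Hypothetical replacement for the assert, sketched for illustration.
       if not all(x == zero_point[0] for x in zero_point):
           raise tvm.error.OpAttributeInvalid(
               "TFLite per-axis quantization expects all zero points to be "
               "identical, but got {}".format(zero_point))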

##
File path: python/tvm/relay/frontend/tflite.py
##
@@ -243,10 +243,46 @@ def get_tensors(self, tensors_idx_list):
             qnn_params = None
             tflite_qnn_params = tensor.Quantization()
             if tflite_qnn_params is not None:
-                scale = float(tflite_qnn_params.ScaleAsNumpy())
-                zero_point = int(tflite_qnn_params.ZeroPointAsNumpy())
+                # Params might be per-tensor or per-axis quantized. For per-tensor, scale and
+                # zero points are scalar. For per-axis, scale and zero points are tensors. But
+                # as per the TFLite quantization spec, the restrictions on ops suggest that for
+                # per-axis, even if the zero point is a tensor, all the zero points are
+                # identical. More information here -
+                # https://www.tensorflow.org/lite/performance/quantization_spec

Review comment:
   To be clear, we are interpreting this from the fact that Conv2d and 
Depthwise_conv2d have a zero_point of 0 listed in their restrictions, even 
though they have per-axis quantization.
   
   I would make the comment more explicit.
   
   For per-axis (per-channel) quantization, the scale and zero points for the 
weights are tensors (?)
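   
   Something along these lines, perhaps (illustrative wording only, not the 
exact text to commit):
   
       # Params may be per-tensor or per-axis (per-channel) quantized. For per-tensor
       # quantization, scale and zero point are scalars. For per-axis quantization
       # (e.g. the weights of CONV_2D / DEPTHWISE_CONV_2D), scale and zero point are
       # 1-D tensors with one entry per channel. The TFLite quantization spec restricts
       # the per-axis zero points of these ops to 0, so all entries of the zero-point
       # tensor are expected to be identical.
       # See https://www.tensorflow.org/lite/performance/quantization_spec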
   
   
   
   

##
File path: