ekalda commented on code in PR #10915:
URL: https://github.com/apache/tvm/pull/10915#discussion_r874830358


##########
src/relay/qnn/op/convolution.cc:
##########
@@ -50,12 +50,14 @@ bool QnnConv2DRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
   if (data == nullptr || weight == nullptr) return false;
   const auto* param = attrs.as<Conv2DAttrs>();
   ICHECK(param != nullptr) << "Conv2DAttrs cannot be nullptr.";
-  ICHECK(data->dtype == DataType::Int(8) || data->dtype == DataType::UInt(8))
-      << "Expected qnn conv2d type(int8, uint8) for input but was " << data->dtype;
+  ICHECK(data->dtype == DataType::Int(8) || data->dtype == DataType::UInt(8) ||
+         data->dtype == DataType::Int(16))
+      << "Expected qnn conv2d type(int8, uint8, int16) for input but was " << data->dtype;
   ICHECK(weight->dtype == DataType::Int(8) || weight->dtype == DataType::UInt(8))
-      << "Expected qnn conv2d type(int8, uint8) for weight but was " << weight->dtype;
-  ICHECK(param->out_dtype == DataType::Int(16) || param->out_dtype == DataType::Int(32))
-      << "Expected qnn conv2d type(int32, int16) for output but was " << param->out_dtype;
+      << "Expected qnn conv2d type(int8, uint8, int16) for weight but was " << weight->dtype;

Review Comment:
   nit:
   ```suggestion
         << "Expected qnn conv2d type(int8, uint8) for weight but was " << weight->dtype;
   ```



##########
tests/python/frontend/tflite/test_forward.py:
##########
@@ -4572,6 +4650,47 @@ def test_forward_tflite_float16():
     tvm.testing.assert_allclose(tvm_sorted_labels, tflite_sorted_labels)
 
 
+def test_forward_tflite_mobilenet_int16():
+    """Test int16 quantized model"""
+    # MobilenetV2
+    tflite_model_file = tf_testing.get_workload_official(

Review Comment:
   Nit: I suppose the model in this file is in TensorFlow, not TFLite?



##########
src/relay/qnn/op/convolution.cc:
##########
@@ -829,7 +839,7 @@ This operator convolves quantized weight with quantized data. The scale of the
 output quantized tensor is the product of the weight_scale and input_scale of
 the input quantized tensors. The zero point of the output quantized tensor is
 0. By default, the dtype of output is int32. Please also refer to Requantize
-operator to understand how to scale back the int32 output to (u)int8.
+operator to understand how to scale back the int32 output to (u)int8 or (u)int16.

Review Comment:
   Does uint16 exist?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
