[GitHub] [tvm] chengven027-intellif commented on a diff in pull request #13747: [ONNX] QGemm support

GitBox Tue, 10 Jan 2023 17:49:02 -0800


chengven027-intellif commented on code in PR #13747:
URL: https://github.com/apache/tvm/pull/13747#discussion_r1066501716



##########
python/tvm/relay/frontend/onnx.py:
##########
@@ -4898,6 +4898,90 @@ def _impl_v10(cls, inputs, attr, params):
         return out
 
 
+class QGemm(OnnxOpConverter):
+    """Operator converter for QGemm."""
+
+    @classmethod
+    def _impl_v1(cls, inputs, attr, params):
+        # http://www.xavierdupre.fr/app/mlprodict/helpsphinx/onnxops/onnx_commicrosoft_QGemm.html
+
+        a = inputs[0]
+        a_scale = get_scalar(inputs[1], params)
+        a_zp = get_scalar(inputs[2], params, "int32")
+
+        b = inputs[3]
+        #  a scalar or 1D tensor of size 1 or N
+        b_scale = get_scalar_or_1d_tensor(inputs[4], params)
+        #  a scalar or 1D tensor of size 1 or N

Review Comment:
   Yes. The scale and zero point of tensor `A` have no `N` dimension; each is a 
scalar. The scale and zero point of tensor `B` can be a 1-D tensor of size N, 
which means per-column quantization.
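
   To illustrate the distinction, here is a minimal pure-Python sketch of 
dequantizing a K x N matrix whose scale and zero point are either scalars 
(per-tensor) or length-N sequences (per-column). The helper names 
`broadcast_param` and `dequantize` are hypothetical and are not part of TVM or 
this PR; the formula assumed is the usual `(q - zero_point) * scale`.

```python
def broadcast_param(value, n):
    """Expand a scalar or a length-1 / length-n sequence to a list of length n.

    Hypothetical helper for illustration only; not a TVM function.
    """
    if isinstance(value, (int, float)):
        return [value] * n
    if len(value) == 1:
        return list(value) * n
    if len(value) != n:
        raise ValueError("expected a scalar or a sequence of size 1 or %d" % n)
    return list(value)


def dequantize(q, scale, zero_point):
    """Dequantize a K x N integer matrix given as a list of rows.

    A scalar scale / zero_point means per-tensor quantization; a length-N
    sequence means per-column quantization (column j uses entry j).
    """
    n = len(q[0])
    scales = broadcast_param(scale, n)
    zero_points = broadcast_param(zero_point, n)
    return [
        [(value - zero_points[j]) * scales[j] for j, value in enumerate(row)]
        for row in q
    ]


b_q = [[10, 20], [30, 40]]

# Per-tensor: one scale and zero point for the whole matrix (the A case).
per_tensor = dequantize(b_q, 0.5, 0)

# Per-column: one scale and zero point per column (allowed for B).
per_column = dequantize(b_q, [0.5, 0.25], [0, 4])
```

   In the per-column call, column 0 is scaled by 0.5 with zero point 0 and 
column 1 by 0.25 with zero point 4, whereas the per-tensor call applies the 
same pair to every element.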



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
