ibsidorenko commented on code in PR #13752:
URL: https://github.com/apache/tvm/pull/13752#discussion_r1111618732
##########
python/tvm/relay/qnn/strategy/arm_cpu.py:
##########
@@ -21,9 +21,55 @@
regular/depthwise conv2d is supported, but qnn_dense will be added
eventually."""
from tvm import topi, TVMError
-from .generic import qnn_conv2d_strategy
+from tvm.topi.utils import get_const_tuple
from ... import op as _op
from ...op.strategy.generic import is_depthwise_conv2d
+from .generic import (
+ qnn_conv2d_strategy,
+ qnn_dense_strategy,
+ qnn_dequantize_strategy,
+ qnn_quantize_strategy,
+ wrap_compute_dequantize,
+ wrap_compute_quantize,
+ wrap_topi_qnn_dense,
+ wrap_topi_schedule,
+)
+
+
+@qnn_quantize_strategy.register("arm_cpu")
+def qnn_quantize_strategy_arm_cpu(_attrs, _inputs, _out_type, _target):
+ """qnn.quantize strategy for arm_cpu"""
+ strategy = _op.OpStrategy()
+ strategy.add_implementation(
+ wrap_compute_quantize(topi.hexagon.qnn_quantize),
+ wrap_topi_schedule(topi.hexagon.schedule_qnn_quantize),
+ name="qnn_quantize.arm_cpu",
+ )
+ return strategy
+
+
+@qnn_dequantize_strategy.register("arm_cpu")
+def qnn_dequantize_strategy_arm_cpu(_attrs, _inputs, _out_type, _target):
+ """qnn.dequantize strategy for arm_cpu"""
+ strategy = _op.OpStrategy()
+ strategy.add_implementation(
+ wrap_compute_dequantize(topi.hexagon.qnn_dequantize),
+ wrap_topi_schedule(topi.hexagon.schedule_qnn_dequantize),
+ name="qnn_dequantize.arm_cpu",
+ )
+ return strategy
+
+
+@qnn_dense_strategy.register("arm_cpu")
+def qnn_dense_strategy_arm_cpu(_attrs, _inputs, _out_type, _target):
+ """qnn.dense strategy for arm_cpu"""
+ strategy = _op.OpStrategy()
+ strategy.add_implementation(
+ wrap_topi_qnn_dense(topi.hexagon.qnn_dense),
+ wrap_topi_schedule(topi.hexagon.schedule_qnn_dense),
Review Comment:
As I see, you reuse the compute/schedule from Hexagon. These schedules are not
optimized and have a very naive implementation. Is that acceptable for you?
##########
python/tvm/topi/nn/qnn.py:
##########
@@ -212,6 +212,48 @@ def qnn_requantize_alter_layout(_attrs, _inputs, _tinfos,
_out_type):
return None
[email protected]_func
+def qnn_bias_add_legalize(_attrs, _inputs, _tinfos):
+ """Legalize bias_add layout.
+
+ Bias add is not a QNN-specific function, but this generic exists so that
empty channels can
+ be excised from quantized conv2d operators and folded into bias adds.
+
+ Parameters
+ ----------
+ attrs : tvm.ir.Attrs
+ Attributes of current convolution
+ inputs : tvm.relay.Expr
+ Grouped input symbols
+ tinfos : list
+ Input shape and dtype
+
+ """
+ return None
+
+
[email protected]_func
+def qnn_clip_legalize(_attrs, inputs, _tinfos, _out_type):
+ """Change clip layout.
+
+ Parameters
+ ----------
+ attrs : tvm.ir.Attrs
+ Attributes of current convolution
+ inputs : tvm.relay.Expr
+ Grouped input symbols
+ tinfos : list
+ Input shape and dtype
+ out_type: type
+ The output type
+
+ Note
+ ----
+ Unlike other TOPI functions, this function operates on both graph level
and operator level.
+ """
+ return inputs[0]
+
+
@tvm.target.generic_func
def qnn_add_alter_layout(_attrs, _inputs, _tinfos, _out_type):
Review Comment:
Here it is the same. How about `qnn_add_alter_layout` --> `add_alter_layout`?
Since we do it for nn.add (not qnn.add).
##########
python/tvm/topi/nn/qnn.py:
##########
@@ -212,6 +212,48 @@ def qnn_requantize_alter_layout(_attrs, _inputs, _tinfos,
_out_type):
return None
[email protected]_func
+def qnn_bias_add_legalize(_attrs, _inputs, _tinfos):
Review Comment:
How about renaming it to `bias_add_legalize`? This name looks confusing since
we do legalization for `nn.bias_add` (not a qnn bias_add).
##########
python/tvm/relay/qnn/op/_qnn.py:
##########
@@ -85,12 +91,72 @@ def simulated_dequantize_compute(attrs, inputs,
output_type):
register_strategy("qnn.conv2d", strategy.qnn_conv2d_strategy)
+def _get_clip_dtype_bounds(dtype):
+ """Returns the minimum and maximum values of a C integer data type."""
+ assert "int" in dtype
+ bits = int(dtype[dtype.find("int") + 3 :])
+
+ if dtype.startswith("int"):
+ return (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
Review Comment:
Just a nit comment... wouldn't `np.iinfo(dtype).min` / `np.iinfo(dtype).max` be
suitable here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]