[GitHub] [tvm] areusch commented on a change in pull request #9233: Cortex m7 intrinsic

GitBox Wed, 03 Nov 2021 18:44:37 -0700


areusch commented on a change in pull request #9233:
URL: https://github.com/apache/tvm/pull/9233#discussion_r742417398




##########
File path: tests/python/conftest.py
##########
@@ -40,3 +41,26 @@
 
 if tvm.support.libinfo().get("USE_MICRO", "OFF") != "ON":
     collect_ignore.append("unittest/test_micro_transport.py")
+
+
+def pytest_addoption(parser):
+    parser.addoption(
+        "--enable-corstone300-tests",
+        action="store_true",
+        default=False,
+        help="Run Corstone-300 FVP tests",
+    )
+
+
+def pytest_collection_modifyitems(config, items):
+    for item in items:
+        if config.getoption("--enable-corstone300-tests"):
+            if not "corstone300" in item.keywords:
+                item.add_marker(
+                    pytest.mark.skip(reason="Test should be marked 'corstone300' to run")

Review comment:
       i think we just need one skip, right? doesn't this skip all other tests 
aside from corstone300?
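
A minimal sketch of the single-skip logic suggested above, assuming the intent is that only tests marked `corstone300` are skipped, and only when `--enable-corstone300-tests` is not passed. The helper name `should_skip` and the skip reason text are hypothetical and not part of the PR; the hook is shown as a comment so the snippet stays self-contained.

```python
CORSTONE300_MARK = "corstone300"  # marker name taken from the diff above


def should_skip(option_enabled, keywords):
    """Return True when a test must be skipped.

    Only tests carrying the corstone300 marker are ever skipped, and only
    when --enable-corstone300-tests was not passed. Unmarked tests always run.
    """
    return CORSTONE300_MARK in keywords and not option_enabled


# In conftest.py the hook could then reduce to a single skip (sketch only):
#
# def pytest_collection_modifyitems(config, items):
#     enabled = config.getoption("--enable-corstone300-tests")
#     for item in items:
#         if should_skip(enabled, item.keywords):
#             item.add_marker(pytest.mark.skip(reason="Need --enable-corstone300-tests to run"))
```

With this shape, passing the option never hides the rest of the test suite; it only stops the Corstone-300 tests from being skipped.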

##########
File path: python/tvm/target/arm_isa.py
##########
@@ -16,18 +16,24 @@
 # under the License.
 """Defines functions to analyze available opcodes in the ARM ISA."""
 
+import argparse
 
 ARM_ISA_MAP = {
-    "armv7e-m": ["SMLAD"],
+    "armv7e-m": ["SMLAD", "SSUB8", "SEL"],
+    "armv8-m": ["SMLAD", "SSUB8", "SEL"],
 }
 
 
 class IsaAnalyzer(object):
+    """Checks ISA support for given target"""
+
     def __init__(self, target):
         self.target = target
-        # TODO: actually parse -mcpu
-        arch = "armv7e-m"
-        self._isa_map = ARM_ISA_MAP[arch]
+        parser = argparse.ArgumentParser()

Review comment:
       you should use the built-in Target parsing logic here rather than 
argparse:
   ```suggestion
           target = tvm.target.Target(target)
           march = target.attrs.get("-march", None)
           self._isa_map = ARM_ISA_MAP[march] if march is not None else []
   ```
   (you'll also need to delete the following lines 33-36; the suggestion didn't 
quite capture the full diff)
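
For illustration, here is a plain-Python sketch of the lookup described in the suggestion above. It does not import TVM: `target_attrs` is a hypothetical mapping standing in for the attributes of a parsed target, and the key name `"march"` is an assumption. Unlike the suggestion, it uses `dict.get` so that an architecture missing from `ARM_ISA_MAP` yields an empty instruction list rather than a `KeyError`; that fallback is a design choice of this sketch, not something the PR specifies.

```python
ARM_ISA_MAP = {
    "armv7e-m": ["SMLAD", "SSUB8", "SEL"],
    "armv8-m": ["SMLAD", "SSUB8", "SEL"],
}


class IsaAnalyzer:
    """Checks ISA support for a given target (plain-Python stand-in)."""

    def __init__(self, target_attrs):
        # target_attrs is a hypothetical mapping standing in for the parsed
        # target attributes; the key name "march" is an assumption.
        march = target_attrs.get("march")
        # Unknown or missing architectures get an empty list, not a KeyError.
        self._isa_map = ARM_ISA_MAP.get(march, [])

    def __contains__(self, instruction):
        # Enables checks written as: "SMLAD" in isa
        return instruction in self._isa_map
```

The `__contains__` method mirrors the `"SMLAD" in isa` style of check used in the strategy code elsewhere in this PR.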

##########
File path: python/tvm/topi/arm_cpu/dense.py
##########
@@ -0,0 +1,25 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, no-else-return, unused-argument, import-outside-toplevel
+"""Dense schedule for ARM CPU"""
+
+from .cortex_m7.dense import direct_simd

Review comment:
       i think it makes sense then to not import the cortex_m7 direct_simd into 
this module. can we reorganize as discussed in the earlier thread?

##########
File path: python/tvm/target/arm_isa.py
##########
@@ -16,18 +16,24 @@
 # under the License.
 """Defines functions to analyze available opcodes in the ARM ISA."""
 
+import argparse
 
 ARM_ISA_MAP = {
-    "armv7e-m": ["SMLAD"],
+    "armv7e-m": ["SMLAD", "SSUB8", "SEL"],

Review comment:
       @u99127 as discussed, let's punt the architecture labelling to the next 
PR.

##########
File path: python/tvm/relay/op/strategy/arm_cpu.py
##########
@@ -49,6 +49,26 @@ def schedule_concatenate_arm_cpu(_, outs, target):
         return topi.arm_cpu.schedule_concatenate(outs)
 
 
+@schedule_pool.register(["arm_cpu", "micro_dev"])
+def schedule_pool_arm_cpu(attrs, outs, target):
+    """schedule pooling ops arm cpu"""
+    layout = attrs.layout
+    isa = arm_isa.IsaAnalyzer(target)
+    avg_pool = isinstance(attrs, relay.op.op_attrs.AvgPool2DAttrs)
+    with target:
+        if (
+            avg_pool
+            and layout in ("NCW", "NCHW")
+            and "SMLAD" in isa

Review comment:
       i agree with you that we should refactor this. this was left over from 
the initial implementation which did propose to test for presence of 
instructions in the ISA; however, you're right that we should just need to 
determine which architecture is in use. since this PR just adds additional 
schedules which are purported to be compatible with cortex-m7 devices, perhaps 
we can address the question of lookup-by-architecture in a follow-on.

##########
File path: tests/python/integration/test_m7_simd.py
##########
@@ -0,0 +1,355 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import sys
+import numpy as np
+import pytest
+import tvm
+from tvm import relay
+from tests.python.relay.aot.aot_test_utils import (
+    AOTTestModel,
+    AOT_CORSTONE300_RUNNER,
+    generate_ref_data,
+    compile_and_run,
+)
+
+
[email protected]_corstone300
+@pytest.mark.parametrize(
+    "data_shape_nhwc, kernel_size, num_filter, strides, padding, dilation",
+    [
+        ((1, 32, 32, 1), (3, 3), 12, 1, 0, 1),
+        ((1, 32, 10, 3), (3, 3), 16, 1, 0, 1),
+        ((1, 49, 10, 1), (10, 4), 64, (2, 1), (4, 1, 5, 1), 1),
+        ((1, 32, 32, 16), (3, 3), 16, 1, (0, 2, 2, 0), 1),
+        ((1, 32, 32, 16), (3, 3), 16, 1, 0, 1),
+        ((1, 32, 32, 16), (3, 3), 16, 1, 0, 1),
+        ((1, 32, 32, 16), (3, 3), 16, 1, (0, 2, 2, 0), 2),
+        ((1, 32, 32, 16), (3, 3), 16, 1, (1, 1, 2, 2), 2),
+        # bug https://github.com/apache/tvm/issues/9226
+        ((1, 49, 10, 1), (10, 4), 64, (2, 2), (4, 1, 5, 1), 1),
+        # from Visual Wake Word model
+        ((1, 96, 96, 3), (3, 3), 8, (2, 2), (0, 0, 1, 1), 1),
+        # from Image Classification model (one of the MLPerfTiny models)
+        ((1, 16, 16, 32), (1, 1), 64, (2, 2), 0, 1),
+        ((4, 16, 16, 8), (5, 5), 8, 2, (0, 4, 4, 0), 1),
+        ((4, 16, 16, 8), (5, 5), 16, 2, (0, 4, 4, 0), 1),
+        ((4, 16, 16, 8), (5, 5), 8, 2, 0, 1),
+        ((4, 16, 16, 8), (5, 5), 16, 2, 0, 1),
+        ((1, 16, 16, 8), (3, 3), 16, 2, (0, 0, 1, 1), 1),
+        ((1, 16, 16, 8), (3, 3), 16, 2, (1, 1, 2, 2), 1),
+        ((1, 16, 16, 8), (5, 5), 16, 2, (3, 3, 2, 2), 1),
+        ((1, 16, 16, 8), (3, 3), 16, 2, (0, 1, 2, 3), 1),
+    ],
+)
+@pytest.mark.parametrize("dtype", ["int8", "int16"])
+def test_conv2d(data_shape_nhwc, kernel_size, num_filter, strides, padding, dilation, dtype):
+    """Test a subgraph with a single conv2d operator."""
+    ishape = data_shape_nhwc
+    wshape = (*kernel_size, data_shape_nhwc[-1], num_filter)
+
+    weight_data = np.random.randint(low=-10, high=10, size=wshape, dtype=dtype)
+
+    input0 = relay.var("input", relay.TensorType(ishape, dtype))
+    weight0 = relay.const(weight_data)
+    out0 = relay.op.nn.conv2d(
+        input0,
+        weight0,
+        kernel_size=kernel_size,
+        strides=strides,
+        padding=padding,
+        dilation=(dilation, dilation),
+        data_layout="NHWC",
+        kernel_layout="HWIO",
+        out_dtype="int32",
+        out_layout="NHWC",
+    )
+    ref_mod = tvm.IRModule.from_expr(relay.Function([input0], out0))
+
+    input1 = relay.var("input", relay.TensorType(ishape, dtype))
+    weight1 = relay.const(np.moveaxis(weight_data, 2, -1))
+    out1 = relay.op.nn.conv2d(
+        input1,
+        weight1,
+        kernel_size=kernel_size,
+        strides=strides,
+        padding=padding,
+        dilation=(dilation, dilation),
+        data_layout="NHWC",
+        kernel_layout="HWOI",
+        out_dtype="int32",
+        out_layout="NHWC",
+    )
+    mod = tvm.IRModule.from_expr(relay.Function([input1], out1))
+
+    inputs = {"input": np.random.randint(low=-128, high=127, size=ishape, dtype=dtype)}
+    output_list = generate_ref_data(ref_mod, inputs)
+
+    compile_and_run(
+        AOTTestModel(module=mod, inputs=inputs, outputs=output_list),
+        runner=AOT_CORSTONE300_RUNNER,
+        interface_api="c",
+        use_unpacked_api=True,
+        target_opts={
+            "-keys": "arm_cpu",
+            "-march": "armv7e-m",

Review comment:
       agreed--i think -mcpu was used to key the IsaAnalyzer, correct?

##########
File path: python/tvm/relay/qnn/op/legalizations.py
##########
@@ -374,6 +374,8 @@ def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
         attrs["kernel_layout"],
         attrs["groups"],
     )
+
+    # Use int8 for Cortex-M7

Review comment:
       @sergey-grovety can you revert this comment or fix the set of CPUs 
indicated?

##########
File path: python/tvm/target/arm_isa.py
##########
@@ -16,18 +16,24 @@
 # under the License.
 """Defines functions to analyze available opcodes in the ARM ISA."""
 
+import argparse
 
 ARM_ISA_MAP = {
-    "armv7e-m": ["SMLAD"],
+    "armv7e-m": ["SMLAD", "SSUB8", "SEL"],
+    "armv8-m": ["SMLAD", "SSUB8", "SEL"],

Review comment:
       same thing here

##########
File path: python/tvm/topi/arm_cpu/conv1d.py
##########
@@ -0,0 +1,36 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=invalid-name, unused-variable, no-else-return, unused-argument, import-outside-toplevel
+"""Conv1D schedule for ARM CPU"""
+from __future__ import absolute_import as _abs
+
+from tvm import autotvm
+
+from .cortex_m7.conv1d import direct_simd as direct_simd_conv1d
+
+
+@autotvm.register_topi_compute("conv1d_nwc_direct_simd.arm_cpu")
+def conv1d_nwc_direct_simd(cfg, data, kernel, strides, padding, dilation, out_dtype):

Review comment:
       mprofile seems good to me.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

