masahi commented on a change in pull request #4258: [WIP][TVM] Bring Your Own Codegen to TVM
URL: https://github.com/apache/incubator-tvm/pull/4258#discussion_r352310360
 
 

 ##########
 File path: tutorials/dev/custom_relay_backend.py
 ##########
 @@ -0,0 +1,291 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+
+.. _tutorial-custom-relay-backend:
+
+Bring Your Own Codegen To TVM
+=============================
+**Author**: `Zhi Chen <https://github.com/zhiics>`_, `Cody Hao Yu <https://github.com/comaniac>`_
+
+As the number of hardware devices targeted by deep learning workloads keeps growing, so does the
+knowledge users need to achieve high performance on each of them. To free data scientists from
+worrying about performance when developing a new model, hardware vendors either provide libraries
+such as MKLDNN or cuDNN with many commonly used deep learning operators, or provide frameworks
+such as TensorRT that let users describe their models in a certain way to achieve high
+performance. However, users have to learn a new programming interface whenever they work with a
+new library or device. As a result, a unified programming interface becomes increasingly
+important to 1) put all users and hardware vendors on the same page, and 2) provide a feasible
+way for specialized hardware or libraries to support only widely used operators with extremely
+high performance, while falling back to general devices such as CPU/GPU for unsupported operators.
+
+In this tutorial, we demonstrate how a hardware vendor can implement
+a Relay backend to support a specialized hardware device/library. It mainly
+takes three steps: 1) define whether an operator is supported under a given
+template, 2) specify how to compile and serialize the supported operators so
+that they can ingest TVM-specific data formats, e.g. NDArray, and 3) specify how
+to execute the compiled operators on a certain device. We will demonstrate how
+to add a new backend that uses open source compilers (e.g. GCC, LLVM, etc.) or any
+proprietary compilers to execute a subgraph of a model without exposing
+the IP of the customer's codegen toolchain. Note that you will need to add the
+specialized Relay backend to the TVM codebase and rebuild TVM to enable it.
+
+"""
+
+######################################################################
+# Define The Supported Operators
+# ------------------------------
+# The first step is to define which operators are supported by your backend.
+# A template is provided to make it easier for vendors to add the supported
+# operators.
+#
+# For example, we create a new Python file at python/relay/backend/op/contrib/gcc/extern_op.py,
+# and implement a set of boolean functions named after the corresponding operators. A boolean
+# function should return `True` if we allow the operator to be executed by the given backend;
+# `False` otherwise.
+
+from __future__ import absolute_import
+
+def conv2d(attrs, args):
+    """Check if the external codegen should be used.
+    """
+    return False
+
+def subtract(attrs, args):
+    """Check if the external codegen should be used.
+    """
+    return True
+
+def add(attrs, args):
+    """Check if the external codegen should be used.
+    """
+    return True
+
+def multiply(attrs, args):
+    """Check if the external codegen should be used.
+    """
+    return True
+
+######################################################################
+# Note that since we include `attrs` and `args` in the function signature, we
+# can define more complicated rules. For example, we can support conv2d only
+# with float32 data type or with kernel size 1x1. In addition, vendors can
+# check the fields in `attrs` to decide whether a given operator is supported.
+# In an even more complicated but interesting scenario, we also allow
+# developers to check the sequence of operators by iterating over `args`.
+# However, this is only unidirectional, as only the inputs are visible.
+#
+# After annotating whether each operator can be executed on the given backend,
+# users can directly invoke the partitioning pass to separate the graph into
+# multiple segments. The C++ backend implements a partitioning pass to fulfill
+# this task and creates subgraphs/sub-functions with the *External* attribute,
+# indicating that the function will be handled by an external codegen tool.
+# Therefore, Relay passes should skip optimizations on them.
+
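The attribute-based rules mentioned above can be illustrated with a small standalone sketch. It assumes that `attrs` exposes a `kernel_size` field and that the first entry of `args` carries a `checked_type.dtype`; the `make_call` helper and its `SimpleNamespace` stand-ins are hypothetical and only exist so the rule can be exercised without a TVM build. The rule here requires both conditions (float32 and a 1x1 kernel), which is one possible choice rather than the one made in this PR.

```python
from types import SimpleNamespace


def conv2d(attrs, args):
    """Return True only for float32 conv2d with a 1x1 kernel (illustrative rule)."""
    data = args[0]  # assumption: the first argument is the input tensor
    dtype = data.checked_type.dtype
    kernel_size = tuple(int(k) for k in attrs.kernel_size)
    return dtype == "float32" and kernel_size == (1, 1)


def make_call(dtype, kernel_size):
    """Build hypothetical stand-ins for the `attrs` and `args` of a conv2d call."""
    attrs = SimpleNamespace(kernel_size=kernel_size)
    data = SimpleNamespace(checked_type=SimpleNamespace(dtype=dtype))
    return attrs, [data]


print(conv2d(*make_call("float32", (1, 1))))
print(conv2d(*make_call("float32", (3, 3))))
print(conv2d(*make_call("int8", (1, 1))))
```

Only the first call satisfies both conditions, so it is the only one that would be offloaded under this rule.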
+######################################################################
+# Customize Subgraph Annotations
+# ------------------------------
+# In addition to specifying a set of rules for supported operators, we can also implement
+# a Relay IR mutator to find the supported subgraphs, which may include multiple operators,
+# for the target backend. Here we implement an annotator that marks an entire Relay graph
+# to be offloaded. Specifically, we are going to do two tasks:
+# - insert `subgraph_begin` after all input variables
+# - insert `subgraph_end` before the primary output.
+#
+# For example, given a Relay graph as follows:
+#       input_a
+#          |
+#         add    --- input_b
+#          |
+#       subtract --- input_c
+#          |
+#       multiply --- input_d
+#          |
+#         out
+#
+# Our goal is to mutate the graph to the following:
+#
+#       input_a
+#          |
+#     subgraph_begin
+#          |
+#         add    --- subgraph_begin --- input_b
+#          |
+#       subtract --- subgraph_begin --- input_c
+#          |
+#       multiply --- subgraph_begin --- input_d
+#          |
+#      subgraph_end
+#          |
+#         out
+#
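The mutation pictured above can be reproduced on a toy expression tree. This is a standalone sketch, not the annotator from this PR and not Relay's `ExprMutator`: the `Var` and `Call` classes, and the `annotate`, `annotate_whole_graph`, and `show` helpers, are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Var, Call]


def annotate(expr: Expr) -> Expr:
    """Wrap every input variable in `subgraph_begin`; rebuild other calls unchanged."""
    if isinstance(expr, Var):
        return Call("subgraph_begin", (expr,))
    return Call(expr.op, tuple(annotate(arg) for arg in expr.args))


def annotate_whole_graph(expr: Expr) -> Expr:
    """Mark the whole graph: begin markers on inputs, one end marker on the output."""
    return Call("subgraph_end", (annotate(expr),))


def show(expr: Expr) -> str:
    """Render an expression as nested call syntax for inspection."""
    if isinstance(expr, Var):
        return expr.name
    return f"{expr.op}({', '.join(show(a) for a in expr.args)})"


# The add -> subtract -> multiply chain from the diagram.
graph = Call("multiply", (
    Call("subtract", (
        Call("add", (Var("input_a"), Var("input_b"))),
        Var("input_c"))),
    Var("input_d")))

print(show(annotate_whole_graph(graph)))
```

The recursion mirrors the usual mutator pattern of rebuilding each node from its rewritten children, so each input ends up behind its own `subgraph_begin` and the final output behind a single `subgraph_end`.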
+# The implementation is shown as follows. As can be seen, the annotator is derived from
+# `ExprMutator` that traverses a Relay graph and allows we to mutate it. We know that all ops
 
 Review comment:
   allow us
