fzi-peccia commented on code in PR #13770: URL: https://github.com/apache/tvm/pull/13770#discussion_r1149196854
########## gallery/tutorial/micro_gemmini_add.py: ########## @@ -0,0 +1,239 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, +# software distributed under the License is distributed on an +# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +# KIND, either express or implied. See the License for the +# specific language governing permissions and limitations +# under the License. +""" +Running TVM on the Gemmini accelerator - A single add layer example +====================================================================================== +**Author**: +`Federico Peccia <https://fPecc.github.io/>`_ + +This tutorial shows how a quantized add layer can be compiled to be executed on the Gemmini accelerator. The generated baremetal C code is then tested on the Spike RISC-V ISA simulator. Before starting this tutorial, you should have downloaded the Chipyard repository and installed the Spike simulator with the Gemmini extension. + +Note: This is an **experimental** layer! +""" + +import tensorflow as tf +from tensorflow.keras import layers +import numpy as np +import os +import tvm.contrib.gemmini as gemmini +from tvm import relay +import tvm + +################################## +# Pre-requisites +# -------------------------------- +# +# After the installation of the Chipyard development tools, you should have an env.sh file in your Chipyard home directory. This file needs to be sourced before running this tutorial: +# +# .. 
code-block:: bash +# +# source <your chipyard home path>/env.sh +# +# WARNING: if you have installed TVM in a virtual environment, FIRST activate the Chipyard environment, and THEN activate the tvm entironment. + +################################## +# Baseline generation +# -------------------------------- +# +# In this section, we will generate the baseline input and expected output, which we are going to use to compare with the actual obtained output after running on the Gemmini accelerator. + +# Then we define the parameters of the layer we want to test. In this case: +input_height = 16 +input_width = 16 +input_channels = 16 +activation = 0 + +# We will generate a prequantized TFLite model, because for now the Gemmini integration only supports models that were quantized with specific flags as input. +class Model(tf.Module): + def __init__(self, name=None): + super().__init__(name) + + @tf.function( + input_signature=[ + tf.TensorSpec( + shape=[1, input_height, input_width, input_channels], + dtype=tf.float32, + ), + tf.TensorSpec( + shape=[1, input_height, input_width, input_channels], + dtype=tf.float32, + ), + ] + ) + def add(self, x, y): + if activation == 0: + return x + y + else: + return layers.Activation("relu")(x + y) + + +model = Model() + +# Convert the concrete functions using TFLiteConverter +converter = tf.lite.TFLiteConverter.from_keras_model(model) + + +def representative_data_gen(): + dataset = [ + ( + np.array( + np.random.randint(-127, 128, size=(1, input_height, input_width, input_channels)), + dtype=np.float32, + ), + np.array( + np.random.randint(0, 128, size=(1, input_height, input_width, input_channels)), + dtype=np.float32, + ), + ) + for s in range(100) + ] + for input_value in dataset: + yield [input_value[0], input_value[1]] + + +converter.optimizations = [tf.lite.Optimize.DEFAULT] +converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8] +converter.inference_input_type = tf.uint8 +converter.inference_output_type = 
tf.int8 +converter.representative_dataset = representative_data_gen +converter._experimental_disable_per_channel = True + +tflite_model = converter.convert() + +# Save the model. +with open("add.tflite", "wb") as f: + f.write(tflite_model) + +# Now that we have created the model, we import the model and run it. We store the output, in order to compare it with the output that will be later obtained from the Gemmini accelerator. + +os.system("rm -rf model.tar dev/ include/ generated-project/") + +tflite_file = "./add.tflite" +tflite_model_buf = open(tflite_file, "rb").read() +input_tensor = "layer1_input" +input_dtype = "uint8" + +os.system("mkdir -p include") + +try: + import tflite + + tflite_model = tflite.Model.GetRootAsModel(tflite_model_buf, 0) +except AttributeError: + import tflite.Model + + tflite_model = tflite.Model.Model.GetRootAsModel(tflite_model_buf, 0) + +# Load the TFLite model and allocate tensors. +interpreter = tf.lite.Interpreter(model_path=tflite_file, experimental_preserve_all_tensors=True) +interpreter.allocate_tensors() +input_details = interpreter.get_input_details() +output_details = interpreter.get_output_details() +tensor_details = interpreter.get_tensor_details() + +input_matrix_1 = np.random.randint( + 0, 255, (1, input_height, input_width, input_channels), dtype=np.uint8 +) +input_matrix_2 = np.random.randint( + 0, 255, (1, input_height, input_width, input_channels), dtype=np.uint8 +) + +interpreter.set_tensor(input_details[0]["index"], input_matrix_1) +interpreter.set_tensor(input_details[1]["index"], input_matrix_2) + +interpreter.invoke() +expected_output = interpreter.get_tensor(output_details[0]["index"]) + +# Here, we create C files and headers with the inputs and expected output, so that we can then execute the same operation on the Gemmini accelerator, and compare the expected output with the actual predicted one. 
+gemmini.create_header_file("inputs", "data", "input_1", input_matrix_2, "./include") Review Comment: As recommended, I added this method to all tutorials. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
