
(tvm) branch main updated: Restore "pytest.mark.gpu" for RELAX tests (#16741)

2024-04-23 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 11f2253b9c Restore "pytest.mark.gpu" for RELAX tests (#16741)
11f2253b9c is described below

commit 11f2253b9cc22ff354e7f13df2d5a55feae01259
Author: apeskov 
AuthorDate: Tue Apr 23 11:22:55 2024 +0300

Restore "pytest.mark.gpu" for RELAX tests (#16741)

* [TEST] Mark RELAX GPU tests with pytest.mark.gpu

A missing pytest.mark.gpu prevents the tests from launching in CI.

Signed-off-by: Alexander Peskov 

* fix

Signed-off-by: Alexander Peskov 

* Check fp8 compute capability

Signed-off-by: Alexander Peskov 

* fix func signature

Signed-off-by: Alexander Peskov 

* lint

Signed-off-by: Alexander Peskov 

---------

Signed-off-by: Alexander Peskov 
Co-authored-by: Alexander Peskov 
---
 tests/python/relax/test_codegen_cublas.py | 10 ++
 tests/python/relax/test_codegen_cudnn.py  |  9 +
 tests/python/relax/test_codegen_cutlass.py|  9 +
 tests/python/relax/test_codegen_tensorrt.py   | 13 +++--
 tests/python/relax/test_contrib_vllm.py   |  2 +-
 tests/python/relax/test_transform_codegen_pass.py | 10 ++
 6 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/tests/python/relax/test_codegen_cublas.py 
b/tests/python/relax/test_codegen_cublas.py
index 4f357626b8..ea0861467f 100644
--- a/tests/python/relax/test_codegen_cublas.py
+++ b/tests/python/relax/test_codegen_cublas.py
@@ -36,14 +36,7 @@ def reset_seed():
 np.random.seed(0)
 
 
-has_cublas = tvm.get_global_func("relax.ext.cublas", True)
-
-cublas_enabled = pytest.mark.skipif(
-    not has_cublas,
-    reason="CUBLAS not enabled.",
-)
-
-pytestmark = [cublas_enabled]
+pytestmark = tvm.testing.requires_cublas.marks()
 
 
 def build_and_run(mod, inputs_np, target, legalize=False, cuda_graph=False):
@@ -231,6 +224,7 @@ def test_matmul_igemm_offload(
 tvm.testing.assert_allclose(out, ref, rtol=1e-2, atol=1e-2)
 
 
+@tvm.testing.requires_cuda_compute_version(9)
 @pytest.mark.skipif(ml_dtypes is None, reason="requires ml_dtypes to be installed")
 @pytest.mark.parametrize(
 "x_shape, y_shape, transpose_y, out_dtype",
diff --git a/tests/python/relax/test_codegen_cudnn.py 
b/tests/python/relax/test_codegen_cudnn.py
index c913559232..f342705878 100644
--- a/tests/python/relax/test_codegen_cudnn.py
+++ b/tests/python/relax/test_codegen_cudnn.py
@@ -34,14 +34,7 @@ def reset_seed():
 np.random.seed(0)
 
 
-has_cudnn = tvm.get_global_func("relax.ext.cudnn", True)
-
-cudnn_enabled = pytest.mark.skipif(
-    not has_cudnn,
-    reason="cuDNN not enabled.",
-)
-
-pytestmark = [cudnn_enabled]
+pytestmark = tvm.testing.requires_cudnn.marks()
 
 
 _activation_table = {
diff --git a/tests/python/relax/test_codegen_cutlass.py 
b/tests/python/relax/test_codegen_cutlass.py
index fced7a84a8..57f47ca6e6 100644
--- a/tests/python/relax/test_codegen_cutlass.py
+++ b/tests/python/relax/test_codegen_cutlass.py
@@ -75,14 +75,7 @@ class Conv2dx2:
 return conv2
 
 
-has_cutlass = tvm.get_global_func("relax.ext.cutlass", True)
-
-cutlass_enabled = pytest.mark.skipif(
-    not has_cutlass,
-    reason="CUTLASS not enabled.",
-)
-
-pytestmark = [cutlass_enabled]
+pytestmark = tvm.testing.requires_cutlass.marks()
 
 
 def build_and_run(mod, inputs_np, target, legalize=True, cuda_graph=False):
diff --git a/tests/python/relax/test_codegen_tensorrt.py 
b/tests/python/relax/test_codegen_tensorrt.py
index 23dc7d887f..009bb24c63 100644
--- a/tests/python/relax/test_codegen_tensorrt.py
+++ b/tests/python/relax/test_codegen_tensorrt.py
@@ -43,13 +43,22 @@ class Conv2dResidualBlock:
 
 
 has_tensorrt = tvm.get_global_func("relax.ext.tensorrt", True)
+env_checker_runtime = tvm.get_global_func("relax.is_tensorrt_runtime_enabled", True)
 
-tensorrt_enabled = pytest.mark.skipif(
+requires_tensorrt_codegen = pytest.mark.skipif(
     not has_tensorrt,
     reason="TENSORRT not enabled.",
 )
 
-pytestmark = [tensorrt_enabled]
+requires_tensorrt_runtime = pytest.mark.skipif(
+    not env_checker_runtime or not env_checker_runtime(),
+    reason="TensorRT runtime not available",
+)
+
+pytestmark = [
+    requires_tensorrt_codegen,
+    requires_tensorrt_runtime,
+] + tvm.testing.requires_cuda.marks()
 
 
 def build_and_run(mod, inputs_np, target, legalize=False):
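
The commit message above says a missing `pytest.mark.gpu` keeps tests from launching in CI. A toy model of that effect, assuming the GPU CI job selects tests with a marker expression such as `pytest -m gpu` and that the `tvm.testing.requires_*` helpers attach the `gpu` marker. The test names and the `select` helper are hypothetical, and this is not pytest's implementation:

```python
# Toy model of marker-based selection; not pytest's real implementation.
# Each hypothetical test maps to the set of marker names attached to it.
tests = {
    "test_matmul_offload": {"gpu", "cuda"},  # marked via requires_* helpers
    "test_conv2d_offload": {"skipif"},       # only a bare skipif mark
}


def select(tests, marker):
    """Return the names of tests carrying `marker`, like `pytest -m <marker>`."""
    return sorted(name for name, marks in tests.items() if marker in marks)


selected = select(tests, "gpu")
print(selected)
```

Under these assumptions, a test guarded only by a hand-written `skipif` is never selected by the GPU job, even on a machine where the library is available.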
diff --git a/tests/python/relax/test_contrib_vllm.py 
b/tests/python/relax/test_contrib_vllm.py
index dd2149e572..f3c4839133 100644
--- a/tests/python/relax/test_contrib_vllm.py
+++ b/tests/python/relax/test_contrib_vllm.py
@@ -32,

(tvm) branch main updated: [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846)

2024-04-05 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new c93f0bae9b [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL 
(#16846)
c93f0bae9b is described below

commit c93f0bae9bf9aa3bd42f3239d4e4a0f2da37ee84
Author: Egor Churaev 
AuthorDate: Fri Apr 5 09:52:41 2024 +0300

[Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846)

Added OpenCL as a GPU target for Meta-Scheduler. Implemented export
function for Android which can be used when MS builder is configured.
Added an integration test which checks that MS tuning on Android GPU
works fine.
---
 python/tvm/contrib/ndk.py  | 12 
 src/meta_schedule/utils.h  |  3 +-
 tests/python/contrib/test_android/__init__.py  | 18 ++
 .../python/contrib/test_android/infrastructure.py  | 57 +
 .../contrib/test_android/test_meta_schedule.py | 71 ++
 5 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/python/tvm/contrib/ndk.py b/python/tvm/contrib/ndk.py
index 2a1105ed2b..14820c0ca8 100644
--- a/python/tvm/contrib/ndk.py
+++ b/python/tvm/contrib/ndk.py
@@ -22,7 +22,10 @@ import subprocess
 import os
 import shutil
 from typing import Dict
+import tempfile
+from pathlib import Path
 
+from .._ffi import register_func
 from .._ffi.base import py_str
 from . import utils as _utils, tar as _tar, cc as _cc
 from .cc import get_target_by_dump_machine
@@ -152,3 +155,12 @@ def get_global_symbol_section_map(path, *, nm=None) -> 
Dict[str, str]:
 base_path = os.path.dirname(compiler)
 nm = os.path.join(base_path, "llvm-nm")
 return _cc.get_global_symbol_section_map(path, nm=nm)
+
+
+@register_func("meta_schedule.builder.export_ndk")
+def _ndk_export(mod):
+    tmp_dir = tempfile.mkdtemp()
+    binary_name = "tmp_binary.so"
+    binary_path = Path(tmp_dir) / binary_name
+    mod.export_library(binary_path, fcompile=create_shared)
+    return str(binary_path)
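
The export function added above follows a simple pattern: create a fresh temporary directory, export the module into it, and return the path as a string. A minimal sketch of that pattern without TVM, where `FakeModule` and `export_to_temp` are hypothetical stand-ins and the placeholder file replaces a real shared library:

```python
import tempfile
from pathlib import Path


class FakeModule:
    """Hypothetical stand-in for a built module; writes a placeholder file."""

    def export_library(self, path, fcompile=None):
        # A real module would invoke `fcompile` to link a shared library.
        Path(path).write_bytes(b"placeholder")


def export_to_temp(mod, fcompile=None):
    """Export `mod` into a fresh temporary directory and return the path."""
    tmp_dir = tempfile.mkdtemp()  # the caller is responsible for cleanup
    binary_path = Path(tmp_dir) / "tmp_binary.so"
    mod.export_library(binary_path, fcompile=fcompile)
    return str(binary_path)


exported = export_to_temp(FakeModule())
print(exported)
```

Note that `tempfile.mkdtemp()` never deletes the directory on its own, so a long tuning session using this pattern may leave one directory per exported candidate.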
diff --git a/src/meta_schedule/utils.h b/src/meta_schedule/utils.h
index 60840ca163..ceb0356cbc 100644
--- a/src/meta_schedule/utils.h
+++ b/src/meta_schedule/utils.h
@@ -513,7 +513,8 @@ inline void CloneRules(const SpaceGeneratorNode* src, SpaceGeneratorNode* dst) {
 
 /*! \brief Returns true if the given target is one of the supported gpu targets. */
 inline bool IsGPUTarget(const std::string& target_name) {
-  static const std::unordered_set<std::string> gpu_targets{"cuda", "rocm", "vulkan", "metal"};
+  static const std::unordered_set<std::string> gpu_targets{"cuda", "rocm", "vulkan", "metal",
+                                                           "opencl"};
   return gpu_targets.count(target_name);
 }
 
diff --git a/tests/python/contrib/test_android/__init__.py 
b/tests/python/contrib/test_android/__init__.py
new file mode 100644
index 0000000000..9669578bb7
--- /dev/null
+++ b/tests/python/contrib/test_android/__init__.py
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Testing infrastructure for Android """
diff --git a/tests/python/contrib/test_android/infrastructure.py 
b/tests/python/contrib/test_android/infrastructure.py
new file mode 100644
index 0000000000..b78d0bb40e
--- /dev/null
+++ b/tests/python/contrib/test_android/infrastructure.py
@@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the Lic

(tvm) branch main updated: [RUNTIME][OPENCL] Bugfix for ciImage create with host ptr (#16768)

2024-03-23 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 77a7b01081 [RUNTIME][OPENCL] Bugfix for ciImage create with host ptr 
(#16768)
77a7b01081 is described below

commit 77a7b010817d2d8fbdf89223bb814e9c38f68365
Author: Siva 
AuthorDate: Sat Mar 23 20:23:36 2024 +0530

[RUNTIME][OPENCL] Bugfix for ciImage create with host ptr (#16768)

Added a couple more tests for host ptr data validation.
---
 src/runtime/opencl/opencl_device_api.cc  |  2 +-
 tests/cpp-runtime/opencl/opencl_nativeptr.cc | 40 +++-
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/src/runtime/opencl/opencl_device_api.cc 
b/src/runtime/opencl/opencl_device_api.cc
index 96ec8ed69f..ab553052bb 100644
--- a/src/runtime/opencl/opencl_device_api.cc
+++ b/src/runtime/opencl/opencl_device_api.cc
@@ -294,7 +294,7 @@ cl_mem OpenCLWorkspace::AllocTexture(Device dev, size_t width, size_t height,
   cl_channel_type cl_type = DTypeToOpenCLChannelType(type_hint);
   cl_image_format format = {CL_RGBA, cl_type};
   cl_image_desc descriptor = {CL_MEM_OBJECT_IMAGE2D, width, height, 0, 0, 0, 0, 0, 0};
-  cl_mem mptr = clCreateImage(this->contexts[platform], CL_MEM_CREATE_FLAGS, &format, &descriptor,
+  cl_mem mptr = clCreateImage(this->contexts[platform], CL_MEM_READ_WRITE, &format, &descriptor,
                               nullptr, &err_code);
   OPENCL_CHECK_ERROR(err_code);
   return mptr;
diff --git a/tests/cpp-runtime/opencl/opencl_nativeptr.cc 
b/tests/cpp-runtime/opencl/opencl_nativeptr.cc
index ebfb62e920..8f894c4bff 100644
--- a/tests/cpp-runtime/opencl/opencl_nativeptr.cc
+++ b/tests/cpp-runtime/opencl/opencl_nativeptr.cc
@@ -20,17 +20,55 @@
 #include 
 #include 
 
+#include <cmath>
+#include <random>
+
 #include "../src/runtime/opencl/opencl_common.h"
 
 using namespace tvm::runtime;
 using namespace tvm::runtime::cl;
 
 #if defined(OPENCL_ENABLE_HOST_PTR)
-TEST(OpenCLNDArray, native_ptr) {
+TEST(OpenCLNativePtr, access_memory) {
   OpenCLWorkspace* workspace = OpenCLWorkspace::Global();
 
   auto A = tvm::runtime::NDArray::Empty({128, 128}, {kDLFloat, 32, 1}, {kDLOpenCL, 0});
   void* nptr = workspace->GetNativePtr(A);
   memset(nptr, 0x0, 128 * 128 * 4);
 }
+
+TEST(OpenCLNatvePtr, data_loop) {
+  OpenCLWorkspace* workspace = OpenCLWorkspace::Global();
+
+  auto cl_arr = tvm::runtime::NDArray::Empty({1024}, {kDLFloat, 32, 1}, {kDLOpenCL, 0});
+  auto cpu_arr = tvm::runtime::NDArray::Empty({1024}, {kDLFloat, 32, 1}, {kDLCPU, 0});
+
+  std::random_device rdev;
+  std::mt19937 mt(rdev());
+  std::uniform_real_distribution<> random(-10.0, 10.0);
+
+  // Random initialize host ndarray
+  for (size_t i = 0; i < 1024; i++) {
+    static_cast<float*>(cpu_arr->data)[i] = random(mt);
+  }
+  // Do a roundtrip from cpu arr to opencl array and native ptr.
+  cpu_arr.CopyTo(cl_arr);
+  void* nptr = workspace->GetNativePtr(cl_arr);
+  for (size_t i = 0; i < 1024; ++i) {
+    ICHECK_LT(std::fabs(static_cast<float*>(cpu_arr->data)[i] - static_cast<float*>(nptr)[i]),
+              1e-5);
+  }
+
+  // Random initialize cl ndarray
+  for (size_t i = 0; i < 1024; i++) {
+    static_cast<float*>(nptr)[i] = random(mt);
+  }
+  // Do a roundtrip from native ptr to cl arr to cpu array.
+  cl_arr.CopyTo(cpu_arr);
+  for (size_t i = 0; i < 1024; ++i) {
+    ICHECK_LT(std::fabs(static_cast<float*>(cpu_arr->data)[i] - static_cast<float*>(nptr)[i]),
+              1e-5);
+  }
+}
+
 #endif

(tvm) branch main updated: [Relay][ONNX] Fix the attribute mode parse of operator Upsample (#16622)

2024-02-22 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 72ce7013e4 [Relay][ONNX] Fix the attribute mode parse of operator 
Upsample  (#16622)
72ce7013e4 is described below

commit 72ce7013e46c432dd1f8c3e1ec862a1e72b9798e
Author: Qingchao Shen 
AuthorDate: Fri Feb 23 15:54:50 2024 +0800

[Relay][ONNX] Fix the attribute mode parse of operator Upsample  (#16622)

* add the default value for the mode attribute of Upsample

* Update test_forward.py

* Update onnx.py

* Update test_forward.py
---
 python/tvm/relay/frontend/onnx.py  |  2 +-
 tests/python/frontend/onnx/test_forward.py | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/frontend/onnx.py 
b/python/tvm/relay/frontend/onnx.py
index 3023cd039c..b95afae1d1 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -2392,7 +2392,7 @@ class Upsample(OnnxOpConverter):
         if not isinstance(scales, _expr.Expr):
             assert scales[0] == 1.0 and scales[1] == 1.0
 
-        mode = attr.get("mode")
+        mode = attr.get("mode", b"nearest")
         if mode == b"nearest":
             method = "nearest_neighbor"
         elif mode == b"linear":
diff --git a/tests/python/frontend/onnx/test_forward.py 
b/tests/python/frontend/onnx/test_forward.py
index cfa30ad346..543aa7f518 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -1726,6 +1726,27 @@ def test_upsample_nearest(target, dev):
     verify_with_ort_with_inputs(model, [in_array], [out_shape], opset=7, target=target, dev=dev)
 
 
+@tvm.testing.parametrize_targets
+def test_upsample_nearest_default(target, dev):
+    """test_upsample_nearest_default"""
+    scale = 2
+    in_shape = (1, 1, 3, 3)
+    out_shape = (1, 1, 3 * scale, 3 * scale)
+    y = helper.make_node("Upsample", ["in"], ["out"], scales=[1.0, 1.0, 2.0, 2.0])
+
+    in_array = np.random.uniform(size=in_shape).astype(np.float32)
+
+    graph = helper.make_graph(
+        [y],
+        "upsample_nearest_test",
+        inputs=[helper.make_tensor_value_info("in", TensorProto.FLOAT, list(in_shape))],
+        outputs=[helper.make_tensor_value_info("out", TensorProto.FLOAT, list(out_shape))],
+    )
+
+    model = helper.make_model(graph, producer_name="upsample_nearest_test")
+    verify_with_ort_with_inputs(model, [in_array], [out_shape], opset=7, target=target, dev=dev)
+
+
 @tvm.testing.parametrize_targets
 def test_upsample3d_nearest(target, dev):
 """test_upsample3d_nearest"""
@@ -5708,6 +5729,7 @@ unsupported_onnx_tests = [
 "test_unique_sorted_with_axis_3d",
 "test_unique_sorted_with_negative_axis",
 "test_upsample_nearest",
+"test_upsample_nearest_default",
 ]
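
The Upsample fix relies on the second argument of `dict.get`: when the node carries no `mode` attribute, the lookup falls back to `b"nearest"` instead of returning `None`. A minimal sketch of that lookup, where `resolve_upsample_method` is a hypothetical helper and the `ValueError` is an assumption (the real converter may raise a different error type):

```python
def resolve_upsample_method(attr):
    """Map an ONNX-style `mode` attribute (bytes) to a resize method name."""
    mode = attr.get("mode", b"nearest")  # default applies when the key is absent
    if mode == b"nearest":
        return "nearest_neighbor"
    if mode == b"linear":
        return "linear"
    # Hypothetical error type; the real converter may raise something else.
    raise ValueError(f"unsupported mode: {mode!r}")


print(resolve_upsample_method({}))
print(resolve_upsample_method({"mode": b"linear"}))
```

Without the default, an empty attribute dictionary yields `None`, which matches neither branch and ends in the error path.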

(tvm) branch main updated: [Relay][ONNX] Fix the Resize operator in ONNX frontend (#16626)

2024-02-22 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new aa55528714 [Relay][ONNX] Fix the Resize operator in ONNX frontend 
(#16626)
aa55528714 is described below

commit aa5552871415409d2696bc5864535c910ee12018
Author: Qingchao Shen 
AuthorDate: Fri Feb 23 15:54:04 2024 +0800

[Relay][ONNX] Fix the Resize operator in ONNX frontend (#16626)

* Update onnx.py

* Update test_forward.py
---
 python/tvm/relay/frontend/onnx.py  | 4 ++--
 tests/python/frontend/onnx/test_forward.py | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/onnx.py 
b/python/tvm/relay/frontend/onnx.py
index ddd0d34c5c..3023cd039c 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -3932,7 +3932,7 @@ class Resize(OnnxOpConverter):
 
     @classmethod
     def _impl_v10(cls, inputs, attr, params):
-        mode = attr.get("mode").decode("ascii")
+        mode = attr.get("mode", b"nearest").decode("ascii")
         if mode == "nearest":
             method = "nearest_neighbor"
         elif mode == "linear":
@@ -4007,7 +4007,7 @@ class Resize(OnnxOpConverter):
         if roi is not None and infer_shape(roi)[0] == 0:
             roi = None
         ndims = len(infer_shape(inputs[0]))
-        mode = attr.get("mode").decode("ascii")
+        mode = attr.get("mode", b"nearest").decode("ascii")
         if mode == "nearest":
             method = "nearest_neighbor"
         elif mode == "linear":
diff --git a/tests/python/frontend/onnx/test_forward.py 
b/tests/python/frontend/onnx/test_forward.py
index 51748462d0..cfa30ad346 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -4503,6 +4503,7 @@ def test_resize(target, dev):
 # scales are specified instead of sizes
 verify([1, 16] + [32] * ndim, [], [1, 1] + [0.5] * ndim, method, coord_trans)
 verify([1, 16] + [32] * ndim, [], [1, 1] + [2] * ndim, method, coord_trans)
+verify([1, 16] + [32] * ndim, [], [1, 1] + [2] * ndim, None, coord_trans)
 
 method = "linear"
 # upsampling
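
In the Resize converter the old code chained `.decode("ascii")` directly onto `attr.get("mode")`, so a missing attribute failed before any mode comparison ran. A small sketch of both behaviours, using a plain dictionary as a hypothetical stand-in for the node attributes:

```python
attr = {}  # hypothetical node attributes with no "mode" entry

# Before the fix: dict.get returns None, and None has no .decode method.
try:
    attr.get("mode").decode("ascii")
    old_error = None
except AttributeError as err:
    old_error = type(err).__name__

# After the fix: the bytes default decodes to the string "nearest".
new_mode = attr.get("mode", b"nearest").decode("ascii")

print(old_error, new_mode)
```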

(tvm) branch main updated: [OpenCL][CMake] Fix OpenCL tests compilation (#16394)

2024-01-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new fe9814c73e [OpenCL][CMake] Fix OpenCL tests compilation (#16394)
fe9814c73e is described below

commit fe9814c73e24df3ecd031b216492fb555a1ab95a
Author: Egor Churaev 
AuthorDate: Mon Jan 15 09:46:47 2024 +0300

[OpenCL][CMake] Fix OpenCL tests compilation (#16394)

[OpenCL] Fix OpenCL tests compilation

Found a problem: when TVM is built with OpenCL tests from inside a different 
CMake project (not TVM), `CMAKE_SOURCE_DIR` returns the path to the 
`CMakeLists.txt` of the enclosing project (not TVM's), and the build fails 
with the following error: `No SOURCES given to target: opencl-cpptest`.

To be consistent with the code style in `OpenCL.cmake`, I removed the usage of 
the `CMAKE_SOURCE_DIR` variable. This also fixes the issue when TVM's CMake is 
invoked from a directory that belongs to another CMake project.
---
 cmake/modules/OpenCL.cmake | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake
index 2dc1fc18f3..ddcd1e4190 100644
--- a/cmake/modules/OpenCL.cmake
+++ b/cmake/modules/OpenCL.cmake
@@ -81,7 +81,7 @@ if(USE_OPENCL)
 if(Build_OpenCL_GTests)
 message(STATUS "Building OpenCL-Gtests")
 tvm_file_glob(GLOB_RECURSE OPENCL_TEST_SRCS
-  "${CMAKE_SOURCE_DIR}/tests/cpp-runtime/opencl/*.cc"
+  "tests/cpp-runtime/opencl/*.cc"
 )
 add_executable(opencl-cpptest ${OPENCL_TEST_SRCS})
 target_link_libraries(opencl-cpptest PRIVATE gtest_main tvm_runtime)
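
The failure described in this commit comes down to which directory the glob pattern is anchored at. A rough model of it in Python, assuming that the relative pattern ends up resolved against TVM's own source directory while `CMAKE_SOURCE_DIR` points at the top-level project. The `3rdparty/tvm` location and the file contents are hypothetical:

```python
import tempfile
from pathlib import Path

# Build a throwaway tree: an outer project that embeds a TVM-like checkout.
outer = Path(tempfile.mkdtemp())
tvm_root = outer / "3rdparty" / "tvm"  # hypothetical location
test_dir = tvm_root / "tests" / "cpp-runtime" / "opencl"
test_dir.mkdir(parents=True)
(test_dir / "opencl_nativeptr.cc").write_text("// placeholder\n")

pattern = "tests/cpp-runtime/opencl/*.cc"

# Anchored at the top-level project (the role CMAKE_SOURCE_DIR plays): no matches.
from_top_level = sorted(p.name for p in outer.glob(pattern))

# Resolved relative to the TVM directory itself: the test source is found.
from_tvm_dir = sorted(p.name for p in tvm_root.glob(pattern))

print(from_top_level, from_tvm_dir)
```

An empty source list in the first case corresponds to the `No SOURCES given to target` error quoted in the commit message.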

(tvm) branch echuraev/fix_cmake_issue_for_opencl_tests created (now 4eb9d880fd)

2024-01-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch echuraev/fix_cmake_issue_for_opencl_tests
in repository https://gitbox.apache.org/repos/asf/tvm.git


  at 4eb9d880fd [OpenCL] Fix OpenCL tests compilation

This branch includes the following new commits:

 new 4eb9d880fd [OpenCL] Fix OpenCL tests compilation

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

(tvm) 01/01: [OpenCL] Fix OpenCL tests compilation

2024-01-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch echuraev/fix_cmake_issue_for_opencl_tests
in repository https://gitbox.apache.org/repos/asf/tvm.git

commit 4eb9d880fd049730f452a651419bf73b7abdab1b
Author: Egor Churaev 
AuthorDate: Fri Jan 12 14:56:10 2024 +0300

[OpenCL] Fix OpenCL tests compilation

Found a problem: when TVM is built with OpenCL tests from inside a different 
CMake project (not TVM), `CMAKE_SOURCE_DIR` returns the path to the 
`CMakeLists.txt` of the enclosing project (not TVM's), and the build fails 
with the following error: `No SOURCES given to target: opencl-cpptest`.

To be consistent with the code style in `OpenCL.cmake`, I removed the usage of 
the `CMAKE_SOURCE_DIR` variable. This also fixes the issue when TVM's CMake is 
invoked from a directory that belongs to another CMake project.
---
 cmake/modules/OpenCL.cmake | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake
index 2dc1fc18f3..ddcd1e4190 100644
--- a/cmake/modules/OpenCL.cmake
+++ b/cmake/modules/OpenCL.cmake
@@ -81,7 +81,7 @@ if(USE_OPENCL)
 if(Build_OpenCL_GTests)
 message(STATUS "Building OpenCL-Gtests")
 tvm_file_glob(GLOB_RECURSE OPENCL_TEST_SRCS
-  "${CMAKE_SOURCE_DIR}/tests/cpp-runtime/opencl/*.cc"
+  "tests/cpp-runtime/opencl/*.cc"
 )
 add_executable(opencl-cpptest ${OPENCL_TEST_SRCS})
 target_link_libraries(opencl-cpptest PRIVATE gtest_main tvm_runtime)

(tvm) branch main updated: [TOPI][ADRENO] Add conv2d transpose nchw texture schedule (#15786)

2023-11-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 015da7c185 [TOPI][ADRENO] Add conv2d transpose nchw texture schedule 
(#15786)
015da7c185 is described below

commit 015da7c185199add4af943970b6ee3d0a0661ec4
Author: krishnaraj36 
AuthorDate: Wed Nov 15 11:37:40 2023 +0530

[TOPI][ADRENO] Add conv2d transpose nchw texture schedule (#15786)

* [TOPI][ADRENO] Add conv2d transpose nchw texture schedule

Added the conv2d transpose strategy for the Adreno target and enabled the
optimized schedule.

* Fix the whitespace lint error

* Fix lint errors

* Fix whitespace lint error

* Removed unused variables

* Add more conv2dTranspose testcases

* empty update

empty update for retrigger ci

* Update test_conv2d_transpose_nchw_texture.py

* Added more testcase to check memory scopes

* Device specific alter_op_layout for conv2d_transpose

* Fix in virtual device setup and added test case with scope check

* Add the comment conv2d algo

* Add the comment conv2d algo

* Removed fp16 test case from texture

It fails on a few GPU devices.

* remove opencl config change for mainline conflict

* Add the test case for 3 channel input which runs with the cuda schedule

* Fix in op strategy for out channel 3

* Comment in test case for memory scope

---------

Co-authored-by: Siva 
---
 python/tvm/relay/op/nn/_nn.py  |   6 +
 python/tvm/relay/op/strategy/adreno.py |  52 +++
 python/tvm/topi/adreno/__init__.py |   2 +
 .../tvm/topi/adreno/conv2d_transpose_alter_op.py   | 121 ++
 python/tvm/topi/adreno/conv2d_transpose_nchw.py| 412 +
 python/tvm/topi/adreno/utils.py|  23 ++
 python/tvm/topi/nn/conv2d.py   |  23 ++
 src/relay/transforms/annotate_texture_storage.cc   |   4 +
 .../test_conv2d_transpose_nchw_texture.py  | 325 
 .../relay/opencl_texture/utils/adreno_utils.py |   5 +-
 10 files changed, 972 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/op/nn/_nn.py b/python/tvm/relay/op/nn/_nn.py
index c68685f0ae..6acaf43fe7 100644
--- a/python/tvm/relay/op/nn/_nn.py
+++ b/python/tvm/relay/op/nn/_nn.py
@@ -335,6 +335,12 @@ def legalize_conv2d_transpose(attrs, inputs, types):
 return topi.nn.conv2d_transpose_legalize(attrs, inputs, types)
 
 
+@reg.register_alter_op_layout("nn.conv2d_transpose")
+def alter_op_layout_conv2d_transpose(attrs, inputs, tinfos, out_type):
+"""Alternate the layout of conv2d_transpose"""
+return topi.nn.conv2d_transpose_alter_layout(attrs, inputs, tinfos, 
out_type)
+
+
 @reg.register_convert_op_layout("nn.conv2d_transpose")
 def convert_conv2d_transpose(attrs, inputs, tinfos, desired_layouts):
 """Convert Layout pass registration for conv2d_transpose op.
diff --git a/python/tvm/relay/op/strategy/adreno.py 
b/python/tvm/relay/op/strategy/adreno.py
index c180eeec74..bacace9ad4 100644
--- a/python/tvm/relay/op/strategy/adreno.py
+++ b/python/tvm/relay/op/strategy/adreno.py
@@ -215,6 +215,58 @@ def 
conv2d_winograd_without_weight_transform_strategy_adreno(attrs, inputs, out_
 return strategy
 
 
+@conv2d_transpose_strategy.register("adreno")
+def conv2d_transpose_strategy_adreno(attrs, inputs, out_type, target):
+"""conv2d_transpose adreno strategy"""
+strategy = _op.OpStrategy()
+_, kernel = inputs
+dilation = attrs.get_int_tuple("dilation")
+groups = attrs.groups
+data_layout = attrs.data_layout
+kernel_layout = attrs.kernel_layout
+assert dilation == (1, 1), "not support dilate now"
+
+if (groups == 1) and (
+(data_layout == "NCHW" and kernel_layout == "IOHW")
+or (data_layout == "NCHW4c" and kernel_layout == "IOHW4o")
+or (data_layout == "NCHW" and kernel_layout == "IOHW4o")
+):
+if len(kernel.shape) == 4:
+_, oc, _, _ = get_const_tuple(kernel.shape)
+else:
+_, oc, _, _, _ = get_const_tuple(kernel.shape)
+# We cannot use textures for case than number of channels is less than 4.
+# So, we use compute functions from cuda.
+if len(kernel.shape) == 4 and oc < 4:
+strategy.add_implementation(
+wrap_compute_conv2d_transpose(topi.cuda.conv2d_transpose_nchw),
+wrap_topi_schedule(topi.cuda.schedule_conv2d_transpose_nchw),
+name="conv2d_transpose_nchw.cuda&qu

[tvm] branch main updated: [CI][BUGFIX] Better to pass the build folder (#15965)

2023-10-23 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 6a8cb323a0 [CI][BUGFIX] Better to pass the build folder (#15965)
6a8cb323a0 is described below

commit 6a8cb323a0a35fa1fd6b185b268ca2e9efb8d712
Author: Siva 
AuthorDate: Mon Oct 23 12:22:43 2023 +0530

[CI][BUGFIX] Better to pass the build folder (#15965)

build_dir was not passed to the OpenCL C++ unit test script; pass it to allow custom build folders.
---
 tests/scripts/ci.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/scripts/ci.py b/tests/scripts/ci.py
index 1c9981bcb2..1add6bcde3 100755
--- a/tests/scripts/ci.py
+++ b/tests/scripts/ci.py
@@ -593,7 +593,7 @@ generated = [
 "run unit tests",
 [
 "./tests/scripts/task_java_unittest.sh",
-"./tests/scripts/task_opencl_cpp_unittest.sh",
+"./tests/scripts/task_opencl_cpp_unittest.sh {build_dir}",
 "./tests/scripts/task_python_unittest_gpuonly.sh",
 "./tests/scripts/task_python_integration_gpuonly.sh",
 ],
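
The `{build_dir}` token added above is a placeholder inside the script string. A minimal sketch of how such a template could be filled in, assuming a `str.format`-style substitution; the `render` helper and the `build-gpu` folder name are hypothetical and not taken from `ci.py`:

```python
# Hypothetical script templates; "{build_dir}" is a placeholder to fill in.
scripts = [
    "./tests/scripts/task_java_unittest.sh",
    "./tests/scripts/task_opencl_cpp_unittest.sh {build_dir}",
]


def render(scripts, build_dir):
    """Substitute the build folder into every script command line."""
    return [s.format(build_dir=build_dir) for s in scripts]


commands = render(scripts, "build-gpu")
print(commands[1])
```

Scripts without the placeholder pass through unchanged, so only the commands that need the build folder have to mention it.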

[tvm] branch main updated: [Fix] Remove duplicated words from comments, NFC (#15873)

2023-10-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 958c27123a [Fix] Remove duplicated words from comments, NFC (#15873)
958c27123a is described below

commit 958c27123a45a9629e57cee20dbca28263c836bd
Author: Krzysztof Parzyszek 
AuthorDate: Thu Oct 5 00:37:35 2023 -0500

[Fix] Remove duplicated words from comments, NFC (#15873)

Removed instances of accidentally repeated words from comments. There
are cases where duplicated words appear legitimately, those cases remain
unmodified.
---
 docs/Doxyfile| 2 +-
 docs/conf.py | 4 ++--
 include/tvm/runtime/logging.h| 2 +-
 include/tvm/runtime/ndarray.h| 2 +-
 include/tvm/runtime/packed_func.h| 2 +-
 include/tvm/tir/expr.h   | 2 +-
 include/tvm/tir/stmt.h   | 2 +-
 python/tvm/relay/op/contrib/clml.py  | 2 +-
 python/tvm/relay/transform/memory_plan.py| 2 +-
 python/tvm/runtime/ndarray.py| 2 +-
 python/tvm/te/hybrid/__init__.py | 2 +-
 python/tvm/te/schedule.py| 2 +-
 python/tvm/topi/arm_cpu/qnn.py   | 2 +-
 src/arith/const_fold.h   | 4 ++--
 src/arith/product_normal_form.h  | 4 ++--
 src/relay/collage/partition_rule.h   | 2 +-
 src/relay/transforms/combine_parallel_op_batch.h | 2 +-
 src/runtime/c_runtime_api.cc | 2 +-
 src/runtime/crt/aot_executor/aot_executor.c  | 2 +-
 src/target/source/ptx.h  | 2 +-
 src/tir/analysis/control_flow_graph.h| 2 +-
 src/tir/schedule/error.h | 2 +-
 src/tir/transforms/unroll_loop.cc| 2 +-
 tests/python/contrib/test_arm_compute_lib/infrastructure.py  | 2 +-
 tests/python/contrib/test_hexagon/conv2d/test_conv2d_conv2d.md   | 2 +-
 tests/python/contrib/test_hexagon/test_benchmark_elemwise_add.py | 2 +-
 tests/python/relay/test_pass_plan_devices.py | 2 +-
 tests/python/unittest/test_tir_transform_remove_no_op.py | 2 +-
 web/src/environment.ts   | 2 +-
 29 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/docs/Doxyfile b/docs/Doxyfile
index 8615f5b2ae..d1ea82c842 100644
--- a/docs/Doxyfile
+++ b/docs/Doxyfile
@@ -306,7 +306,7 @@ EXTENSION_MAPPING  =
 
 # When enabled doxygen tries to link words that correspond to documented
 # classes, or namespaces to their corresponding documentation. Such a link can
-# be prevented in individual cases by by putting a % sign in front of the word
+# be prevented in individual cases by putting a % sign in front of the word
 # or globally by setting AUTOLINK_SUPPORT to NO.
 # The default value is: YES.
 
diff --git a/docs/conf.py b/docs/conf.py
index 330ae919a6..b1ecb37c97 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -244,7 +244,7 @@ INSTALL_TVM_CUDA_DEV = f"""\
 # Installs the latest dev build of TVM from PyPI, with CUDA enabled. To use this,
 # you must request a Google Colab instance with a GPU by going to Runtime ->
 # Change runtime type -> Hardware accelerator -> GPU. If you wish to build from
-# source, see see https://tvm.apache.org/docs/install/from_source.html
+# source, see https://tvm.apache.org/docs/install/from_source.html
 pip install tlcpack-nightly-cu113 --pre -f https://tlcpack.ai/wheels"""
 
 INSTALL_TVM_CUDA_FIXED = f"""\
@@ -252,7 +252,7 @@ INSTALL_TVM_CUDA_FIXED = f"""\
 # Installs TVM version {version} from PyPI, with CUDA enabled. To use this,
 # you must request a Google Colab instance with a GPU by going to Runtime ->
 # Change runtime type -> Hardware accelerator -> GPU. If you wish to build from
-# source, see see https://tvm.apache.org/docs/install/from_source.html
+# source, see https://tvm.apache.org/docs/install/from_source.html
 pip install apache-tvm-cu113=={version} -f https://tlcpack.ai/wheels"""
 
 
diff --git a/include/tvm/runtime/logging.h b/include/tvm/runtime/logging.h
index 16a176a801..45c390df1d 100644
--- a/include/tvm/runtime/logging.h
+++ b/include/tvm/runtime/logging.h
@@ -113,7 +113,7 @@
  * in a function, or 'continue' or 'break' in a loop)
  * The default behavior when quit_on_assertion is false,

[tvm] branch main updated: [Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829)

2023-09-29 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 28908998e0 [Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829)
28908998e0 is described below

commit 28908998e0c55025a89e8e2bd26a3fe3e6c84356
Author: Qingchao Shen 
AuthorDate: Fri Sep 29 15:54:23 2023 +0800

[Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829)

* fix bug in gru and simpleRNN about go_backwards

* Update test_forward.py

* Update keras.py
---
 python/tvm/relay/frontend/keras.py  |  4 
 tests/python/frontend/keras/test_forward.py | 12 
 2 files changed, 16 insertions(+)

diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py
index 9e09cb400a..6c82ebb427 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -1062,6 +1062,8 @@ def _convert_simple_rnn(
 in_bias = etab.new_const(weightList[2])
 assert len(in_data.type_annotation.shape) == 3
 timeDim = in_data.type_annotation.shape[1].value
+if keras_layer.go_backwards:
+in_data = _op.reverse(in_data, axis=1)
 in_data_split = _op.split(in_data, indices_or_sections=timeDim, axis=1)
 for i in range(len(in_data_split)):
 in_data_split_i = _op.nn.batch_flatten(in_data_split[i])
@@ -1090,6 +1092,8 @@ def _convert_gru(
 recurrent_weight = etab.new_const(weightList[1].transpose([1, 0]))
 if keras_layer.use_bias:
 in_bias = etab.new_const(weightList[2])
+if keras_layer.go_backwards:
+in_data = _op.reverse(in_data, axis=1)
 units = list(weightList[0].shape)[1]
 assert units > 0, "The value of units must be a positive integer"
 in_data = _op.nn.batch_flatten(in_data)
diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py
index ba3880e186..8c5b578060 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -568,12 +568,23 @@ class TestKeras:
 keras_mod.layers.SimpleRNN(
 units=16, return_state=False, activation="tanh", use_bias=False
 ),
+keras_mod.layers.SimpleRNN(
+units=16, return_state=False, activation="tanh", go_backwards=True
+),
+keras_mod.layers.GRU(
+units=16,
+return_state=False,
+recurrent_activation="sigmoid",
+activation="tanh",
+reset_after=False,
+),
 keras_mod.layers.GRU(
 units=16,
 return_state=False,
 recurrent_activation="sigmoid",
 activation="tanh",
 reset_after=False,
+use_bias=False,
 ),
 keras_mod.layers.GRU(
 units=16,
@@ -582,6 +593,7 @@ class TestKeras:
 activation="tanh",
 reset_after=False,
 use_bias=False,
+go_backwards=True,
 ),
 ]
 for rnn_func in rnn_funcs:
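
The effect of the `go_backwards` handling added above can be illustrated with a minimal, framework-free sketch. This is not the TVM converter: `simple_rnn`, its scalar weights `w_in` and `w_rec`, and the zero initial state are assumptions made only for illustration. It shows that a backward-running recurrence equals a forward recurrence over the time-reversed input, which is what reversing along the time axis before the unrolled loop achieves.

```python
import math


def simple_rnn(sequence, w_in, w_rec, go_backwards=False):
    """Scalar recurrence h_t = tanh(w_in * x_t + w_rec * h_{t-1}), with h_0 = 0.

    Hypothetical helper: it stands in for the unrolled per-time-step loop,
    using plain floats instead of Relay expressions.
    """
    if go_backwards:
        # Same idea as reversing the input along the time axis (axis=1).
        sequence = sequence[::-1]
    state = 0.0
    for x_t in sequence:
        state = math.tanh(w_in * x_t + w_rec * state)
    return state


inputs = [0.1, 0.5, -0.3]
forward = simple_rnn(inputs, w_in=0.8, w_rec=0.4)
backward = simple_rnn(inputs, w_in=0.8, w_rec=0.4, go_backwards=True)
reversed_forward = simple_rnn(inputs[::-1], w_in=0.8, w_rec=0.4)
```

Here `backward` and `reversed_forward` are identical, while `forward` differs, so ignoring the attribute changes the result for any order-sensitive input.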

[tvm] branch main updated: [ADRENO] Minor changes for Adreno docs and help scripts (#15830)

2023-09-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 9d8e6fda50 [ADRENO] Minor changes for Adreno docs and help scripts (#15830)
9d8e6fda50 is described below

commit 9d8e6fda50bce14bf597de1f87711230e6001e4e
Author: Siva 
AuthorDate: Thu Sep 28 15:38:30 2023 +0530

[ADRENO] Minor changes for Adreno docs and help scripts (#15830)

[ADRENO] Minor changes for Adreno docs and help scripts

NCHW is the mandatory layout for CLML offload. Updated the docs.
CI scripts will always keep OpenCL enabled as a fallback.
Enable configurable device bind ports. This helps in multi-user environments.
---
 docs/how_to/deploy/adreno.rst |  2 +-
 tests/scripts/setup-adreno-env.sh | 18 +-
 tests/scripts/task_config_build_adreno.sh |  2 ++
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst
index ed016a3ff7..f0b8c6f757 100644
--- a/docs/how_to/deploy/adreno.rst
+++ b/docs/how_to/deploy/adreno.rst
@@ -432,7 +432,7 @@ as the OpenCL path is fallback option for any operator didn't go through OpenCLM
 
python3 -m tvm.driver.tvmc compile \
--cross-compiler ${ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang \
-   --target="opencl, clml, llvm" --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \
+   --target="opencl, clml, llvm" --desired-layout NCHW --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \
--tuning-records keras-resnet50.log -o keras-resnet50.tar resnet50.h5
 
 On successful compilation, above command produce ``keras-resnet50.tar``.
diff --git a/tests/scripts/setup-adreno-env.sh b/tests/scripts/setup-adreno-env.sh
index 55a92c5f61..15c124a0f0 100755
--- a/tests/scripts/setup-adreno-env.sh
+++ b/tests/scripts/setup-adreno-env.sh
@@ -20,12 +20,13 @@
 ENVIRONMENT=""
 RPC_PORT=""
 ADB_SERIAL=""
+LISTEN_PORT=5000
 
 function usage() {
 echo "Helper script to setup the environment for Tracker, RPC Device and for application"
 echo "Usage (Help) : source setup-adreno-env.sh -h"
 echo "Usage (Tracker): source setup-adreno-env.sh -e tracker -p "
-echo "Usage (Device): source setup-adreno-env.sh -e device -p  -d "
+echo "Usage (Device): source setup-adreno-env.sh -e device -p  -d  [-l ]"
 echo "Usage (Query): source setup-adreno-env.sh -e query -p "
 }
 
@@ -46,6 +47,11 @@ while [[ $# -gt 0 ]]; do
   shift # past argument
   shift # past value
   ;;
+-l|--listen-port)
+  LISTEN_PORT="$2"
+  shift # past argument
+  shift # past value
+  ;;
 -h|--help)
   usage
   return 0
@@ -62,6 +68,7 @@ done
 echo "ENVIRONMENT   = ${ENVIRONMENT}"
 echo "RPC_PORT  = ${RPC_PORT}"
 echo "ADB_SERIAL= ${ADB_SERIAL}"
+echo "DEVICE LISTEN PORT= ${LISTEN_PORT}"
 
 
 function def_environment() {
@@ -100,10 +107,11 @@ case ${ENVIRONMENT} in
 fi
 
 adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT}
-adb forward tcp:5000 tcp:5000
-adb forward tcp:5001 tcp:5001
-adb forward tcp:5002 tcp:5002
-adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}"
+adb forward tcp:${LISTEN_PORT} tcp:${LISTEN_PORT}
+adb forward tcp:$((LISTEN_PORT + 1)) tcp:$((LISTEN_PORT + 1))
+adb forward tcp:$((LISTEN_PORT + 2)) tcp:$((LISTEN_PORT + 2))
+adb forward tcp:$((LISTEN_PORT + 3)) tcp:$((LISTEN_PORT + 3))
+adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=${LISTEN_PORT} --port-end=$((LISTEN_PORT + 10)) --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}"
 ;;
 
   "query")
diff --git a/tests/scripts/task_config_build_adreno.sh b/tests/scripts/task_config_build_adreno.sh
index 62e6ffecbc..1b6750f165 100755
--- a/tests/scripts/task_config_build_adreno.sh
+++ b/tests/scripts/task_config_build_adreno.sh
@@ -25,6 +25,8 @@ cp ../cmake/config.cmake .
 
 if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then
 echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake
+else
+echo set\(USE_OPENCL ON\) >> config.cmake
 fi
 echo set\(USE_RPC ON\) >> config.cmake
 echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
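
The port arithmetic introduced in `setup-adreno-env.sh` above can be sketched as follows. The function name `adreno_rpc_ports` and its parameters are hypothetical; the counts (four forwarded ports, a server range ending ten ports above the listen port) are read off the diff, and the sketch only mirrors that arithmetic without invoking `adb`.

```python
def adreno_rpc_ports(listen_port=5000, forwarded=4, server_span=10):
    """Return the ports forwarded to the device and the RPC server port range.

    Hypothetical helper: mirrors LISTEN_PORT .. LISTEN_PORT + 3 for forwarding
    and --port / --port-end for the tvm_rpc server as seen in the diff.
    """
    forwards = [listen_port + offset for offset in range(forwarded)]
    server_range = (listen_port, listen_port + server_span)
    return forwards, server_range


default_forwards, default_range = adreno_rpc_ports()
second_user_forwards, second_user_range = adreno_rpc_ports(listen_port=6000)
```

With distinct listen ports, two users on the same host get disjoint forwarded ports, which appears to be the point of making the value configurable.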

[tvm] branch main updated: fix _convert_simple_rnn (#15723)

2023-09-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 7fd4704003 fix _convert_simple_rnn (#15723)
7fd4704003 is described below

commit 7fd4704003dec853abbfc15a47a0d07d941b7a8a
Author: Haoyang 
AuthorDate: Wed Sep 13 12:53:41 2023 +0800

fix _convert_simple_rnn (#15723)

* fix _convert_simple_rnn

* fix _convert_simple_rnn

* fix errors in the last pr
---
 python/tvm/relay/frontend/keras.py  | 25 ++---
 tests/python/frontend/keras/test_forward.py | 11 +++
 2 files changed, 25 insertions(+), 11 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py
index 205b2be490..9e09cb400a 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -1052,23 +1052,26 @@ def _convert_simple_rnn(
 inexpr = [inexpr, prev_op]
 in_data = inexpr[0]
 prev_op = inexpr[1]
+prev_op = _op.nn.batch_flatten(prev_op)
 weightList = keras_layer.get_weights()
 kernel_weight = etab.new_const(weightList[0].transpose([1, 0]))
 recurrent_weight = etab.new_const(weightList[1].transpose([1, 0]))
-if keras_layer.use_bias:
-in_bias = etab.new_const(weightList[2])
 units = list(weightList[0].shape)[1]
 assert units > 0, "The value of units must be a positive integer"
-in_data = _op.nn.batch_flatten(in_data)
-ixh = _op.nn.dense(in_data, kernel_weight, units=units)
 if keras_layer.use_bias:
-ixh = _op.nn.bias_add(ixh, bias=in_bias)
-prev_op = _op.nn.batch_flatten(prev_op)
-ixh2 = _op.nn.dense(prev_op, recurrent_weight, units=units)
-output = ixh + ixh2
-output = _convert_activation(output, keras_layer, etab, data_layout)
-out_shape = tuple(dim if dim else 1 for dim in _as_list(keras_layer.output_shape)[0])
-output = _op.reshape(output, newshape=out_shape)
+in_bias = etab.new_const(weightList[2])
+assert len(in_data.type_annotation.shape) == 3
+timeDim = in_data.type_annotation.shape[1].value
+in_data_split = _op.split(in_data, indices_or_sections=timeDim, axis=1)
+for i in range(len(in_data_split)):
+in_data_split_i = _op.nn.batch_flatten(in_data_split[i])
+ixh = _op.nn.dense(in_data_split_i, kernel_weight, units=units)
+if keras_layer.use_bias:
+ixh = _op.nn.bias_add(ixh, bias=in_bias)
+ixh2 = _op.nn.dense(prev_op, recurrent_weight, units=units)
+output = ixh + ixh2
+output = _convert_activation(output, keras_layer, etab, data_layout)
+prev_op = output
 return [output, output]
 
 
diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py
index 80460f6063..9d33b15a91 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -825,6 +825,16 @@ class TestKeras:
 )
 verify_keras_frontend(dense_model, need_transpose=False)
 
+def test_simplernn_with_infertype(self, keras_mod):
+"""This test case is from https://github.com/apache/tvm/issues/14868"""
+input_shape = (2, 2, 2)
+x = keras_mod.layers.Input(shape=input_shape[1:], dtype="float32")
+layer = keras_mod.layers.SimpleRNN(units=4)
+y = layer(x)
+model = keras_mod.models.Model(x, y)
+mod, _ = relay.frontend.from_keras(model, {model.input_names[0]: input_shape})
+relay.transform.InferType()(mod)
+
 
 if __name__ == "__main__":
 for k in [keras, tf_keras]:
@@ -867,3 +877,4 @@ if __name__ == "__main__":
 sut.test_forward_repeat_vector(keras_mod=k)
 sut.test_forward_l2_normalize(keras_mod=k)
 sut.test_forward_time_distributed(keras_mod=k)
+sut.test_simplernn_with_infertype(keras_mod=k)

[tvm] branch main updated: [Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715)

2023-09-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d8136fbaa1 [Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715)
d8136fbaa1 is described below

commit d8136fbaa158da4fb8759b88eb6eda35e55e9e53
Author: Qingchao Shen 
AuthorDate: Tue Sep 12 14:58:23 2023 +0800

[Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715)

* [Relay][BugFix] fix the wrong implementation of Threshold in OneFlow

* Update test_forward.py

* Update oneflow.py

* Update test_forward.py

* Update oneflow.py

* Update test_forward.py

add version checking
---
 python/tvm/relay/frontend/oneflow.py  | 15 +--
 tests/python/frontend/oneflow/test_forward.py | 15 +++
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/python/tvm/relay/frontend/oneflow.py b/python/tvm/relay/frontend/oneflow.py
index 7a713e5e15..ddfa043caf 100644
--- a/python/tvm/relay/frontend/oneflow.py
+++ b/python/tvm/relay/frontend/oneflow.py
@@ -1025,15 +1025,17 @@ class Dropout(OneFlowOpConverter):
 return out
 
 
-class ThresholdedRelu(OneFlowOpConverter):
-"""Operator converter for ThresholdedRelu."""
+class Threshold(OneFlowOpConverter):
+"""Operator converter for Threshold."""
 
 @classmethod
 def _impl_v1(cls, inputs, attrs, params):
-alpha = float(attrs.get("alpha", 1.0))
-alpha_tensor = _op.full_like(inputs[0], fill_value=_expr.const(alpha))
-mask = _op.greater(inputs[0], alpha_tensor).astype("float32")
-return inputs[0] * mask
+threshold = float(attrs.get("threshold_val", 1.0))
+threshold_tensor = _op.full_like(inputs[0], fill_value=_expr.const(threshold))
+value = float(attrs.get("value"))
+value_tensor = _op.full_like(inputs[0], fill_value=_expr.const(value))
+mask = _op.greater(inputs[0], threshold_tensor)
+return _op.where(mask, inputs[0], value_tensor)
 
 
 class Elu(OneFlowOpConverter):
@@ -1425,6 +1427,7 @@ def get_convert_map():
 "relu": Renamer("relu"),
 "leaky_relu": Renamer("leaky_relu"),
 "prelu": PReLU.get_converter(),
+"threshold": Threshold.get_converter(),
 "selu": Selu.get_converter(),
 "silu": Silu.get_converter(),
 "gelu": Gelu.get_converter(),
diff --git a/tests/python/frontend/oneflow/test_forward.py b/tests/python/frontend/oneflow/test_forward.py
index 17583b3c25..7ddc347e86 100644
--- a/tests/python/frontend/oneflow/test_forward.py
+++ b/tests/python/frontend/oneflow/test_forward.py
@@ -24,6 +24,7 @@ import tvm
 import tvm.testing
 import tvm.topi.testing
 from tvm import relay
+from packaging import version as package_version
 
 MODEL_HOME = "test_model"
 
@@ -702,6 +703,15 @@ def test_activation():
 x = x.softmax(dim=-1)
 return x
 
+class Threshold(flow.nn.Module):
+def __init__(self):
+super().__init__()
+self.active = flow.nn.Threshold(0.5, 0.2)
+
+def forward(self, x):
+x = self.active(x)
+return x
+
 if os.path.exists(MODEL_HOME):
 rmdir(MODEL_HOME)
 
@@ -738,6 +748,11 @@ def test_activation():
 inputs=flow.tensor(np.random.rand(1, 12, 197, 197).astype(np.float32)),
 )
 
+# Threshold was introduced in the version 0.8.0 of oneflow
+if package_version.parse(flow.__version__) >= package_version.parse("0.8.0"):
+model14 = Threshold().eval()
+verify_activation(model14, device="llvm")
+
 
 @tvm.testing.uses_gpu
 def test_math():
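
The difference between the removed and the added converter logic can be sketched element-wise in plain Python. The function names below are hypothetical and operate on lists rather than Relay expressions; the semantics are read off the diff: the old code multiplied the input by a 0/1 mask, while the new code selects between the input and a replacement value.

```python
def thresholded_relu(values, alpha=1.0):
    """Old behaviour: x * mask, so elements not above alpha become 0."""
    return [x if x > alpha else 0.0 for x in values]


def threshold(values, threshold_val, value):
    """New behaviour: where(x > threshold_val, x, value)."""
    return [x if x > threshold_val else value for x in values]


data = [0.1, 0.5, 0.9]
old_result = thresholded_relu(data, alpha=0.5)
# Same arguments as flow.nn.Threshold(0.5, 0.2) in the test above.
new_result = threshold(data, threshold_val=0.5, value=0.2)
```

The two agree only on elements strictly above the threshold; everywhere else the old code produced zeros where the replacement value is expected.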

[tvm] branch main updated: Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683)

2023-09-08 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new cd7d64e914 Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683)
cd7d64e914 is described below

commit cd7d64e914f763f1f3abc9105142a3f60580c31b
Author: Haoyang 
AuthorDate: Fri Sep 8 15:28:00 2023 +0800

Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683)

* Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* Add a TODO

* Add a TODO

* Add a TODO

* Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2]

* Add a unit test

* Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2]

* Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2]

* simplify the last fix
---
 python/tvm/relay/frontend/pytorch.py  |  2 +-
 tests/python/frontend/pytorch/test_forward.py | 14 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py
index 683b94dd92..9ddd04b5b4 100644
--- a/python/tvm/relay/frontend/pytorch.py
+++ b/python/tvm/relay/frontend/pytorch.py
@@ -4424,7 +4424,7 @@ def _create_typed_const(data, dtype):
 dtype should be a TVM dtype"""
 
 if dtype == "float64":
-typed_data = _expr.const(np.float64(data), dtype=dtype)
+typed_data = _expr.const(np.asarray(data, dtype="float64"), dtype=dtype)
 elif dtype == "float32":
 typed_data = _expr.const(np.float32(data), dtype=dtype)
 elif dtype == "float16":
diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py
index 8c1cdbb0cf..9ee03512e7 100644
--- a/tests/python/frontend/pytorch/test_forward.py
+++ b/tests/python/frontend/pytorch/test_forward.py
@@ -3449,6 +3449,20 @@ def test_forward_adaptive_max_pool1d():
 verify_model(m.float().eval(), input_data=input_data)
 
 
+@tvm.testing.uses_gpu
+def test_forward_instance_norm():
+"""test_forward_instance_norm"""
+
+class instance_norm(Module):
+def forward(self, *args):
+return torch.nn.functional.instance_norm(args[0], use_input_stats=True)
+
+m = instance_norm().float().eval()
+input_data = torch.randn([1, 1, 1, 2], dtype=torch.float64)
+
+verify_model(m.float().eval(), input_data=input_data)
+
+
 @tvm.testing.uses_gpu
 def test_forward_full_like():
 """test_forward_full_like"""

[tvm] branch main updated: [Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386)

2023-09-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d75083cd97 [Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386)
d75083cd97 is described below

commit d75083cd97ede706338ab413dbc964009456d01b
Author: Haoyang 
AuthorDate: Mon Sep 4 15:26:00 2023 +0800

[Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386)

* Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* add tests for Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* Fix an adaptive_max_pool1d operator conversion bug

* Add a TODO

* Add a TODO

* Add a TODO
---
 python/tvm/relay/frontend/pytorch.py  | 10 +-
 tests/python/frontend/pytorch/test_forward.py | 10 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py
index 42765d2fb0..683b94dd92 100644
--- a/python/tvm/relay/frontend/pytorch.py
+++ b/python/tvm/relay/frontend/pytorch.py
@@ -4291,7 +4291,15 @@ class PyTorchOpConverter:
 
 self.current_op.pop()
 
-return [_wrap_const(outputs[ret_name]) for ret_name in ret_names]
+# TODO(@haoyang9804): outputs[ret_name] could be None and cause some issue
+# revealed by https://github.com/apache/tvm/issues/15004
+# Now only adaptive_max_pool1d is considered. Maybe other ops could also
+# trigger this problem.
+return [
+_wrap_const(outputs[ret_name])
+for ret_name in ret_names
+if ret_name != "aten::adaptive_max_pool1d_0_1"
+]
 
 def _set_parameter_source_name(self, op_node, outputs):
 """A helper function to rewrite source_name of parameter."""
diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py
index cb49e837fe..8c1cdbb0cf 100644
--- a/tests/python/frontend/pytorch/test_forward.py
+++ b/tests/python/frontend/pytorch/test_forward.py
@@ -3439,6 +3439,16 @@ def test_forward_full():
 verify_model(Full2().float().eval(), input_data=[])
 
 
+@tvm.testing.uses_gpu
+def test_forward_adaptive_max_pool1d():
+"""test_forward_adaptive_max_pool1d"""
+torch.set_grad_enabled(False)
+input_data = [torch.randn([2, 2, 4], dtype=torch.float32)]
+m = torch.nn.AdaptiveMaxPool1d(3)
+
+verify_model(m.float().eval(), input_data=input_data)
+
+
 @tvm.testing.uses_gpu
 def test_forward_full_like():
 """test_forward_full_like"""
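
For readers unfamiliar with the operator exercised by `test_forward_adaptive_max_pool1d`, here is a minimal pure-Python sketch of adaptive max pooling over one row. It is not TVM or PyTorch code: the function name is hypothetical, and the window bounds (floor for the start, ceiling for the end) are stated as an assumption about how adaptive pooling is commonly defined. Only the pooled values are covered; any additional outputs of the operator are out of scope.

```python
def adaptive_max_pool1d_row(row, output_size):
    """Pool a 1-D list down to output_size maxima using adaptive windows.

    Window i covers [floor(i * L / out), ceil((i + 1) * L / out)), computed
    with integer arithmetic to avoid floating-point rounding.
    """
    length = len(row)
    pooled = []
    for i in range(output_size):
        start = (i * length) // output_size
        end = ((i + 1) * length + output_size - 1) // output_size
        pooled.append(max(row[start:end]))
    return pooled


# Length 4 pooled to 3 outputs, the same sizes as in the test above.
pooled_row = adaptive_max_pool1d_row([1.0, 3.0, 2.0, 0.5], 3)
```

With length 4 and 3 outputs the windows overlap: they cover indices 0-1, 1-2 and 2-3.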

[tvm] branch main updated: Fixed search task comment (#15535)

2023-08-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 2b718e5d00 Fixed search task comment (#15535)
2b718e5d00 is described below

commit 2b718e5d00948f56b880d14e13ab9fb3bad26a54
Author: Thais Camacho 
AuthorDate: Mon Aug 14 11:13:45 2023 -0300

Fixed search task comment (#15535)

fixed search task comment
---
 python/tvm/auto_scheduler/search_task.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/tvm/auto_scheduler/search_task.py b/python/tvm/auto_scheduler/search_task.py
index 51a74dae93..767baf916d 100644
--- a/python/tvm/auto_scheduler/search_task.py
+++ b/python/tvm/auto_scheduler/search_task.py
@@ -576,7 +576,7 @@ class SearchTask(Object):
 # workload[0] is either the compute function name or the ComputeDAG hash.
 # The compute functions are already registered when importing TVM, so here
 # we only register the ComputeDAG workloads. If the same workload has
-# already been registered, the later registration overrides the prvious one.
+# already been registered, the later registration overrides the previous one.
 if workload[0] not in WORKLOAD_FUNC_REGISTRY:
 register_workload_tensors(state["workload_key"], state["compute_dag"].tensors)

[tvm] branch main updated: [Fix] Fix the typo in compile flag (#15542)

2023-08-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 482b3c2c99 [Fix] Fix the typo in compile flag (#15542)
482b3c2c99 is described below

commit 482b3c2c998e345f7daffa7bfff099898dc7b4c6
Author: Haolin Zhang <44371223+chrisz...@users.noreply.github.com>
AuthorDate: Mon Aug 14 07:12:39 2023 -0700

[Fix] Fix the typo in compile flag (#15542)

Fix the compile flag typo from --shared to -shared in function _linux_compile
---
 python/tvm/contrib/cc.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/tvm/contrib/cc.py b/python/tvm/contrib/cc.py
index ce02a132e2..b69c925411 100644
--- a/python/tvm/contrib/cc.py
+++ b/python/tvm/contrib/cc.py
@@ -280,7 +280,7 @@ def _linux_compile(output, objects, options, compile_cmd, compile_shared=False):
 cmd += ["-c"]
 else:
 if compile_shared or output.endswith(".so") or output.endswith(".dylib"):
-cmd += ["--shared"]
+cmd += ["-shared"]
 cmd += ["-o", output]
 if isinstance(objects, str):
 cmd += [objects]
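
The branch touched by this one-character fix can be sketched as a standalone command builder. `build_link_command` is a hypothetical, simplified stand-in for `_linux_compile`: it omits the `-c` branch and never runs the compiler, it only assembles the argument list so the flag placement is visible.

```python
def build_link_command(compiler, output, objects, options=None, compile_shared=False):
    """Assemble a compiler command line as a list of arguments.

    Simplified assumption: only the shared-library decision and the output
    and object arguments are modelled.
    """
    cmd = [compiler]
    if compile_shared or output.endswith(".so") or output.endswith(".dylib"):
        # Single-dash spelling, as in the fixed code.
        cmd += ["-shared"]
    cmd += ["-o", output]
    cmd += [objects] if isinstance(objects, str) else list(objects)
    if options:
        cmd += list(options)
    return cmd


shared_cmd = build_link_command("g++", "lib.so", ["a.o", "b.o"])
exe_cmd = build_link_command("g++", "app", "main.o", options=["-O2"])
```

Building the command as a list rather than a string keeps each flag a separate argument, so a misspelled flag shows up as exactly one wrong element.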

[tvm] branch main updated: [Relay] Disable exception for ADT in mixed precision pass (#15533)

2023-08-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 927df59662 [Relay] Disable exception for ADT in mixed precision pass (#15533)
927df59662 is described below

commit 927df5966237f10978319044716d93c90bf8843c
Author: Egor Churaev 
AuthorDate: Mon Aug 14 10:32:50 2023 +0300

[Relay] Disable exception for ADT in mixed precision pass (#15533)

If a topology contains a while loop and we try to transform it to mixed
precision, we get an exception saying "ADT are not supported for
mixed precision pass". This happens because a while loop is implemented as a
lambda that is assigned to a VarNode.

In this commit I changed the behavior of the ToMixedPrecision pass:
instead of raising an exception, it now does nothing for such nodes.

A corresponding regression test is added.
---
 src/relay/transforms/to_mixed_precision.cc|  9 ---
 tests/python/relay/test_to_mixed_precision.py | 35 ++-
 2 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/src/relay/transforms/to_mixed_precision.cc b/src/relay/transforms/to_mixed_precision.cc
index 4638ee5477..5026b1bcba 100644
--- a/src/relay/transforms/to_mixed_precision.cc
+++ b/src/relay/transforms/to_mixed_precision.cc
@@ -350,10 +350,11 @@ class MixedPrecisionPass : public MixedModeMutator {
 
 // TODO(AndrewZhaoLuo): Support ADTs
 // Relay's algebraic data types are not supported yet.
-ICHECK(!cur_op.as()   // used to declare functions for recursion
-   && !cur_op.as()  // constructing ADT types
-   && !cur_op.as()) // used for calling recursive functions
-<< "Algebraic Data Types (ADT) are not supported yet for mixed precision pass.";
+bool isADT = (cur_op.as()   // used to declare functions for recursion
+  || cur_op.as()  // constructing ADT types
+  || cur_op.as()  // used for binding lambdas
+  || cur_op.as());// used for calling recursive functions
+if (isADT) return post;
 
 // Get info on the operation being called:
 // conversion category (int), accumulation dtype (str), output dtype (str)
diff --git a/tests/python/relay/test_to_mixed_precision.py b/tests/python/relay/test_to_mixed_precision.py
index a802eee6d6..4c97642498 100644
--- a/tests/python/relay/test_to_mixed_precision.py
+++ b/tests/python/relay/test_to_mixed_precision.py
@@ -49,7 +49,6 @@ def verify_mixed_precision_output_close(
 atol: float = 0,
 keep_orig_output_dtype=False,
 ) -> tvm.runtime.Module:
-
 mod = InferType()(mod)
 result_fp32 = run_module(mod, mod_params)
 
@@ -586,5 +585,39 @@ def test_clip_with_pre_op(target_precision):
 assert tvm.ir.structural_equal(expected_mod, output_mod)
 
 
+def test_loop(target_precision):
+i = relay.var("i", shape=(), dtype="int32")
+st = relay.var("st", shape=(relay.Any(), 1), dtype="int32")
+
+def int32(val):
+return relay.const(val, "int32")
+
+def _cond(i, st):
+return relay.op.min(relay.op.less(i, int32(10)))
+
+def _body(i, st):
+i_vec = relay.op.reshape(i, (1, 1))
+ret = relay.op.concatenate([st, i_vec], axis=0)
+return i + int32(1), ret
+
+loop = relay.loops.while_loop(_cond, [i, st], _body)
+start = relay.var("start", shape=(), dtype="int32")
+body = loop(start, relay.op.reshape(relay.const(0), newshape=(1, 1)))
+func = relay.Function([start], relay.TupleGetItem(body, 1))
+mod = tvm.IRModule()
+mod["main"] = func
+
+mod_params = {
+"start": np.random.uniform(-1, 1, size=()).astype("int32"),
+}
+output_mod = verify_mixed_precision_output_close(
+mod, mod_params, mixed_precision_dtype=target_precision, atol=0.01, rtol=0.01
+)
+
+# Create expected module
+expected_mod = InferType()(mod)
+assert tvm.ir.structural_equal(expected_mod, output_mod)
+
+
 if __name__ == "__main__":
 tvm.testing.main()

[tvm] branch main updated: [Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461)

2023-08-02 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 0e905aa4d7 [Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461)
0e905aa4d7 is described below

commit 0e905aa4d755cbaeb71cd1fe979b91434177b256
Author: Egor Churaev 
AuthorDate: Thu Aug 3 08:27:34 2023 +0300

[Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461)

* [Relay] Stop ToMixedPrecision when constant is out of dtype range

In some layers, e.g. Clip, we might get a compilation error when the
operation takes as input a constant that is outside the range of the
target data type.

To prevent this, a new method was introduced. It compares the values of
constant attributes with the range of the target data type. If a value
is out of range, float32 is used instead.

* Fix lint
---
 src/relay/transforms/to_mixed_precision.cc| 41 --
 tests/python/relay/test_to_mixed_precision.py | 49 +++
 2 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/src/relay/transforms/to_mixed_precision.cc b/src/relay/transforms/to_mixed_precision.cc
index 820bc6e58e..4638ee5477 100644
--- a/src/relay/transforms/to_mixed_precision.cc
+++ b/src/relay/transforms/to_mixed_precision.cc
@@ -31,6 +31,7 @@
 
 #include 
 
+#include "../../support/scalars.h"
 #include "pattern_utils.h"
 
 namespace tvm {
@@ -110,6 +111,39 @@ class MixedPrecisionPass : public MixedModeMutator {
   std::vector original_dtype_;
   bool keep_orig_output_dtype_;
 
+  /*! \brief If some of the constant attributes are out of mixed_precision_type_ bounds, then
+   * computation cannot be performed in mixed precision. */
+  bool IsMixedPrecisionApplicableToAttrs(const Attrs& attrs) const {
+if (attrs.get() != nullptr) {
+  double min_bound;
+  double max_bound;
+  if (mixed_precision_type_.is_float16()) {
+min_bound = -support::kMaxFloat16;
+max_bound = support::kMaxFloat16;
+  } else if (mixed_precision_type_.is_bfloat16()) {
+min_bound = -support::kMaxBFloat16;
+max_bound = support::kMaxBFloat16;
+  } else if (mixed_precision_type_.is_float8()) {
+double bound = (mixed_precision_type_.code() == DataType::kE4M3Float) ? support::kMaxE4M3
+  : support::kMaxE5M2;
+min_bound = -bound;
+max_bound = bound;
+  } else if (mixed_precision_type_.is_float()) {
+min_bound = std::numeric_limits::lowest();
+max_bound = std::numeric_limits::max();
+  } else {
+return true;
+  }
+
+  if (auto cur_attrs = attrs.as()) {
+if (cur_attrs->a_min < min_bound || cur_attrs->a_max > max_bound) {
+  return false;
+}
+  }
+}
+return true;
+  }
+
   Attrs GetNewAttrs(const CallNode* call, const DataType& accumulation_dtype) const {
 /* If the accumulation dtype is in the attributes make a copy and mutate the field. */
 Attrs cur_attrs = call->attrs;
@@ -382,9 +416,12 @@ class MixedPrecisionPass : public MixedModeMutator {
   all_args_mixed_type_compatible ? MIXED_PRECISION_ALWAYS : MIXED_PRECISION_NEVER;
 }
 
+bool is_mixed_precision_applicable =
+static_cast(final_category == MIXED_PRECISION_ALWAYS &&
+  IsMixedPrecisionApplicableToAttrs(pre_call_node->attrs));
 // Create the new arguments to the call.
 DataType wanted_arg_dtypes =
-final_category == MIXED_PRECISION_ALWAYS ? mixed_precision_type_ : DataType::Float(32);
+is_mixed_precision_applicable ? mixed_precision_type_ : DataType::Float(32);
 auto call_args_and_types = CastAllArgs(post_call_node->args, cur_arg_types, wanted_arg_dtypes);
 Array new_args = call_args_and_types.first;
 Array new_arg_types;
@@ -397,7 +434,7 @@ class MixedPrecisionPass : public MixedModeMutator {
 }
 
 // Finally create the new attributes.
-if (final_category == MIXED_PRECISION_ALWAYS) {
+if (is_mixed_precision_applicable) {
   Attrs new_attrs = GetNewAttrs(pre_call_node, accumulation_dtype);
   Expr output = Call(cur_op, new_args, new_attrs, new_arg_types, pre_call_node->span);
   if (accumulation_dtype != output_dtype) {
diff --git a/tests/python/relay/test_to_mixed_precision.py b/tests/python/relay/test_to_mixed_precision.py
index 771d366df0..a802eee6d6 100644
--- a/tests/python/relay/test_to_mixed_precision.py
+++ b/tests/python/relay/test_to_mixed_precision.py
@@ -537,5 +537,54 @@ def test_convert_follow_node_with_integer_arguments(target_precision):
 assert tvm.ir.structural_equal(expected_mod, output_mod)
 
 
+def test
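
The attribute check added in the to_mixed_precision.cc hunk above can be
sketched in Python. This is an illustration only: it assumes the target type
is float16 (finite range -65504 to 65504), and `clip_fits_float16` and
`wanted_dtype` are hypothetical names, not TVM functions. The idea is that a
clip whose bounds cannot be represented in the reduced-precision type keeps
its arguments in float32.

```python
# Largest finite float16 value; the lowest finite value is its negation.
FLOAT16_MAX = 65504.0


def clip_fits_float16(a_min, a_max):
    """Return True if both clip bounds lie inside the finite float16 range."""
    return -FLOAT16_MAX <= a_min and a_max <= FLOAT16_MAX


def wanted_dtype(is_always_category, a_min, a_max):
    """Choose the argument dtype: reduced precision only if the attrs allow it."""
    if is_always_category and clip_fits_float16(a_min, a_max):
        return "float16"
    return "float32"
```

For example, `wanted_dtype(True, 0.0, 6.0)` selects "float16", while a clip
with `a_max=1e6` falls back to "float32".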

[tvm] branch main updated: [Codegen][Metal] Support metal warp-level primitive (#15401)

2023-07-26 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 22ec541a6f [Codegen][Metal] Support metal warp-level primitive (#15401)
22ec541a6f is described below

commit 22ec541a6f12a6430246e4d3aed47c5cfc462b1e
Author: Ruihang Lai 
AuthorDate: Wed Jul 26 14:03:40 2023 -0700

[Codegen][Metal] Support metal warp-level primitive (#15401)

This PR introduces the warp-level shuffle primitives used in Metal
Shading Language, and uses them in the implementation of allreduce
lowering.

The introduced primitives are:
* `simd_shuffle`,
* `simd_shuffle_up`,
* `simd_shuffle_down`.

See section 6.9.2 of 
https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf
for details.

Correctness was validated by running `test_allreduce_cuda` with the backend
changed to Metal. Since we do not have Metal CI tests, correctness was
checked only locally.

Because the Metal shuffle primitives do not support (or need) masking,
the LowerThreadAllreduce pass is updated to support backends that do
not have masks. One unit test for Metal is added to ensure that no
mask is used.
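
A rough Python model of a mask-free reduction built on a shuffle-down
primitive is sketched below. It assumes a power-of-two group size, and it
assumes that a lane whose source index falls outside the group keeps its own
value. The function names are illustrative stand-ins, not TVM or Metal APIs.

```python
def simd_shuffle_down(values, delta):
    """Model: lane i reads the value held by lane i + delta.

    Lanes whose source would fall outside the group keep their own value
    (an assumption of this model).
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_allreduce_sum(values):
    """Tree reduction over one group; no mask argument is needed."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "group size must be a power of two"
    vals = list(values)
    offset = n // 2
    while offset > 0:
        shifted = simd_shuffle_down(vals, offset)
        vals = [a + b for a, b in zip(vals, shifted)]
        offset //= 2
    # Lane 0 now holds the sum of all lanes.
    return vals[0]
```

Only lane 0 is guaranteed to hold the full sum in this model; broadcasting it
back to the other lanes would take one more shuffle.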
---
 src/target/source/intrin_rule_metal.cc |  53 +++
 src/tir/transforms/lower_thread_allreduce.cc   |  35 ---
 .../test_tir_transform_lower_thread_all_reduce.py  | 103 +
 3 files changed, 180 insertions(+), 11 deletions(-)

diff --git a/src/target/source/intrin_rule_metal.cc 
b/src/target/source/intrin_rule_metal.cc
index dd924b9255..cc83eb1462 100644
--- a/src/target/source/intrin_rule_metal.cc
+++ b/src/target/source/intrin_rule_metal.cc
@@ -30,6 +30,28 @@ namespace codegen {
 namespace intrin {
 using tir::FLowerIntrinsic;
 
+struct MetalWarpIntrinsic {
+  const Op operator()(DataType t, const Op& orig_op) const {
+if (orig_op.same_as(builtin::tvm_warp_shuffle())) {
+  return Op::Get("tir.metal.simd_shuffle");
+} else if (orig_op.same_as(builtin::tvm_warp_shuffle_up())) {
+  return Op::Get("tir.metal.simd_shuffle_up");
+} else {
+  ICHECK(orig_op.same_as(builtin::tvm_warp_shuffle_down()));
+  return Op::Get("tir.metal.simd_shuffle_down");
+}
+  }
+};
+
+template <typename T>
+static PrimExpr DispatchMetalShuffle(const PrimExpr& e) {
+  const CallNode* call = e.as<CallNode>();
+  ICHECK(call != nullptr);
+  ICHECK_EQ(call->args.size(), 5);  // mask, value, warp_id, width, warp_size
+  Array<PrimExpr> metal_args{{call->args[1], call->args[2]}};
+  return Call(call->dtype, T()(call->dtype, Downcast<Op>(call->op)), metal_args);
+}
+
 TVM_REGISTER_OP("tir.floor")
 .set_attr<FLowerIntrinsic>("metal.FLowerIntrinsic", DispatchPureExtern<Direct>);
 
@@ -95,6 +117,37 @@ TVM_REGISTER_OP("tir.cosh")
 
 TVM_REGISTER_OP("tir.erf").set_attr<FLowerIntrinsic>("metal.FLowerIntrinsic", DispatchFastErf);
 
+TVM_REGISTER_OP("tir.tvm_warp_shuffle")
+.set_attr<FLowerIntrinsic>("metal.FLowerIntrinsic", DispatchMetalShuffle<MetalWarpIntrinsic>);
+
+TVM_REGISTER_OP("tir.tvm_warp_shuffle_up")
+.set_attr<FLowerIntrinsic>("metal.FLowerIntrinsic", DispatchMetalShuffle<MetalWarpIntrinsic>);
+
+TVM_REGISTER_OP("tir.tvm_warp_shuffle_down")
+.set_attr<FLowerIntrinsic>("metal.FLowerIntrinsic", DispatchMetalShuffle<MetalWarpIntrinsic>);
+
+// Register low-level builtin ops.
+TVM_REGISTER_OP("tir.metal.simd_shuffle")
+.set_num_inputs(2)
+.add_argument("var", "Expr", "The variable to sync.")
+.add_argument("lane", "Expr", "The source thread id.")
+.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle")
+.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque));
+
+TVM_REGISTER_OP("tir.metal.simd_shuffle_up")
+.set_num_inputs(2)
+.add_argument("var", "Expr", "The variable to sync.")
+.add_argument("delta", "Expr", "The source lane id offset to be added.")
+.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle_up")
+.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque));
+
+TVM_REGISTER_OP("tir.metal.simd_shuffle_down")
+.set_num_inputs(2)
+.add_argument("var", "Expr", "The variable to sync.")
+.add_argument("delta", "Expr", "The source lane id offset to be subtracted.")
+.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle_down")
+.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque));
+
 }  // namespace intrin
 }  // namespace codegen
 }  // namespace tvm
diff --git a/src/tir/transforms/lower_thread_allreduce.cc 
b/src/tir/transforms/lower_thread_allreduce.cc
index 438dccff0b..fb

[tvm] branch main updated: [Bugfix][Frontend][Keras]Fix a corner case bug in softmax converter of keras frontend (#15337)

2023-07-17 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new e2d6511161 [Bugfix][Frontend][Keras]Fix a corner case bug in softmax 
converter of keras frontend (#15337)
e2d6511161 is described below

commit e2d65111616dfa95797c0dd7e082e4050b71701d
Author: Qingchao Shen 
AuthorDate: Tue Jul 18 13:02:34 2023 +0800

[Bugfix][Frontend][Keras]Fix a corner case bug in softmax converter of 
keras frontend (#15337)

* Fix softmax converter about keras

* add new test cases to capture the bug

* Update keras.py
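
A minimal sketch of the axis handling for the NCHW layout, written as a
standalone function. The name `nchw_softmax_axis` is hypothetical, and
`ValueError` stands in for the TVM error type; the branch order follows the
patched converter, where a missing input shape is treated as zero dimensions.

```python
def nchw_softmax_axis(axis, input_shape):
    """Map a Keras softmax axis to the axis used after conversion to NCHW."""
    # A missing or empty shape is treated as zero dimensions (the corner case).
    dims = len(input_shape) if input_shape else 0
    if isinstance(axis, list):
        # Stand-in for tvm.error.OpAttributeUnImplemented.
        raise ValueError(f"Softmax with axes {axis} is not supported.")
    if dims == 0:
        return 0
    if axis == -1:
        # The last (channel) axis in NHWC becomes axis 1 in NCHW.
        return 1
    return axis + 1 if axis < dims - 1 else 1
```

With no input shape available, `len(input_shape)` would raise on `None`; this
version returns axis 0 instead.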
---
 python/tvm/relay/frontend/keras.py  | 6 --
 tests/python/frontend/keras/test_forward.py | 7 +++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index 1913d4a268..aba4160695 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -131,11 +131,13 @@ def _convert_advanced_activation(inexpr, keras_layer, 
etab, data_layout, input_s
 
 if act_type == "Softmax":
 axis = keras_layer.axis
-dims = len(input_shape)
+dims = len(input_shape) if input_shape else 0
 if isinstance(axis, list):
 raise tvm.error.OpAttributeUnImplemented(f"Softmax with axes 
{axis} is not supported.")
 if data_layout == "NCHW":
-if axis == -1:
+if dims == 0:
+axis = 0
+elif axis == -1:
 axis = 1
 else:
 axis = axis + 1 if axis < dims - 1 else 1
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index 50a0e98505..53e2ca8dbe 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -229,6 +229,13 @@ class TestKeras:
 keras_model = keras_mod.models.Model(data, x)
 verify_keras_frontend(keras_model)
 verify_keras_frontend(keras_model, need_transpose=False, 
layout="NHWC")
+# Test the input dimension = 1
+data = keras_mod.layers.Input(shape=(11,))
+act_func = keras_mod.layers.Softmax()
+x = act_func(data)
+keras_model = keras_mod.models.Model(data, x)
+verify_keras_frontend(keras_model)
+verify_keras_frontend(keras_model, need_transpose=False, layout="NHWC")
 
 def test_forward_activations_except(self, keras_mod):
 """

[tvm] branch main updated: Fix keras version problem (#15265)

2023-07-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new b6502f4e27 Fix keras version problem (#15265)
b6502f4e27 is described below

commit b6502f4e278da391719155936aeefb6544115c1f
Author: Shikamaru:) 
AuthorDate: Fri Jul 14 18:20:16 2023 +0800

Fix keras version problem (#15265)

* Fix keras version problem

* Fix keras version problem

* Fix keras version problem

* Fix keras version problem

* Fix keras version problem

* Fix keras version problem

* Fix keras version problem
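
The fallback import used by this fix can be written as a small generic
helper. This is a sketch under assumptions: `import_first` is a hypothetical
name, and the candidate module paths in the usage comment are the two tried
in the patch.

```python
import importlib


def import_first(candidates):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"None of these modules could be imported: {candidates}")


# Intended use in the Keras frontend (requires Keras to be installed):
#   engine = import_first(["keras.engine", "keras.src.engine"])
```

Trying candidates in order keeps the older module path preferred while still
supporting Keras releases that moved the engine package.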
---
 python/tvm/relay/frontend/keras.py | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index 0d932cadcc..1913d4a268 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -1526,12 +1526,19 @@ def from_keras(model, shape=None, layout="NCHW"):
 raise ValueError("Keras frontend currently supports tensorflow 
backend only.")
 if keras.backend.image_data_format() != "channels_last":
 raise ValueError("Keras frontend currently supports data_format = 
channels_last only.")
-expected_model_class = keras.engine.training.Model
-if hasattr(keras.engine, "InputLayer"):
-input_layer_class = keras.engine.InputLayer
+try:
+import keras.engine as E
+except ImportError:
+try:
+import keras.src.engine as E
+except ImportError:
+raise ImportError("Cannot find Keras's engine")
+expected_model_class = E.training.Model
+if hasattr(E, "InputLayer"):
+input_layer_class = E.InputLayer
 else:
 # TFlite >=2.6
-input_layer_class = keras.engine.input_layer.InputLayer
+input_layer_class = E.input_layer.InputLayer
 else:
 # Importing from Tensorflow Keras (tf.keras)
 try:

[tvm] branch main updated: [CMAKE] Add Vulkan header for Android (#15229)

2023-07-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 23fb568521 [CMAKE] Add Vulkan header for Android (#15229)
23fb568521 is described below

commit 23fb56852122919a2ea896000d87be163e2b1bb5
Author: Ruihang Lai 
AuthorDate: Tue Jul 4 22:41:42 2023 -0700

[CMAKE] Add Vulkan header for Android (#15229)

This PR enhances the Vulkan search for Android NDK, so that we can
find the Vulkan header for Android builds.

Co-authored-by: tqchen 
---
 cmake/utils/FindVulkan.cmake | 57 +---
 1 file changed, 32 insertions(+), 25 deletions(-)

diff --git a/cmake/utils/FindVulkan.cmake b/cmake/utils/FindVulkan.cmake
index a2a6fb6035..032ff1dffa 100644
--- a/cmake/utils/FindVulkan.cmake
+++ b/cmake/utils/FindVulkan.cmake
@@ -37,36 +37,43 @@ macro(find_vulkan use_vulkan use_khronos_spirv)
   if(IS_DIRECTORY ${__use_vulkan})
 set(__vulkan_sdk ${__use_vulkan})
 message(STATUS "Custom Vulkan SDK PATH=" ${__use_vulkan})
-   elseif(IS_DIRECTORY $ENV{VULKAN_SDK})
- set(__vulkan_sdk $ENV{VULKAN_SDK})
-   else()
- set(__vulkan_sdk "")
-   endif()
+  elseif(IS_DIRECTORY $ENV{VULKAN_SDK})
+set(__vulkan_sdk $ENV{VULKAN_SDK})
+  else()
+set(__vulkan_sdk "")
+  endif()
 
 
-   if(IS_DIRECTORY ${use_khronos_spirv})
- set(__use_khronos_spirv ${use_khronos_spirv})
- message(STATUS "Custom khronos spirv PATH=" ${__use_khronos_spirv})
-   else()
- set(__use_khronos_spirv "")
-   endif()
+  if(IS_DIRECTORY ${use_khronos_spirv})
+set(__use_khronos_spirv ${use_khronos_spirv})
+message(STATUS "Custom khronos spirv PATH=" ${__use_khronos_spirv})
+  else()
+set(__use_khronos_spirv "")
+  endif()
 
+  if(CMAKE_SYSTEM_NAME STREQUAL "Android")
+set(VULKAN_NDK_SRC ${CMAKE_ANDROID_NDK}/sources/third_party/vulkan/src)
+set(Vulkan_INCLUDE_DIRS ${VULKAN_NDK_SRC}/include)
+set(Vulkan_FOUND TRUE)
+message(STATUS "Android Vulkan_INCLUDE_DIRS=" ${Vulkan_INCLUDE_DIRS})
+message(STATUS "Skip finding SPIRV in Android, make sure you only build 
tvm runtime.")
+return()
+  endif()
 
-   if(__vulkan_sdk)
- set(Vulkan_INCLUDE_DIRS ${__vulkan_sdk}/include)
- find_library(Vulkan_LIBRARY NAMES vulkan vulkan-1 PATHS 
${__vulkan_sdk}/lib)
- if(Vulkan_LIBRARY)
-   set(Vulkan_FOUND TRUE)
- endif()
-   endif(__vulkan_sdk)
+  if(__vulkan_sdk)
+set(Vulkan_INCLUDE_DIRS ${__vulkan_sdk}/include)
+find_library(Vulkan_LIBRARY NAMES vulkan vulkan-1 PATHS 
${__vulkan_sdk}/lib)
+if(Vulkan_LIBRARY)
+  set(Vulkan_FOUND TRUE)
+endif()
+  endif(__vulkan_sdk)
 
-   # resort to find vulkan of option is on
-   if(NOT Vulkan_FOUND)
- if(${__use_vulkan} MATCHES ${IS_TRUE_PATTERN})
-   find_package(Vulkan QUIET)
- endif()
-   endif()
-   # additional libraries
+  # resort to find vulkan of option is on
+  if(NOT Vulkan_FOUND)
+if(${__use_vulkan} MATCHES ${IS_TRUE_PATTERN})
+  find_package(Vulkan QUIET)
+endif()
+  endif()
 
   if(Vulkan_FOUND)
 get_filename_component(VULKAN_LIBRARY_PATH ${Vulkan_LIBRARY} DIRECTORY)

[tvm] branch main updated: [Test] Improve check for TVMError exception in test_cast (#15138)

2023-06-22 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 2c48d7bf21 [Test] Improve check for TVMError exception in test_cast 
(#15138)
2c48d7bf21 is described below

commit 2c48d7bf21cbea2792aebf58e8c9ac085b4cdd38
Author: Krzysztof Parzyszek 
AuthorDate: Thu Jun 22 06:32:31 2023 -0500

[Test] Improve check for TVMError exception in test_cast (#15138)

The original code has
```
with pytest.raises(tvm.error.TVMError):
s.astype("int")
assert "Can't cast a handle to other types" in str(e)
```
The `astype` is expected to throw a TVMError exception, but if it does,
the following assertion never gets executed.

Add try/except to the test to verify that the expected exception is
thrown.
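
The pitfall can be reproduced without TVM. The sketch below uses the standard
library's `unittest` `assertRaises` context manager, which behaves like
`pytest.raises` in this respect. `HandleCastError` and `astype_stub` are
hypothetical stand-ins for `tvm.error.TVMError` and `StringImm.astype`. It
also shows an alternative to the nested try/except: inspect the captured
exception after the `with` block has exited.

```python
import unittest


class HandleCastError(Exception):
    """Hypothetical stand-in for tvm.error.TVMError."""


def astype_stub(dtype):
    """Hypothetical stand-in for StringImm.astype; always raises."""
    raise HandleCastError("Can't cast a handle to other types")


def statements_run_after_raise():
    """Record which statements run inside the block after the raising call."""
    executed = []
    case = unittest.TestCase()
    with case.assertRaises(HandleCastError):
        astype_stub("int")
        executed.append("assertion")  # never reached: the call above raised
    return executed


def captured_message():
    """Check the message after the block, where the exception is available."""
    case = unittest.TestCase()
    with case.assertRaises(HandleCastError) as cm:
        astype_stub("int")
    return str(cm.exception)
```

In pytest the captured exception is exposed as `excinfo.value` on the object
bound by `with pytest.raises(...) as excinfo`.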
---
 tests/python/unittest/test_tir_nodes.py | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/python/unittest/test_tir_nodes.py 
b/tests/python/unittest/test_tir_nodes.py
index 7826b5960b..49816778f1 100644
--- a/tests/python/unittest/test_tir_nodes.py
+++ b/tests/python/unittest/test_tir_nodes.py
@@ -118,9 +118,12 @@ def test_cast():
 assert z.lanes == 4
 
 s = tvm.tir.StringImm("s")
-with pytest.raises(tvm.error.TVMError) as cm:
-s.astype("int")
-assert "Can't cast a handle to other types" in str(cm.execption)
+with pytest.raises(tvm.error.TVMError):
+try:
+s.astype("int")
+except Exception as e:
+assert "Can't cast a handle to other types" in str(e)
+raise
 
 
 def test_attr():

[tvm] branch main updated: [Bugfix][Relay][Keras] Fix the wrong implementation logic about cropping2D (#15053)

2023-06-15 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 90b5acca59 [Bugfix][Relay][Keras] Fix the wrong implementation logic 
about cropping2D (#15053)
90b5acca59 is described below

commit 90b5acca59421d67494955c4fc943a5ddbbc7373
Author: Qingchao Shen 
AuthorDate: Thu Jun 15 17:46:37 2023 +0800

[Bugfix][Relay][Keras] Fix the wrong implementation logic about cropping2D 
(#15053)

* fix the wrong calculation logic of cropping2d

The implementation of cropping2D is wrong. This PR fixes it.

* add a test case to capture the bug

* Update test_forward.py

* Update test_forward.py

* correct the patch

* Update keras.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py
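
The layout-dependent slice bounds introduced by this fix can be sketched as a
standalone function. `crop_slice_bounds` is a hypothetical name, and
`INT32_MAX` replaces `np.iinfo(np.int32).max` so the sketch has no
dependencies.

```python
INT32_MAX = 2**31 - 1  # same value as np.iinfo(np.int32).max


def crop_slice_bounds(data_layout, in_h, in_w, crop_t, crop_b, crop_l, crop_r):
    """Return (begin, end) for a strided slice that crops height and width."""
    if data_layout == "NHWC":
        # Height and width are axes 1 and 2.
        begin = [0, crop_t, crop_l, 0]
        end = [INT32_MAX, in_h - crop_b, in_w - crop_r, INT32_MAX]
    else:
        # NCHW: height and width are axes 2 and 3.
        begin = [0, 0, crop_t, crop_l]
        end = [INT32_MAX, INT32_MAX, in_h - crop_b, in_w - crop_r]
    return begin, end
```

Before the fix the NCHW bounds were used for both layouts, so an NHWC input
was cropped along the width and channel axes instead of height and width.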
---
 python/tvm/relay/frontend/keras.py  | 10 --
 tests/python/frontend/keras/test_forward.py | 10 +-
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index d963a5d160..8d437027e5 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -816,10 +816,16 @@ def _convert_cropping(
 f"Operator {crop_type} is not supported for frontend Keras."
 )
 int32_max = np.iinfo(np.int32).max
+if data_layout == "NHWC":
+begin = [0, crop_t, crop_l, 0]
+end = [int32_max, in_h - crop_b, in_w - crop_r, int32_max]
+else:
+begin = [0, 0, crop_t, crop_l]
+end = [int32_max, int32_max, in_h - crop_b, in_w - crop_r]
 return _op.strided_slice(
 inexpr,
-begin=[0, 0, crop_t, crop_l],
-end=[int32_max, int32_max, in_h - crop_b, in_w - crop_r],
+begin=begin,
+end=end,
 )
 
 
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index 45935f87f4..cc6421614e 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -449,7 +449,15 @@ class TestKeras:
 x = keras_mod.layers.Cropping2D(cropping=0)(x)
 x = keras_mod.layers.Add()([x, x])
 keras_model = keras_mod.models.Model(data, x)
-verify_keras_frontend(keras_model)
+verify_keras_frontend(keras_model, layout="NHWC")
+verify_keras_frontend(keras_model, layout="NHWC")
+
+data = keras_mod.layers.Input(shape=(32, 32, 3))
+x = keras_mod.layers.Cropping2D(cropping=(2, 1))(data)
+x = keras_mod.layers.Cropping2D(cropping=(1, 2))(x)
+keras_model = keras_mod.models.Model(data, x)
+verify_keras_frontend(keras_model, layout="NHWC")
+verify_keras_frontend(keras_model, layout="NCHW")
 
 def test_forward_multi_inputs(self, keras_mod):
 data1 = keras_mod.layers.Input(shape=(32, 32, 3))

[tvm] branch main updated: [Bugfix][Relay][Keras] Fix UpSampling2D about the wrong assertion about size (#15082)

2023-06-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 081cc2ef64 [Bugfix][Relay][Keras] Fix UpSampling2D about the wrong 
assertion about size (#15082)
081cc2ef64 is described below

commit 081cc2ef64c866f6dc2e33dbf6912395d09ca749
Author: Qingchao Shen 
AuthorDate: Wed Jun 14 17:13:19 2023 +0800

[Bugfix][Relay][Keras] Fix UpSampling2D about the wrong assertion about 
size (#15082)

* fix wrong assertion about upsample in keras.py

* Update test_forward.py

* Update test_forward.py
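
To illustrate why `scale_w` must come from `w` rather than `h`, here is a
small nearest-neighbor upsampling sketch on a nested list. `upsample_nearest`
is a hypothetical helper and not part of the Keras frontend.

```python
def upsample_nearest(grid, size):
    """Repeat each row `scale_h` times and each column `scale_w` times."""
    scale_h, scale_w = size
    out = []
    for row in grid:
        # Widen the row first, then emit it scale_h times.
        wide = [value for value in row for _ in range(scale_w)]
        out.extend(list(wide) for _ in range(scale_h))
    return out
```

With `size=(1, 2)` a 2x1 input becomes 2x2; using `h` for both scales would
leave it unchanged at 2x1.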
---
 python/tvm/relay/frontend/keras.py  | 4 +---
 tests/python/frontend/keras/test_forward.py | 5 +
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index 16192617fe..d963a5d160 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -767,10 +767,8 @@ def _convert_upsample(
 params["scale_h"] = h
 elif upsample_type == "UpSampling2D":
 h, w = keras_layer.size
-if h != w:
-raise tvm.error.OpAttributeInvalid("Height must equal width for 
operator Upsample.")
 params["scale_h"] = h
-params["scale_w"] = h
+params["scale_w"] = w
 
 if hasattr(keras_layer, "interpolation"):
 interpolation = keras_layer.interpolation
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index debd50b37a..45935f87f4 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -389,6 +389,11 @@ class TestKeras:
 x = keras_mod.layers.UpSampling2D(size=(3, 3), 
interpolation=interpolation)(data)
 keras_model = keras_mod.models.Model(data, x)
 verify_keras_frontend(keras_model)
+# Height and width are not equal for the attribute size
+data = keras_mod.layers.Input(shape=(2, 1, 3))
+x = keras_mod.layers.UpSampling2D(size=(1, 2), 
interpolation=interpolation)(data)
+keras_model = keras_mod.models.Model(data, x)
+verify_keras_frontend(keras_model)
 
 def test_forward_reshape(self, keras_mod):
 """test_forward_reshape"""

[tvm] branch main updated: Revert "[CI] Make Graviton3 default AArch64 job runner node (#14983)" (#15065)

2023-06-09 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new f172f6c70d Revert "[CI] Make Graviton3 default AArch64 job runner node 
(#14983)" (#15065)
f172f6c70d is described below

commit f172f6c70d8e72556225877c7522ef13b7b250e2
Author: masahi 
AuthorDate: Fri Jun 9 15:08:14 2023 +0900

Revert "[CI] Make Graviton3 default AArch64 job runner node (#14983)" 
(#15065)

This reverts commit 7dc1d6a99f7984855edf1a5c42b66215addeb13f.
---
 ci/jenkins/generated/arm_jenkinsfile.groovy| 38 +-
 ci/jenkins/templates/arm_jenkinsfile.groovy.j2 |  8 +++---
 2 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/ci/jenkins/generated/arm_jenkinsfile.groovy 
b/ci/jenkins/generated/arm_jenkinsfile.groovy
index 2eb6687bee..40211e4eb9 100644
--- a/ci/jenkins/generated/arm_jenkinsfile.groovy
+++ b/ci/jenkins/generated/arm_jenkinsfile.groovy
@@ -60,7 +60,7 @@
 // 'python3 jenkins/generate.py'
 // Note: This timestamp is here to ensure that updates to the Jenkinsfile are
 // always rebased on main before merging:
-// Generated at 2023-06-02T10:54:19.255872
+// Generated at 2023-05-05T13:39:06.460410
 
 import org.jenkinsci.plugins.pipeline.modeldefinition.Utils
 // These are set at runtime from data in ci/jenkins/docker-images.yml, update
@@ -577,17 +577,17 @@ def build(node_type) {
   }
 }
 try {
-build('ARM-GRAVITON3-SPOT')
+build('ARM-SMALL-SPOT')
 } catch (Exception ex) {
-build('ARM-GRAVITON3')
+build('ARM-SMALL')
 }
 
 
 
-def shard_run_integration_aarch64_1_of_4(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_integration_aarch64_1_of_4(node_type='ARM-SMALL-SPOT', 
on_demand=false) {
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }
 node(node_type) {
   ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") {
@@ -633,10 +633,10 @@ def 
shard_run_integration_aarch64_1_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema
   }
 }
 
-def shard_run_integration_aarch64_2_of_4(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_integration_aarch64_2_of_4(node_type='ARM-SMALL-SPOT', 
on_demand=false) {
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }
 node(node_type) {
   ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") {
@@ -682,10 +682,10 @@ def 
shard_run_integration_aarch64_2_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema
   }
 }
 
-def shard_run_integration_aarch64_3_of_4(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_integration_aarch64_3_of_4(node_type='ARM-SMALL-SPOT', 
on_demand=false) {
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }
 node(node_type) {
   ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") {
@@ -731,10 +731,10 @@ def 
shard_run_integration_aarch64_3_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema
   }
 }
 
-def shard_run_integration_aarch64_4_of_4(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_integration_aarch64_4_of_4(node_type='ARM-SMALL-SPOT', 
on_demand=false) {
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }
 node(node_type) {
   ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") {
@@ -782,10 +782,10 @@ def 
shard_run_integration_aarch64_4_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema
 
 
 
-def shard_run_topi_aarch64_1_of_2(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_topi_aarch64_1_of_2(node_type='ARM-SMALL-SPOT', on_demand=false) 
{
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }
 node(node_type) {
   ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") {
@@ -836,10 +836,10 @@ def 
shard_run_topi_aarch64_1_of_2(node_type='ARM-GRAVITON3-SPOT', on_demand=fals
   }
 }
 
-def shard_run_topi_aarch64_2_of_2(node_type='ARM-GRAVITON3-SPOT', 
on_demand=false) {
+def shard_run_topi_aarch64_2_of_2(node_type='ARM-SMALL-SPOT', on_demand=false) 
{
   if (!skip_ci && is_docs_only_build != 1) {
 if (on_demand==true || node_type.contains('ARM')) {
-node_type = 'ARM-GRAVITON3'
+node_type = 'ARM-SMALL'
 }

[tvm] branch main updated (80079b6493 -> 1366f2e250)

2023-06-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 80079b6493 Fix type parse error about AdaptiveMaxPool (#15016)
 add 1366f2e250 [OpenCLML] Reactor and introduce on chip memory and memory 
planner (#14922)

No new revisions were added by this update.

Summary of changes:
 CMakeLists.txt |2 +-
 apps/cpp_clml/clml_runner.cc   |   36 +-
 apps/cpp_clml/clml_runner.h|2 +-
 apps/cpp_clml/scripts/clml_codegen.py  |2 +-
 cmake/modules/OpenCL.cmake |   39 +-
 python/tvm/relay/op/contrib/clml.py|  165 ++--
 src/runtime/contrib/clml/clml_memory_planner.cc|  268 +
 .../runtime/contrib/clml/clml_memory_planner.h |   21 +-
 src/runtime/contrib/clml/clml_runtime.cc   | 1035 ++--
 src/runtime/contrib/clml/clml_runtime.h|  185 
 src/runtime/contrib/clml/clml_utils.cc |  257 +
 src/runtime/contrib/clml/clml_utils.h  |   77 ++
 tests/cpp-runtime/opencl/clml_memory_planner.cc|  439 +
 tests/python/contrib/test_clml/infrastructure.py   |   19 +
 tests/python/contrib/test_clml/test_ops.py |  110 ++-
 tests/scripts/task_build_adreno_bins.sh|4 +-
 16 files changed, 2056 insertions(+), 605 deletions(-)
 create mode 100644 src/runtime/contrib/clml/clml_memory_planner.cc
 copy include/tvm/runtime/contrib/libtorch_runtime.h => 
src/runtime/contrib/clml/clml_memory_planner.h (65%)
 create mode 100644 src/runtime/contrib/clml/clml_runtime.h
 create mode 100644 src/runtime/contrib/clml/clml_utils.cc
 create mode 100644 src/runtime/contrib/clml/clml_utils.h
 create mode 100644 tests/cpp-runtime/opencl/clml_memory_planner.cc

[tvm] branch main updated: [DOCKER][ADRENO] Docker setup changes for multi user environments (#15003)

2023-06-01 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new ca30b13668 [DOCKER][ADRENO] Docker setup changes for multi user 
environments (#15003)
ca30b13668 is described below

commit ca30b136686747e2817fb48b302dfa08eda935de
Author: Siva 
AuthorDate: Thu Jun 1 21:13:25 2023 +0530

[DOCKER][ADRENO] Docker setup changes for multi user environments (#15003)

Docker setup changes for multi-user environments.

Target setup shouldn't conflict when multiple users share the same target
device.
libc++_shared.so may not be available from the system on some Android devices.
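
The per-user naming scheme can be summarized in a short Python sketch.
`device_layout` is a hypothetical helper; the paths follow the pattern used
by the updated scripts, with the user name appended to both the folder and
the RPC binary.

```python
def device_layout(user, base="/data/local/tmp"):
    """Return the on-device paths used for one user's RPC server."""
    folder = f"{base}/tvm_ci-{user}"
    return {
        "folder": folder,
        "rpc_binary": f"{folder}/tvm_rpc-{user}",
        "ld_library_path": f"{folder}/",
    }
```

Because the binary name also carries the user name, `killall` on one user's
server does not stop a server started by another user on the same device.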
---
 tests/scripts/setup-adreno-env.sh   | 13 +
 tests/scripts/task_python_adreno.sh | 13 +
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/tests/scripts/setup-adreno-env.sh 
b/tests/scripts/setup-adreno-env.sh
index 44eb7edce8..55a92c5f61 100755
--- a/tests/scripts/setup-adreno-env.sh
+++ b/tests/scripts/setup-adreno-env.sh
@@ -90,15 +90,20 @@ case ${ENVIRONMENT} in
 def_environment
 export ANDROID_SERIAL=${ADB_SERIAL}
 
-adb shell "mkdir -p /data/local/tmp/tvm_ci"
-adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_ci/tvm_rpc_ci
-adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_ci
+TARGET_FOLDER=/data/local/tmp/tvm_ci-${USER}
+CPP_LIB=`find ${ANDROID_NDK_HOME} -name libc++_shared.so | grep aarch64`
+adb shell "mkdir -p ${TARGET_FOLDER}"
+adb push build-adreno-target/tvm_rpc ${TARGET_FOLDER}/tvm_rpc-${USER}
+adb push build-adreno-target/libtvm_runtime.so ${TARGET_FOLDER}
+if [ -f ${CPP_LIB} ] ; then
+adb push ${CPP_LIB} ${TARGET_FOLDER}
+fi
 
 adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT}
 adb forward tcp:5000 tcp:5000
 adb forward tcp:5001 tcp:5001
 adb forward tcp:5002 tcp:5002
-adb shell "cd /data/local/tmp/tvm_ci; killall -9 tvm_rpc_ci; sleep 2; 
LD_LIBRARY_PATH=/data/local/tmp/tvm_ci/ ./tvm_rpc_ci server --host=0.0.0.0 
--port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} 
--key=${RPC_DEVICE_KEY}"
+adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; 
LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 
--port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} 
--key=${RPC_DEVICE_KEY}"
 ;;
 
   "query")
diff --git a/tests/scripts/task_python_adreno.sh 
b/tests/scripts/task_python_adreno.sh
index 809df1ed2d..6d9c370aec 100755
--- a/tests/scripts/task_python_adreno.sh
+++ b/tests/scripts/task_python_adreno.sh
@@ -38,15 +38,20 @@ sleep 5   # Wait for tracker to bind
 
 export ANDROID_SERIAL=$1
 
-adb shell "mkdir -p /data/local/tmp/tvm_ci"
-adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_ci/tvm_rpc_ci
-adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_ci
+TARGET_FOLDER=/data/local/tmp/tvm_ci-${USER}
+adb shell "mkdir -p ${TARGET_FOLDER}"
+adb push build-adreno-target/tvm_rpc ${TARGET_FOLDER}/tvm_rpc-${USER}
+adb push build-adreno-target/libtvm_runtime.so ${TARGET_FOLDER}
+CPP_LIB=`find ${ANDROID_NDK_HOME} -name libc++_shared.so | grep aarch64`
+if [ -f ${CPP_LIB} ] ; then
+adb push ${CPP_LIB} ${TARGET_FOLDER}
+fi
 
 adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT}
 adb forward tcp:5000 tcp:5000
 adb forward tcp:5001 tcp:5001
 adb forward tcp:5002 tcp:5002
-env adb shell "cd /data/local/tmp/tvm_ci; killall -9 tvm_rpc_ci; sleep 2; 
LD_LIBRARY_PATH=/data/local/tmp/tvm_ci/ ./tvm_rpc_ci server --host=0.0.0.0 
--port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} 
--key=${RPC_DEVICE_KEY}" &
+env adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; 
LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 
--port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} 
--key=${RPC_DEVICE_KEY}" &
 DEVICE_PID=$!
 sleep 5 # Wait for the device connections
 trap "{ kill ${TRACKER_PID}; kill ${DEVICE_PID}; }" 0

[tvm] branch main updated: [METAL] Fix int8 vectorized cast (#14962)

2023-05-26 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 6198c7fd8a [METAL] Fix int8 vectorized cast (#14962)
6198c7fd8a is described below

commit 6198c7fd8a75534d98efd0ef800b36fc4e3dc021
Author: Siyuan Feng 
AuthorDate: Fri May 26 22:22:46 2023 +0800

[METAL] Fix int8 vectorized cast (#14962)

The current codegen output `(half4)*(device uint*)A` tries to create an `int32`
number and then cast it to `half4`, which is not the expected behavior.

As Metal supports the `uchar4` and `char4` types, we can use them directly to
solve that problem.
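
The difference between the two interpretations can be seen in plain Python.
This sketch models only the arithmetic, not the generated Metal code; it
assumes little-endian byte order, and both function names are hypothetical.

```python
import struct


def cast_as_packed_int(raw4):
    """Treat four bytes as one 32-bit unsigned integer, then convert it."""
    (packed,) = struct.unpack("<I", raw4)
    return float(packed)


def cast_per_lane(raw4):
    """Treat four bytes as four 8-bit lanes and convert each one."""
    return [float(lane) for lane in raw4]
```

For the bytes 1, 2, 3, 4 the packed interpretation yields the single value
67305985.0, while the per-lane interpretation yields [1.0, 2.0, 3.0, 4.0],
which is what a vectorized cast is meant to produce.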
---
 src/target/source/codegen_metal.cc |  5 
 tests/python/unittest/test_target_codegen_metal.py | 30 +-
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/src/target/source/codegen_metal.cc 
b/src/target/source/codegen_metal.cc
index bd2b930166..b7105e4bcd 100644
--- a/src/target/source/codegen_metal.cc
+++ b/src/target/source/codegen_metal.cc
@@ -220,11 +220,6 @@ void CodeGenMetal::PrintType(DataType t, std::ostream& os) 
{  // NOLINT(*)
 if (t.is_uint()) {
   os << 'u';
 }
-if (t.bits() == 8 && t.lanes() == 4) {
-  // directly 4 8 bit int in integer.
-  os << "int";
-  return;
-}
 switch (t.bits()) {
   case 8:
 os << "char";
diff --git a/tests/python/unittest/test_target_codegen_metal.py 
b/tests/python/unittest/test_target_codegen_metal.py
index 3b1cdb4422..dcbbba8c9c 100644
--- a/tests/python/unittest/test_target_codegen_metal.py
+++ b/tests/python/unittest/test_target_codegen_metal.py
@@ -14,12 +14,12 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-import tvm
-from tvm import te
 import numpy as np
 
-import tvm.testing
+import tvm
 import tvm.script
+import tvm.testing
+from tvm import te
 from tvm.script import tir as T
 
 
@@ -149,7 +149,25 @@ def test_select_vectorize():
 np.testing.assert_allclose(b_nd.numpy(), a, atol=1e-5, rtol=1e-5)
 
 
+@tvm.testing.requires_gpu
+@tvm.testing.requires_metal
+def test_vectorized_uint8():
+    @T.prim_func
+    def func(A: T.Buffer((16), "uint8"), B: T.Buffer((16), "float32")):
+        for i in T.thread_binding(4, thread="threadIdx.x"):
+            for j in T.vectorized(4):
+                with T.block("block"):
+                    vi = T.axis.spatial(16, i * 4 + j)
+                    B[vi] = T.Cast("float32", A[vi])
+
+    dev = tvm.metal()
+    a = np.arange(16).astype("uint8")
+    a_nd = tvm.nd.array(a, dev)
+    b_nd = tvm.nd.empty((16,), "float32", dev)
+    f = tvm.build(func, target="metal")
+    f(a_nd, b_nd)
+    np.testing.assert_allclose(b_nd.numpy(), a.astype("float32"), atol=1e-5, rtol=1e-5)
+
+
 if __name__ == "__main__":
-test_ramp()
-test_metal_inf_nan()
-test_metal_erf()
+tvm.testing.main()
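
The problem described in the commit message of #14962 can be illustrated on the host side. The sketch below is not Metal code and is not part of the commit; it only assumes a little-endian layout to show why reading four `uint8` lanes as one 32-bit integer and converting that single number differs from converting each lane on its own, which is what a `uchar4` cast does. The function names are hypothetical.

```python
import struct


def reinterpret_as_uint32(lanes):
    """Read four uint8 lanes as one little-endian 32-bit unsigned integer."""
    return struct.unpack("<I", bytes(lanes))[0]


def cast_per_lane(lanes):
    """Convert every uint8 lane to a float separately (the intended behavior)."""
    return [float(v) for v in lanes]


lanes = [0, 1, 2, 3]
packed = reinterpret_as_uint32(lanes)  # one scalar built from all four bytes
per_lane = cast_per_lane(lanes)        # four independent values
print(packed, per_lane)
```

The packed scalar carries no per-lane meaning once it is converted to a floating-point vector, which is why the fix emits the native 8-bit vector types instead.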

[tvm] branch main updated (1c39613811 -> bcf7abba29)

2023-05-25 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 1c39613811 [LLVM] Expose Host CPU Feature Detection (#14946)
 add bcf7abba29 Fix pytorch axis (#14930)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/pytorch.py  | 3 ++-
 tests/python/frontend/pytorch/test_forward.py | 6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

[tvm] branch main updated: [Target] Add target to all TVM callbacks (#14939)

2023-05-25 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new e11913be06 [Target] Add target to all TVM callbacks (#14939)
e11913be06 is described below

commit e11913be06b30d806e6c09a6610b4e8fbaaa30a0
Author: Junru Shao 
AuthorDate: Wed May 24 23:10:41 2023 -0700

[Target] Add target to all TVM callbacks (#14939)

* [Target] Add target to all TVM callbacks

This PR adds an extra parameter `target` to all `tvm_callback_*` so that
the callback can decide its own behavior by querying which target to
compile against.

* fix lint

* fix lint
---
 apps/ios_rpc/tests/ios_rpc_mobilenet.py| 27 +++---
 apps/ios_rpc/tests/ios_rpc_test.py | 13 ++-
 apps/topi_recipe/broadcast/test_broadcast_map.py   | 12 --
 apps/topi_recipe/conv/depthwise_conv2d_test.py | 13 +--
 apps/topi_recipe/conv/test_conv2d_hwcn_map.py  |  9 
 apps/topi_recipe/reduce/test_reduce_map.py | 10 
 apps/topi_recipe/rnn/lstm.py   |  9 
 apps/topi_recipe/rnn/matexp.py | 13 ++-
 jvm/core/src/test/scripts/test_add_gpu.py  |  4 ++--
 python/tvm/contrib/nvcc.py |  6 ++---
 python/tvm/contrib/sdaccel.py  | 10 
 src/target/opt/build_cuda_on.cc|  4 ++--
 src/target/source/codegen_aocl.cc  |  2 +-
 src/target/source/codegen_metal.cc |  2 +-
 src/target/source/codegen_opencl.cc|  2 +-
 src/target/source/codegen_vhls.cc  |  5 ++--
 src/target/spirv/spirv_utils.cc|  2 +-
 tests/python/integration/test_ewise.py |  5 ++--
 tests/python/integration/test_ewise_fpga.py|  3 +--
 .../test_tir_transform_inject_ptx_async_copy.py|  2 +-
 20 files changed, 76 insertions(+), 77 deletions(-)

diff --git a/apps/ios_rpc/tests/ios_rpc_mobilenet.py 
b/apps/ios_rpc/tests/ios_rpc_mobilenet.py
index b90b459280..0c958a07d8 100644
--- a/apps/ios_rpc/tests/ios_rpc_mobilenet.py
+++ b/apps/ios_rpc/tests/ios_rpc_mobilenet.py
@@ -15,24 +15,24 @@
 # specific language governing permissions and limitations
 # under the License.
 
-import tvm
-from tvm import rpc, relay
-from tvm.contrib.download import download_testdata
-from tvm.relay.expr_functor import ExprMutator
-from tvm.relay import transform
-from tvm.relay.op.annotation import compiler_begin, compiler_end
-from tvm.relay.quantize.quantize import prerequisite_optimize
-from tvm.contrib import utils, xcode, graph_executor, coreml_runtime
-from tvm.contrib.target import coreml as _coreml
-
+import argparse
 import os
 import re
 import sys
+
+import coremltools
 import numpy as np
+import tvm
 from mxnet import gluon
 from PIL import Image
-import coremltools
-import argparse
+from tvm import relay, rpc
+from tvm.contrib import coreml_runtime, graph_executor, utils, xcode
+from tvm.contrib.download import download_testdata
+from tvm.contrib.target import coreml as _coreml
+from tvm.relay import transform
+from tvm.relay.expr_functor import ExprMutator
+from tvm.relay.op.annotation import compiler_begin, compiler_end
+from tvm.relay.quantize.quantize import prerequisite_optimize
 
 # Change target configuration, this is setting for iphone6s
 # arch = "x86_64"
@@ -43,9 +43,10 @@ target_host = "llvm -mtriple=%s-apple-darwin" % arch
 
 MODES = {"proxy": rpc.connect, "tracker": rpc.connect_tracker, "standalone": 
rpc.connect}
 
+
 # override metal compiler to compile to iphone
 @tvm.register_func("tvm_callback_metal_compile")
-def compile_metal(src):
+def compile_metal(src, target):
 return xcode.compile_metal(src, sdk=sdk)
 
 
diff --git a/apps/ios_rpc/tests/ios_rpc_test.py 
b/apps/ios_rpc/tests/ios_rpc_test.py
index 94340dcd4e..78f1d3a6c5 100644
--- a/apps/ios_rpc/tests/ios_rpc_test.py
+++ b/apps/ios_rpc/tests/ios_rpc_test.py
@@ -20,15 +20,15 @@ To use it, start a rpc proxy with "python -m 
tvm.exec.rpc_proxy".
 And configure the proxy host field as commented.
 """
 
-import tvm
-from tvm import te
+import argparse
 import os
 import re
 import sys
-from tvm import rpc
-from tvm.contrib import utils, xcode
+
 import numpy as np
-import argparse
+import tvm
+from tvm import rpc, te
+from tvm.contrib import utils, xcode
 
 # Change target configuration, this is setting for iphone6s
 arch = "arm64"
@@ -37,9 +37,10 @@ target = "llvm -mtriple=%s-apple-darwin" % arch
 
 MODES = {"proxy": rpc.connect, "tracker": rpc.connect_tracker, "standalone": 
rpc.connect}
 
+
 # override metal compiler to compile to iphone
 @tvm.register_func("tvm_cal

[tvm] branch main updated: [Relay] add a dimension check to reject invalid input (#14925)

2023-05-23 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d776bccc1e [Relay] add a dimension check to reject invalid input 
(#14925)
d776bccc1e is described below

commit d776bccc1edb90b06d06b4ea4a94157115b1d972
Author: Qingchao Shen 
AuthorDate: Wed May 24 13:28:17 2023 +0800

[Relay] add a dimension check to reject invalid input (#14925)

* add a dimension check

* Update test_forward.py

* Update pytorch.py

* Update pytorch.py
---
 python/tvm/relay/frontend/pytorch.py  | 6 ++
 tests/python/frontend/pytorch/test_forward.py | 5 +
 2 files changed, 11 insertions(+)

diff --git a/python/tvm/relay/frontend/pytorch.py 
b/python/tvm/relay/frontend/pytorch.py
index 08bf5d517c..f3bcda7c23 100644
--- a/python/tvm/relay/frontend/pytorch.py
+++ b/python/tvm/relay/frontend/pytorch.py
@@ -4642,6 +4642,12 @@ def _get_relay_input_vars(graph, input_infos, prelude, 
is_module=True, default_d
 ):
 msg = "Shapes of input list and information in the graph do 
not match"
 raise RuntimeError(msg)
+if len(ishape) > 1 and any(dim <= 0 for dim in ishape[1:]):
+msg = (
+"Expected input's non-batch dimensions to have positive 
length, "
+f"but input has a shape of {pt_type.sizes()}"
+)
+raise RuntimeError(msg)
 pt_dtype = pt_type.scalarType()
 if not pt_dtype and itype:
 pt_dtype = itype
diff --git a/tests/python/frontend/pytorch/test_forward.py 
b/tests/python/frontend/pytorch/test_forward.py
index b602c14df3..de1de6421b 100644
--- a/tests/python/frontend/pytorch/test_forward.py
+++ b/tests/python/frontend/pytorch/test_forward.py
@@ -789,6 +789,11 @@ def test_forward_celu():
 input_data = torch.tensor([-1.0, 2.0], dtype=torch.float32)
 verify_model(torch.nn.CELU().eval(), input_data=input_data)
 
+input_shape = [2, 0, 1]
+input_data = torch.rand(input_shape).float()
+with pytest.raises(RuntimeError):
+verify_model(torch.nn.CELU().eval(), input_data=input_data)
+
 
 @tvm.testing.uses_gpu
 def test_forward_gelu():
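
The dimension check added to `_get_relay_input_vars` in #14925 can be tried outside TVM. The following is a minimal sketch of the same condition as a standalone function; the name `check_non_batch_dims` is hypothetical, and the error message is adapted to print the shape argument rather than `pt_type.sizes()`.

```python
def check_non_batch_dims(shape):
    """Raise if any dimension after the first (batch) one is not positive."""
    if len(shape) > 1 and any(dim <= 0 for dim in shape[1:]):
        raise RuntimeError(
            "Expected input's non-batch dimensions to have positive length, "
            f"but input has a shape of {list(shape)}"
        )


check_non_batch_dims([2, 3, 4])  # accepted
check_non_batch_dims([0, 3])     # accepted: only the batch dimension is zero
try:
    check_non_batch_dims([2, 0, 1])  # the shape used in the new test case
except RuntimeError as err:
    print(err)
```

Note that the condition deliberately skips the first dimension, so an empty batch still passes while a zero-length inner dimension is rejected.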

[tvm] branch main updated (dddc339d2f -> 4f99750611)

2023-05-22 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from dddc339d2f [Test Cases][BugFix] Add some version check to make test 
cases run in all PyTorch versions (#14903)
 add 4f99750611 [TypoFix] fix some typo problem in keras frontend (#14916)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/keras.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

[tvm] branch main updated: [RUNTIME] Make systemlib unique per prefix (#14887)

2023-05-19 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 006b11df04 [RUNTIME] Make systemlib unique per prefix (#14887)
006b11df04 is described below

commit 006b11df046f3aecc076384fb96cb6733287b25f
Author: Tianqi Chen 
AuthorDate: Fri May 19 08:04:44 2023 -0400

[RUNTIME] Make systemlib unique per prefix (#14887)

This PR enhances systemlib to make it unique per prefix.
This can help reduce flakiness when the same lib is loaded multiple times.
---
 src/runtime/system_library.cc | 53 +--
 tests/python/unittest/test_target_codegen_blob.py |  2 +
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/runtime/system_library.cc b/src/runtime/system_library.cc
index be9257e53f..55335649a7 100644
--- a/src/runtime/system_library.cc
+++ b/src/runtime/system_library.cc
@@ -32,30 +32,30 @@
 namespace tvm {
 namespace runtime {
 
-class SystemLibraryRegistry {
+class SystemLibSymbolRegistry {
  public:
   void RegisterSymbol(const std::string& name, void* ptr) {
 std::lock_guard<std::mutex> lock(mutex_);
-auto it = tbl_.find(name);
-if (it != tbl_.end() && ptr != it->second) {
+auto it = symbol_table_.find(name);
+if (it != symbol_table_.end() && ptr != it->second) {
   LOG(WARNING) << "SystemLib symbol " << name << " get overriden to a 
different address " << ptr
<< "->" << it->second;
 }
-tbl_[name] = ptr;
+symbol_table_[name] = ptr;
   }
 
   void* GetSymbol(const char* name) {
 std::lock_guard<std::mutex> lock(mutex_);
-auto it = tbl_.find(name);
-if (it != tbl_.end()) {
+auto it = symbol_table_.find(name);
+if (it != symbol_table_.end()) {
   return it->second;
 } else {
   return nullptr;
 }
   }
 
-  static SystemLibraryRegistry* Global() {
-static SystemLibraryRegistry* inst = new SystemLibraryRegistry();
+  static SystemLibSymbolRegistry* Global() {
+static SystemLibSymbolRegistry* inst = new SystemLibSymbolRegistry();
 return inst;
   }
 
@@ -63,7 +63,7 @@ class SystemLibraryRegistry {
   // Internal mutex
   std::mutex mutex_;
   // Internal symbol table
-  std::unordered_map<std::string, void*> tbl_;
+  std::unordered_map<std::string, void*> symbol_table_;
 };
 
 class SystemLibrary : public Library {
@@ -80,22 +80,49 @@ class SystemLibrary : public Library {
   }
 
  private:
-  SystemLibraryRegistry* reg_ = SystemLibraryRegistry::Global();
+  SystemLibSymbolRegistry* reg_ = SystemLibSymbolRegistry::Global();
   std::string symbol_prefix_;
 };
 
+class SystemLibModuleRegistry {
+ public:
+  runtime::Module GetOrCreateModule(std::string symbol_prefix) {
+    std::lock_guard<std::mutex> lock(mutex_);
+    auto it = lib_map_.find(symbol_prefix);
+    if (it != lib_map_.end()) {
+      return it->second;
+    } else {
+      auto mod = CreateModuleFromLibrary(make_object<SystemLibrary>(symbol_prefix));
+      lib_map_[symbol_prefix] = mod;
+      return mod;
+    }
+  }
+
+  static SystemLibModuleRegistry* Global() {
+    static SystemLibModuleRegistry* inst = new SystemLibModuleRegistry();
+    return inst;
+  }
+
+ private:
+  // Internal mutex
+  std::mutex mutex_;
+  // we need to make sure each lib map have an unique
+  // copy through out the entire lifetime of the process
+  // so the cached PackedFunc in the system do not get out dated.
+  std::unordered_map<std::string, runtime::Module> lib_map_;
+};
+
+
 TVM_REGISTER_GLOBAL("runtime.SystemLib").set_body([](TVMArgs args, 
TVMRetValue* rv) {
   std::string symbol_prefix = "";
   if (args.size() != 0) {
 symbol_prefix = args[0].operator std::string();
   }
-  auto mod = CreateModuleFromLibrary(make_object<SystemLibrary>(symbol_prefix));
-  *rv = mod;
+  *rv = SystemLibModuleRegistry::Global()->GetOrCreateModule(symbol_prefix);
 });
 }  // namespace runtime
 }  // namespace tvm
 
 int TVMBackendRegisterSystemLibSymbol(const char* name, void* ptr) {
-  tvm::runtime::SystemLibraryRegistry::Global()->RegisterSymbol(name, ptr);
+  tvm::runtime::SystemLibSymbolRegistry::Global()->RegisterSymbol(name, ptr);
   return 0;
 }
diff --git a/tests/python/unittest/test_target_codegen_blob.py 
b/tests/python/unittest/test_target_codegen_blob.py
index d7683fd68c..2848c875a4 100644
--- a/tests/python/unittest/test_target_codegen_blob.py
+++ b/tests/python/unittest/test_target_codegen_blob.py
@@ -122,6 +122,8 @@ def test_cuda_multi_lib():
 b_nd = tvm.nd.array(a_np, dev)
 syslibA = tvm.runtime.system_lib("modA_")
 syslibB = tvm.runtime.system_lib("modB_")
+# reload same lib twice
+syslibA = tvm.runtime.system_lib("modA_")
 syslibA["my_inplace_update"](a_nd)
 syslibB["my_inplace_update"](b_nd)
 np.testing.assert_equal(a_nd.numpy(), a_np + 1)
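
The get-or-create pattern that `SystemLibModuleRegistry` introduces in #14887 can be sketched in Python. This is an illustration of the idea only, not TVM code: `ModuleRegistry` and its `factory` argument are hypothetical stand-ins for the C++ registry and `CreateModuleFromLibrary`.

```python
import threading


class ModuleRegistry:
    """Cache one module per symbol prefix for the lifetime of the process."""

    def __init__(self, factory):
        self._lock = threading.Lock()
        self._modules = {}
        self._factory = factory  # hypothetical: builds a module for a prefix

    def get_or_create(self, symbol_prefix):
        # The lock keeps concurrent callers from creating two modules
        # for the same prefix.
        with self._lock:
            if symbol_prefix not in self._modules:
                self._modules[symbol_prefix] = self._factory(symbol_prefix)
            return self._modules[symbol_prefix]


registry = ModuleRegistry(lambda prefix: {"prefix": prefix})
lib_a = registry.get_or_create("modA_")
lib_b = registry.get_or_create("modB_")
lib_a_again = registry.get_or_create("modA_")  # reload same lib twice
print(lib_a is lib_a_again, lib_a is lib_b)
```

Returning the cached object on the second request mirrors the updated test, which loads `modA_` twice and expects both handles to behave identically.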

[tvm] branch main updated (613ad5cc51 -> af06b30a98)

2023-05-18 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 613ad5cc51 [bugfix] fix the undeclared identifier 'f' (#14879)
 add af06b30a98 [CONTRIB] Enable create_staticlib to take in tar files 
(#14876)

No new revisions were added by this update.

Summary of changes:
 python/tvm/contrib/cc.py  | 50 +++
 python/tvm/contrib/tar.py | 41 +++
 tests/python/unittest/test_target_codegen_blob.py | 15 ---
 3 files changed, 100 insertions(+), 6 deletions(-)

[tvm] branch main updated: [bugfix][Relay] Fix softplus in paddlepaddle frontend (#14845)

2023-05-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 602133e6b9 [bugfix][Relay] Fix softplus in paddlepaddle frontend 
(#14845)
602133e6b9 is described below

commit 602133e6b9a3c7925abd5bbdb315f92fa7170c93
Author: Qingchao Shen 
AuthorDate: Mon May 15 12:34:16 2023 +0800

[bugfix][Relay] Fix softplus in paddlepaddle frontend (#14845)

* fix softplus in paddlepaddle.py

* add test case

* Update test_forward.py
---
 python/tvm/relay/frontend/paddlepaddle.py  | 5 -
 tests/python/frontend/paddlepaddle/test_forward.py | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/paddlepaddle.py 
b/python/tvm/relay/frontend/paddlepaddle.py
index fdbc96676f..4268a4876a 100755
--- a/python/tvm/relay/frontend/paddlepaddle.py
+++ b/python/tvm/relay/frontend/paddlepaddle.py
@@ -2163,7 +2163,10 @@ def convert_softplus(g, op, block):
 dtype = infer_type(x).checked_type.dtype
 beta = op.attr("beta")
 beta = _expr.const(beta, dtype=dtype)
-out = _op.log(_op.exp(x * beta) + _expr.const(1.0, dtype=dtype)) / beta
+threshold = op.attr("threshold")
+threshold = _expr.const(threshold, dtype=dtype)
+out_softplus = _op.log(_op.exp(x * beta) + _expr.const(1.0, dtype=dtype)) / beta
+out = _op.where(_op.greater(x * beta, threshold), x, out_softplus)
 g.add_node(op.output("Out")[0], out)
 
 
diff --git a/tests/python/frontend/paddlepaddle/test_forward.py 
b/tests/python/frontend/paddlepaddle/test_forward.py
index 289fc0faa3..1555ba1aaa 100755
--- a/tests/python/frontend/paddlepaddle/test_forward.py
+++ b/tests/python/frontend/paddlepaddle/test_forward.py
@@ -1722,7 +1722,9 @@ def test_forward_sin():
 
 @run_math_api
 def test_forward_softplus():
-pass
+x = paddle.to_tensor([-0.4, 1], dtype="float32")
+m = paddle.nn.Softplus(5, 1)
+verify_model(m, [x])
 
 
 @run_math_api
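
The thresholded softplus from #14845 can be checked with scalar arithmetic. The sketch below follows the expression in the diff: when `x * beta` exceeds `threshold` the input is returned unchanged, otherwise `log(exp(x * beta) + 1) / beta` is used. The function name and the default values for `beta` and `threshold` are assumptions made for illustration; the Relay version computes both branches and selects with `where`, whereas this version simply branches.

```python
import math


def softplus(x, beta=1.0, threshold=20.0):
    """Scalar softplus with the linear fallback above the threshold."""
    if x * beta > threshold:
        return x
    return math.log(math.exp(x * beta) + 1.0) / beta


# Same inputs and parameters as the new test case: Softplus(5, 1) on [-0.4, 1].
for value in (-0.4, 1.0):
    print(value, softplus(value, beta=5.0, threshold=1.0))
```

For `x = 1` the product `x * beta` is 5, which is above the threshold of 1, so the input passes through unchanged; this is the case the old conversion got wrong.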

[tvm] branch unity updated: [Unity][Training] Categorical cross entropy loss (#14757)

2023-05-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch unity
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/unity by this push:
 new 25a491d52b [Unity][Training] Categorical cross entropy loss (#14757)
25a491d52b is described below

commit 25a491d52bac4a043dd75f08fa24b31a2c7c1928
Author: Nikita Demashov 
AuthorDate: Fri May 12 10:09:53 2023 +0300

[Unity][Training] Categorical cross entropy loss (#14757)

Add categorical cross entropy loss

fix lint

Add targets dtype checker and description of the reason for two 
implementations

Lift the common part

fix dtype checker

fix lint
---
 python/tvm/relax/training/loss.py| 94 +++-
 tests/python/relax/test_training_loss.py | 76 ++
 2 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relax/training/loss.py 
b/python/tvm/relax/training/loss.py
index 466c2996e7..d98d2d727c 100644
--- a/python/tvm/relax/training/loss.py
+++ b/python/tvm/relax/training/loss.py
@@ -27,7 +27,7 @@ from typing_extensions import Literal
 from ..block_builder import BlockBuilder
 from ..expr import Expr, Var, Function, StructInfo
 
-from ..op import abs, sum, mean, subtract, multiply
+from ..op import abs, sum, mean, subtract, multiply, reshape, argmax
 from ..op.nn import log_softmax, nll_loss
 
 
@@ -290,3 +290,95 @@ class CrossEntropyLoss(Loss):
 bb.emit_func_output(loss)
 
 return bb.get()[self._loss_name]
+
+
+class CategoricalCrossEntropyLoss(Loss):
+r"""CategoricalCrossEntropyLoss.
+It is a combination of a converting one-hot target vector to a label,
+a log_softmax computation and a nll_loss.
+
+Parameters
+--
+reduction : Literal["mean", "sum", "none"]
+The reduction method to apply to output. Can be "mean", "sum" or 
"none".
+
+none : no reduction will be applied,
+mean : the sum of the output will be divided by the batch_size,
+sum : the output will be summed.
+
+ignore_index : int
+Specifies a target value that is ignored and does not contribute to 
the input gradient.
+"""
+
+ignore_index: int
+
+def __init__(
+self,
+reduction: Literal["mean", "sum", "none"] = "mean",
+ignore_index: int = -100,
+) -> None:
+super().__init__("categorical_cross_entropy_loss", 1, reduction)
+self.ignore_index = ignore_index
+
+def __call__(
+self,
+predictions: Union[Var, StructInfo],
+targets: Union[Var, StructInfo],
+weights: Optional[Union[Var, StructInfo]] = None,
+) -> Function:
+"""Get the relax function of CategoricalCrossEntropyLoss. If the 
parameters are
+struct info, it will create corresponding variables.
+
+Parameters
+--
+predictions : Union[Var, StructInfo]
+The predictions of the model in the calculation of loss.
+
+targets : Union[Var, StructInfo]
+The ground truth in the calculation of loss.
+
+weights : Optional[Union[Var, StructInfo]]
+a manual rescaling weight given to each class. It has to be a 
Tensor of size C.
+
+Returns
+---
+The relax function of CategoricalCrossEntropyLoss with the loss name 
as its global symbol.
+"""
+
+if not "int" in targets.dtype:
+raise TypeError(
+f"Dtype of targets expected to be int/uint. \
+  However, the dtype of targets is {targets.dtype}"
+)
+
+bb = BlockBuilder()
+
+predictions = _create_param_var(predictions, "predictions")
+targets = _create_param_var(targets, "targets")
+
+arg_list = [predictions, targets]
+if weights:
+weights = _create_param_var(weights, "weights")
+arg_list.append(weights)
+
+# In the case of ignore_index >= 0,
+# the nll_loss function is used to handle the ignore index.
+# In other cases where ignore_index is not needed, just use the simpe 
product.
+with bb.function(self._loss_name, arg_list):
+with bb.dataflow():
+logits = bb.emit(log_softmax(predictions))
+if self.ignore_index >= 0:
+targets = bb.emit(
+reshape(argmax(targets, axis=1), 
shape=(targets.struct_info.shape[0],))
+)
+loss = bb.emit_output(
+nll_loss(logits, targets, weights, self._reduction, 
self.ignore_index)
+)
+
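
The comment in the truncated diff above mentions two implementations: converting the one-hot targets to labels and using `nll_loss`, or using a simple product. The product branch is cut off in this message, so the following is only a sketch under stated assumptions (no class weights, no `ignore_index`, mean reduction, hypothetical function names) showing that the two formulations agree for one-hot targets.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of scores."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in row))
    return [v - log_sum for v in row]


def loss_via_labels(predictions, one_hot_targets):
    """argmax turns each one-hot row into a label, then NLL is averaged."""
    total = 0.0
    for scores, target in zip(predictions, one_hot_targets):
        label = max(range(len(target)), key=target.__getitem__)
        total -= log_softmax(scores)[label]
    return total / len(predictions)


def loss_via_product(predictions, one_hot_targets):
    """Multiply log-probabilities by the one-hot row and sum (assumed form)."""
    total = 0.0
    for scores, target in zip(predictions, one_hot_targets):
        log_probs = log_softmax(scores)
        total -= sum(t * lp for t, lp in zip(target, log_probs))
    return total / len(predictions)


predictions = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
targets = [[1, 0, 0], [0, 0, 1]]
print(loss_via_labels(predictions, targets), loss_via_product(predictions, targets))
```

The label-based form is the one that can honor an ignore index, since `nll_loss` works on class indices; the product form avoids the `argmax` and `reshape` steps when no index needs to be ignored.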

[tvm] branch main updated: [Relay][BugFix] fix a bug about ReLu in the threshold attribute which causes a different results with keras (#14824)

2023-05-11 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new fd2a510ccf [Relay][BugFix] fix a bug about ReLu in the threshold 
attribute which causes a different results with keras (#14824)
fd2a510ccf is described below

commit fd2a510ccf76efd7b678c4746c1fa72ce496b58d
Author: Qingchao Shen 
AuthorDate: Thu May 11 16:48:41 2023 +0800

[Relay][BugFix] fix a bug about ReLu in the threshold attribute which 
causes a different results with keras (#14824)

* fix relu threshold attribute

* Update test_forward.py
---
 python/tvm/relay/frontend/keras.py  | 2 ++
 tests/python/frontend/keras/test_forward.py | 1 +
 2 files changed, 3 insertions(+)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index ef94c74e03..b820ad586d 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -145,6 +145,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, 
data_layout, input_s
 axis = axis + 1 if axis < dims - 1 else 1
 return _op.nn.softmax(inexpr, axis=axis)
 if act_type == "ReLU":
+if np.isnan(keras_layer.threshold).any():
+raise tvm.error.OpAttributeInvalid("The threshold value of a ReLU 
cannot be None.")
 threshold = _expr.const(keras_layer.threshold, dtype="float32")
 if keras_layer.max_value and float(keras_layer.threshold) == 0:
 # f(x) = max_value, for x >= max_value
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index 92b8ce0a64..a5305430c2 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -227,6 +227,7 @@ class TestKeras:
 act_funcs = [
 keras_mod.layers.LeakyReLU(alpha=None),
 keras_mod.layers.LEU(2, 3, 4),
+keras_mod.layers.ReLU(threshold=None),
 ]
 data = keras_mod.layers.Input(shape=(2, 3, 4))
 for act_func in act_funcs:

[tvm] branch main updated (3829ebb68c -> cca7d78334)

2023-05-10 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 3829ebb68c [TVMScript] Handle AllocatedPoolInfo, ConstantPoolInfo, 
ConstantInfo (#14812)
 add cca7d78334 [OpenCLML] Transposed convolution support and other fixes 
(#14767)

No new revisions were added by this update.

Summary of changes:
 apps/cpp_rtvm/tvm_runner.h   |   3 +-
 cmake/modules/LibInfo.cmake  |   1 +
 python/tvm/relay/op/contrib/clml.py  |  45 +-
 src/relay/backend/contrib/clml/codegen.cc|  40 +++---
 src/runtime/contrib/clml/clml_runtime.cc | 176 ---
 src/support/libinfo.cc   |   1 +
 tests/python/contrib/test_clml/infrastructure.py |   2 +
 tests/python/contrib/test_clml/test_ops.py   | 107 +-
 8 files changed, 271 insertions(+), 104 deletions(-)

[tvm] branch main updated: [OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792)

2023-05-09 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 182de86b2c [OpenCL] Refactor OpenCL runtime to support SPIRV binary 
ingestion (#14792)
182de86b2c is described below

commit 182de86b2cddabbd3903efe88fbff52d654117d7
Author: masahi 
AuthorDate: Tue May 9 15:08:04 2023 +0900

[OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792)

* [OpenCL] Support SPIRV module ingestion

* compile fixed

* clean up

* fix build when vulkan is not enabled

* Introduce spirv_utils.cc

* add dummy impl for LowerToSPIRV in case vulkan is not enabled

* more fix

* TVM_USE_VULKAN -> TVM_ENABLE_SPIRV

* build fix when opencl is not enabled

* mv spirv_shader.h under new spirv folder

* mv spirv module to its own file

* fix after reorg

* build fix
---
 cmake/modules/OpenCL.cmake |   1 +
 cmake/modules/Vulkan.cmake |   1 +
 src/runtime/opencl/opencl_common.h |  60 +
 src/runtime/opencl/opencl_module.cc|  36 +++---
 src/runtime/opencl/opencl_module.h |  14 ++-
 src/runtime/opencl/opencl_module_spirv.cc  | 137 +
 .../vulkan_shader.h => spirv/spirv_shader.h}   |  16 +--
 src/runtime/vulkan/vulkan_module.cc|   6 +-
 src/runtime/vulkan/vulkan_module.h |   4 +-
 src/runtime/vulkan/vulkan_wrapped_func.h   |   6 +-
 src/target/opt/build_opencl_off.cc |   7 ++
 src/target/source/codegen_opencl.cc|   9 ++
 src/target/spirv/build_vulkan.cc   | 134 +---
 src/target/spirv/codegen_spirv.cc  |   5 +-
 src/target/spirv/codegen_spirv.h   |   6 +-
 src/target/spirv/spirv_support.cc  |   5 +-
 .../spirv/{build_vulkan.cc => spirv_utils.cc}  |  64 ++
 .../vulkan_module.h => target/spirv/spirv_utils.h} |  32 +++--
 18 files changed, 316 insertions(+), 227 deletions(-)

diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake
index 53199f19cb..f380ad75d1 100644
--- a/cmake/modules/OpenCL.cmake
+++ b/cmake/modules/OpenCL.cmake
@@ -41,6 +41,7 @@ endif(USE_AOCL)
 
 if(USE_OPENCL)
   tvm_file_glob(GLOB RUNTIME_OPENCL_SRCS src/runtime/opencl/*.cc)
+  list(APPEND COMPILER_SRCS src/target/spirv/spirv_utils.cc)
 
   if(${USE_OPENCL} MATCHES ${IS_TRUE_PATTERN})
 message(STATUS "Enabled runtime search for OpenCL library location")
diff --git a/cmake/modules/Vulkan.cmake b/cmake/modules/Vulkan.cmake
index 7470fb6125..1f303f3a03 100644
--- a/cmake/modules/Vulkan.cmake
+++ b/cmake/modules/Vulkan.cmake
@@ -34,4 +34,5 @@ if(USE_VULKAN)
   list(APPEND COMPILER_SRCS ${COMPILER_VULKAN_SRCS})
   list(APPEND TVM_LINKER_LIBS ${Vulkan_SPIRV_TOOLS_LIBRARY})
   list(APPEND TVM_RUNTIME_LINKER_LIBS ${Vulkan_LIBRARY})
+  add_definitions(-DTVM_ENABLE_SPIRV=1)
 endif(USE_VULKAN)
diff --git a/src/runtime/opencl/opencl_common.h 
b/src/runtime/opencl/opencl_common.h
index a8a4cf3dc6..d25d2db0eb 100644
--- a/src/runtime/opencl/opencl_common.h
+++ b/src/runtime/opencl/opencl_common.h
@@ -411,18 +411,16 @@ struct BufferDescriptor {
 // To make the call thread-safe, we create a thread-local kernel table
 // and lazily install new kernels into the kernel table when the kernel is 
called.
 // The kernels are recycled when the module get destructed.
-class OpenCLModuleNode : public ModuleNode {
+class OpenCLModuleNodeBase : public ModuleNode {
  public:
   // Kernel table reference entry.
   struct KTRefEntry {
 size_t kernel_id;
 size_t version;
   };
-  explicit OpenCLModuleNode(std::string data, std::string fmt,
-std::unordered_map 
fmap, std::string source)
-  : data_(data), fmt_(fmt), fmap_(fmap), source_(source) {}
+  explicit OpenCLModuleNodeBase(std::unordered_map 
fmap) : fmap_(fmap) {}
   // destructor
-  ~OpenCLModuleNode();
+  ~OpenCLModuleNodeBase();
 
   /*!
* \brief Get the global workspace
@@ -436,38 +434,56 @@ class OpenCLModuleNode : public ModuleNode {
 return ModulePropertyMask::kBinarySerializable | 
ModulePropertyMask::kRunnable;
   }
 
-  PackedFunc GetFunction(const std::string& name, const ObjectPtr& 
sptr_to_self) final;
-  void SaveToFile(const std::string& file_name, const std::string& format) 
final;
-  void SaveToBinary(dmlc::Stream* stream) final;
-  std::string GetSource(const std::string& format) final;
+  PackedFunc GetFunction(const std::string& name, const ObjectPtr& 
sptr_to_self) override;
+
   // Initialize the programs
-  void Init();
+  virtual void Init() = 0;
   // install a new kernel to thread local entry
-

[tvm] branch main updated (d6e0f1d001 -> e01cb479b8)

2023-05-07 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from d6e0f1d001 [NDArray] Allow arbitrary stride when the corresponding 
shape is 1 (#14797)
 add e01cb479b8 [bugfix][relay] fix wrong calculate logic about celu 
(#14796)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/pytorch.py  | 5 +++--
 tests/python/frontend/pytorch/test_forward.py | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

[tvm] branch main updated (706a5b2dd5 -> eca6edf7fa)

2023-05-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 706a5b2dd5 [Target][Minor] Add A6000 Target Tag (#14762)
 add eca6edf7fa [CI] sccache support while using ci.py under multi user 
environments (#14666)

No new revisions were added by this update.

Summary of changes:
 tests/scripts/ci.py | 1 +
 1 file changed, 1 insertion(+)

[tvm] branch main updated (4d37a0a078 -> f1033adf04)

2023-05-04 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 4d37a0a078 [CI] Upgrade CI (#14635)
 add f1033adf04 [TIR] Keep block annotations from tensorization (#14741)

No new revisions were added by this update.

Summary of changes:
 src/tir/schedule/primitive/blockize_tensorize.cc   |  9 +
 .../python/unittest/test_tir_schedule_tensorize.py | 40 +-
 2 files changed, 48 insertions(+), 1 deletion(-)

[tvm] branch main updated: [TOPI] Vectorize depthwise conv2d output operator (#14519)

2023-05-03 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new cc8cce50fa [TOPI] Vectorize depthwise conv2d output operator (#14519)
cc8cce50fa is described below

commit cc8cce50fabb5a92e5830bb81b5fa96fb613a698
Author: Jack Frankland <30410009+franklandj...@users.noreply.github.com>
AuthorDate: Wed May 3 21:18:41 2023 +0100

[TOPI] Vectorize depthwise conv2d output operator (#14519)

Depthwise Conv2D operations may consist of a convolution plus an output
operator, e.g. ReLU. This commit will:
* Apply vectorization across the inner channel loop when there is an
  output operator.
* Remove some unused variables in `schedule_depthwise_conv2d_nhwc`.
* Limit the loop splitting to 8 elements in the inner loop.
---
 python/tvm/topi/arm_cpu/depthwise_conv2d.py | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/python/tvm/topi/arm_cpu/depthwise_conv2d.py 
b/python/tvm/topi/arm_cpu/depthwise_conv2d.py
index b6c15a30c0..59660e6bb9 100644
--- a/python/tvm/topi/arm_cpu/depthwise_conv2d.py
+++ b/python/tvm/topi/arm_cpu/depthwise_conv2d.py
@@ -292,13 +292,13 @@ def schedule_depthwise_conv2d_nhwc(cfg, outs):
 out = outs[0]
 
 # space definition begin #
-n, h, w, c = s[out].op.axis
+_, h, w, c = s[out].op.axis
 # Split the number of input/output channels
-cfg.define_split("tile_c", c, num_outputs=2)
+cfg.define_split("tile_c", c, num_outputs=2, filter=lambda entry: 
entry.size[1] <= 8)
 # Split the height of the convolution
-_, hi = cfg.define_split("tile_h", h, num_outputs=2)
+cfg.define_split("tile_h", h, num_outputs=2)
 # Split the width of the convolution
-_, wi = cfg.define_split("tile_w", w, num_outputs=2)
+cfg.define_split("tile_w", w, num_outputs=2)
 # Additional out (e.g., requantization, bias addition, etc..)
 # 0: locate the output on the second last axis of the main compuation
 # 1: locate the output closest to the main computation
@@ -394,7 +394,8 @@ def schedule_depthwise_conv2d_nhwc(cfg, outs):
 ci_outer, ci_inner = s[out].split(ci, 4)
 s[out].vectorize(ci_inner)
 s[out].unroll(ci_outer)
-
+else:
+s[out].vectorize(ci)
 fused_n_ho = s[out].fuse(n, ho)
 return hi, wi, fused_n_ho
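
The effect of the `filter=lambda entry: entry.size[1] <= 8` argument in #14519 can be pictured with a small enumeration. This is a simplified sketch, not AutoTVM's actual candidate generation: it assumes a two-way split whose candidates are the factor pairs of the channel extent, and `split_candidates` is a hypothetical helper.

```python
def split_candidates(extent, max_inner=None):
    """List (outer, inner) factor pairs of extent, optionally capping inner."""
    pairs = [
        (extent // inner, inner)
        for inner in range(1, extent + 1)
        if extent % inner == 0
    ]
    if max_inner is not None:
        # Keep only splits whose inner loop has at most max_inner elements.
        pairs = [pair for pair in pairs if pair[1] <= max_inner]
    return pairs


print(split_candidates(32))               # all two-way factor splits
print(split_candidates(32, max_inner=8))  # inner loop limited to 8 elements
```

Capping the inner factor shrinks the tuning space to splits whose innermost channel loop is short enough to vectorize, which matches the third bullet of the commit message.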

[tvm] branch main updated: [bugfix][relay] Fix alpha attribute with None in ELU (#14742)

2023-05-01 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d1e1b4c8b2 [bugfix][relay] Fix alpha attribute with None in ELU  
(#14742)
d1e1b4c8b2 is described below

commit d1e1b4c8b2ff1ae7cdfbf8823ecffb3c305f078f
Author: Qingchao Shen 
AuthorDate: Tue May 2 12:54:04 2023 +0800

[bugfix][relay] Fix alpha attribute with None in ELU  (#14742)

* fix alpha in elu

* add test case
---
 python/tvm/relay/frontend/keras.py  |  2 ++
 tests/python/frontend/keras/test_forward.py | 19 +--
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index 4539c221c9..ef94c74e03 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -160,6 +160,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, 
data_layout, input_s
 raise tvm.error.OpAttributeInvalid("The alpha value of a LeakyReLU 
cannot be None.")
 return _op.nn.leaky_relu(inexpr, alpha=float(keras_layer.alpha))
 if act_type == "ELU":
+if np.isnan(keras_layer.alpha).any():
+raise tvm.error.OpAttributeInvalid("The alpha value of a ELU 
cannot be None.")
 alpha = keras_layer.alpha if hasattr(keras_layer, "alpha") else 1.0
 alpha = _expr.const(alpha, dtype="float32")
 return _get_elu(inexpr, alpha)
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index 1377c180ae..86e88d0764 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -214,19 +214,26 @@ class TestKeras:
 
 def test_forward_activations_except(self, keras_mod):
 """
-test invalid attribute alpha=None for LeakyReLU.
+test invalid attribute alpha=None for LeakyReLU and ELU.
 after version 2.3.1 in keras, checking was added to reject the invalid 
api call:
-LeakyReLU(alpha=None), (issue: 
https://github.com/tensorflow/tensorflow/pull/47017)
+LeakyReLU(alpha=None) and ELU(alpha=None),
+(see issue: https://github.com/tensorflow/tensorflow/pull/47017)
 Thus, it's necessary to check the keras version to avoid crash at 
LeakyReLU(alpha=None)
+and ELU(alpha=None)
 """
 if package_version.parse(keras_mod.__version__.split("-tf")[0]) <= 
package_version.parse(
 "2.3.1"
 ):
+act_funcs = [
+keras_mod.layers.LeakyReLU(alpha=None),
+keras_mod.layers.ELU(alpha=None),
+]
 data = keras_mod.layers.Input(shape=(2, 3, 4))
-layer = keras_mod.layers.LeakyReLU(alpha=None)(data)
-keras_model = keras_mod.models.Model(data, layer)
-with pytest.raises(tvm.error.OpAttributeInvalid):
-verify_keras_frontend(keras_model)
+for act_func in act_funcs:
+layer = act_func(data)
+keras_model = keras_mod.models.Model(data, layer)
+with pytest.raises(tvm.error.OpAttributeInvalid):
+verify_keras_frontend(keras_model)
 
 def test_forward_dense(self, keras_mod):
 """test_forward_dense"""

[tvm] branch main updated: [Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to passes conflict (#14707)

2023-04-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new f9ae487aee [Keras][Bugfix] fix a bug about alpha attribute in 
LeakyReLU which lead to passes conflict (#14707)
f9ae487aee is described below

commit f9ae487aeeeccf0997344813d7412f7c9bddde7f
Author: Qingchao Shen 
AuthorDate: Fri Apr 28 17:21:52 2023 +0800

[Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to 
passes conflict (#14707)

* fix a bug in LeakyReLU

* add CI test

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py

* Update test_forward.py
---
 python/tvm/relay/frontend/keras.py  |  2 ++
 tests/python/frontend/keras/test_forward.py | 19 +++
 2 files changed, 21 insertions(+)

diff --git a/python/tvm/relay/frontend/keras.py 
b/python/tvm/relay/frontend/keras.py
index 532c93c399..4539c221c9 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -156,6 +156,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, 
data_layout, input_s
 return _op.multiply(negative_slope, _op.subtract(inexpr, 
threshold))
 return _op.nn.relu(inexpr)
 if act_type == "LeakyReLU":
+if np.isnan(keras_layer.alpha).any():
+raise tvm.error.OpAttributeInvalid("The alpha value of a LeakyReLU 
cannot be None.")
 return _op.nn.leaky_relu(inexpr, alpha=float(keras_layer.alpha))
 if act_type == "ELU":
 alpha = keras_layer.alpha if hasattr(keras_layer, "alpha") else 1.0
diff --git a/tests/python/frontend/keras/test_forward.py 
b/tests/python/frontend/keras/test_forward.py
index 50fa97cbca..1377c180ae 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -15,11 +15,13 @@
 # specific language governing permissions and limitations
 # under the License.
 """Unit tests for various models and operators"""
+from packaging import version as package_version
 import numpy as np
 import tvm
 from tvm import relay
 from tvm.contrib import graph_executor
 import tvm.testing
+import pytest
 
 try:
 import tensorflow.compat.v1 as tf
@@ -210,6 +212,22 @@ class TestKeras:
 verify_keras_frontend(keras_model)
 verify_keras_frontend(keras_model, need_transpose=False, 
layout="NHWC")
 
+def test_forward_activations_except(self, keras_mod):
+"""
+test invalid attribute alpha=None for LeakyReLU.
+after version 2.3.1 in keras, checking was added to reject the invalid 
api call:
+LeakyReLU(alpha=None), (issue: 
https://github.com/tensorflow/tensorflow/pull/47017)
+Thus, it's necessary to check the keras version to avoid crash at 
LeakyReLU(alpha=None)
+"""
+if package_version.parse(keras_mod.__version__.split("-tf")[0]) <= 
package_version.parse(
+"2.3.1"
+):
+data = keras_mod.layers.Input(shape=(2, 3, 4))
+layer = keras_mod.layers.LeakyReLU(alpha=None)(data)
+keras_model = keras_mod.models.Model(data, layer)
+with pytest.raises(tvm.error.OpAttributeInvalid):
+verify_keras_frontend(keras_model)
+
 def test_forward_dense(self, keras_mod):
 """test_forward_dense"""
 data = keras_mod.layers.Input(shape=(32, 32, 1))
@@ -749,6 +767,7 @@ if __name__ == "__main__":
 sut.test_forward_merge_dot(keras_mod=k)
 sut.test_forward_merge(keras_mod=k)
 sut.test_forward_activations(keras_mod=k)
+sut.test_forward_activations_except(keras_mod=k)
 sut.test_forward_dense(keras_mod=k)
 sut.test_forward_permute(keras_mod=k)
 sut.test_forward_sequential(keras_mod=k)

[tvm] branch main updated: [Tensorflow] Fix conv2d_transpose for NHWC layout (#14546)

2023-04-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 62f9b1d29a [Tensorflow] Fix conv2d_transpose for NHWC layout (#14546)
62f9b1d29a is described below

commit 62f9b1d29ae25fbdeb425bfc600c5dac7c23f694
Author: Qingchao Shen 
AuthorDate: Fri Apr 14 15:12:29 2023 +0800

[Tensorflow] Fix conv2d_transpose for NHWC layout (#14546)

* [Tensorflow] Fix conv2d_transpose for NHWC layout

If "data_format" == "NHWC", the kernel_layout should be "HWOI" rather than 
"HWIO".

* remove dead code

* add test cases

* Update test_forward.py

* Update test_forward.py

* Update tensorflow_ops.py

* Update tensorflow_ops.py
---
 python/tvm/relay/frontend/tensorflow_ops.py  |  4 ++--
 tests/python/frontend/tensorflow/test_forward.py | 21 -
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/python/tvm/relay/frontend/tensorflow_ops.py 
b/python/tvm/relay/frontend/tensorflow_ops.py
index 6b3f144619..27374fad1a 100644
--- a/python/tvm/relay/frontend/tensorflow_ops.py
+++ b/python/tvm/relay/frontend/tensorflow_ops.py
@@ -464,8 +464,8 @@ def _conv(opname):
 if opname == "conv":
 attr["kernel_layout"] = "HWIO" if attr["data_format"] == 
"NHWC" else "OIHW"
 elif opname == "conv_transpose":
-# conv_transpose in TVM has weights be IOHW for NCHW
-attr["kernel_layout"] = "HWIO" if attr["data_format"] == 
"NHWC" else "IOHW"
+# conv_transpose has weights be IOHW, because the 
attr["data_format"] always be NCHW
+attr["kernel_layout"] = "IOHW"
 else:
 attr["kernel_layout"] = "HWOI" if attr["data_format"] == 
"NHWC" else "OIHW"
 
diff --git a/tests/python/frontend/tensorflow/test_forward.py 
b/tests/python/frontend/tensorflow/test_forward.py
index 703df79942..bd966fa71c 100644
--- a/tests/python/frontend/tensorflow/test_forward.py
+++ b/tests/python/frontend/tensorflow/test_forward.py
@@ -742,7 +742,16 @@ def test_forward_convolution():
 "NCHW",
 [1, 1, 8, 8],
 )
-
+_test_convolution(
+"conv_transpose",
+[4, 19, 8, 8],
+[2, 2, 66, 19],
+[1, 1],
+[2, 2],
+"VALID",
+"NCHW",
+[4, 66, 16, 16],
+)
 _test_convolution("conv", [4, 8, 8, 176], [1, 1, 176, 32], [1, 1], [1, 1], 
"SAME", "NHWC")
 _test_convolution("conv", [4, 17, 17, 19], [3, 3, 19, 19], [1, 1], [2, 2], 
"VALID", "NHWC")
 _test_convolution("conv", [4, 17, 17, 124], [1, 1, 124, 19], [1, 1], [1, 
1], "SAME", "NHWC")
@@ -917,6 +926,16 @@ def test_forward_convolution():
 [4, 8, 8, 176],
 add_shapes_to_graph_def=False,
 )
+_test_convolution(
+"conv_transpose",
+[4, 8, 8, 19],
+[2, 2, 66, 19],
+[1, 1],
+[2, 2],
+"VALID",
+"NHWC",
+[4, 16, 16, 66],
+)
 # Explicit padding
 if package_version.parse(tf.VERSION) >= package_version.parse("2.4.1"):
 _test_convolution(
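
Note on the two new `conv_transpose` test cases above: a small sketch that checks their expected output shapes by hand. It assumes VALID padding, no dilation and no output padding, so each spatial size is `(in - 1) * stride + kernel`, and it reads the filter shape `[2, 2, 66, 19]` as height, width, output channels, input channels. The function names are hypothetical.

```python
def conv_transpose_out_size(in_size, kernel, stride):
    """Spatial output size of a transposed convolution with VALID padding."""
    return (in_size - 1) * stride + kernel


def conv_transpose_out_shape(in_shape, filter_shape, strides, data_format):
    """Output shape for a filter laid out as [H, W, out_channels, in_channels]."""
    kernel_h, kernel_w, out_channels, _ = filter_shape
    if data_format == "NCHW":
        batch, _, height, width = in_shape
    else:  # "NHWC"
        batch, height, width, _ = in_shape
    out_h = conv_transpose_out_size(height, kernel_h, strides[0])
    out_w = conv_transpose_out_size(width, kernel_w, strides[1])
    if data_format == "NCHW":
        return [batch, out_channels, out_h, out_w]
    return [batch, out_h, out_w, out_channels]
```

Both layouts give a 16x16 spatial output with 66 channels, which matches the output shapes passed to `_test_convolution` in the diff.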

[tvm] branch main updated (a7a1980480 -> f79e4ebf30)

2023-04-10 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from a7a1980480 [CI] Update ci_cpu image and build with llvm-15 (#14466)
 add f79e4ebf30 [Bugfix][Topi] Output strides in pack_buffer() utility 
(#14566)

No new revisions were added by this update.

Summary of changes:
 include/tvm/topi/detail/extern.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

[tvm] branch main updated: [OpenCL][Textures] Always use SSA for texture loading (#14397)

2023-03-29 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 4011280b16 [OpenCL][Textures] Always use SSA for texture loading  
(#14397)
4011280b16 is described below

commit 4011280b16d89678cddd716d214b1840e4de73ec
Author: Egor Churaev 
AuthorDate: Thu Mar 30 08:35:14 2023 +0300

[OpenCL][Textures] Always use SSA for texture loading  (#14397)

* [OpenCL][Textures] Always use SSA for texture loading

In some cases we must use SSA for texture loading but we didn't do
that. Examples of such cases:
1. Storing a texture (NCHW4c) directly (w/o a temporary buffer) to the
   output buffer (NCHW). In this case we have to use SSA because we
   need to get only one channel from the pixel. When storing to a
   local buffer, SSA was used because the buffer was allocated in the
   kernel and the logic was that an allocated buffer implies SSA.
   But when the same texture was stored directly to the output
   buffer, SSA wasn't used and the resulting OpenCL code didn't
   compile.
2. Casting a texture (NCHW4c) to another data type and then storing it to
   the buffer (NCHW). SSA for textures was disabled for cast
   operations. As a result, a single channel had to be taken from the
   pixel, but we got the vector data type (e.g. float4) and then
   tried to cast it to a scalar data type. This code also didn't
   compile.

In this PR the SSA form is enabled for all cases where `texture2d_load` is
used. The relevant test cases were added.

* Add regression test on injective

* Fix lit

* Add skip for FP16 test

* Add additional test cases

* Fix lint

* Apply comment

* Fix lint

-

Co-authored-by: Andrey Malyshev 
---
 src/target/source/codegen_opencl.cc|  48 +--
 src/target/source/codegen_opencl.h |   5 -
 .../relay/opencl_texture/test_injection_texture.py |  85 +
 .../python/unittest/test_target_codegen_opencl.py  |   6 +-
 .../unittest/test_target_texture_codegen_opencl.py | 375 +
 5 files changed, 469 insertions(+), 50 deletions(-)

diff --git a/src/target/source/codegen_opencl.cc 
b/src/target/source/codegen_opencl.cc
index 613b1d0847..525ee95f41 100644
--- a/src/target/source/codegen_opencl.cc
+++ b/src/target/source/codegen_opencl.cc
@@ -382,33 +382,6 @@ std::string CodeGenOpenCL::CastTo(std::string value, 
DataType target) {
   return os.str();
 }
 
-void CodeGenOpenCL::VisitStmt_(const BufferStoreNode* op) {
-  if (auto call = op->value.as<CallNode>()) {
-if (call->op.same_as(builtin::texture2d_load())) {
-  need_texture_ssa_ = false;
-  // If storing a texture load into a buffer, don't use an
-  // intermediate local unless the buffer allocation is a
-  // single element selected from the texture read.
-  auto it = allocation_size_.find(op->buffer->data.get());
-  if (it != allocation_size_.end() && it->second == 1) {
-need_texture_ssa_ = true;
-  }
-}
-  }
-  CodeGenC::VisitStmt_(op);
-  need_texture_ssa_ = true;
-}
-
-void CodeGenOpenCL::VisitExpr_(const CastNode* op, std::ostream& os) {
-  if (auto call = op->value.as<CallNode>()) {
-if (call->op.same_as(builtin::texture2d_load())) {
-  need_texture_ssa_ = false;
-}
-  }
-  CodeGenC::VisitExpr_(op, os);
-  need_texture_ssa_ = true;
-}
-
 void CodeGenOpenCL::VisitStmt_(const AllocateNode* op) {
   allocation_size_.insert({op->buffer_var.get(), op->ConstantAllocationSize() 
* op->dtype.lanes()});
   CodeGenC::VisitStmt_(op);
@@ -472,20 +445,15 @@ void CodeGenOpenCL::VisitExpr_(const CallNode* op, 
std::ostream& os) {
 this->PrintExpr(op->args[2], ss);
 ss << ")))";
 
-// Only use local SSA if texture is not already being stored
-if (need_texture_ssa_) {
-  std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4));
-  if (op->args.back().as<RampNode>()) {
-os << rhs;
-  } else {
-os << "((";
-this->PrintType(op->dtype.with_lanes(1), os);
-os << "*)&" << rhs << ")[";
-this->PrintExpr(op->args.back(), os);
-os << "]";
-  }
+std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4));
+if (op->args.back().as<RampNode>()) {
+  os << rhs;
 } else {
-  os << ss.str();
+  os << "((";
+  this->PrintType(op->dtype.with_lanes(1), os);
+  os << "*)&" << rhs << ")[";
+  this->PrintExpr(op->args.back(), os);
+  os << "]";
 }
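
Note on the code path that remains after this change: a rough Python sketch of the string the `os <<` calls in the diff assemble once the texture read has been bound to an SSA name. The function and the `index=None` convention (standing in for the case where the type check on the last argument succeeds) are hypothetical; only the emitted pieces `((`, `*)&`, `)[` and `]` are taken from the diff.

```python
def emit_texture_element(rhs, scalar_type, index=None):
    """Build the expression that reads from the SSA variable `rhs`.

    index=None models the branch that prints `rhs` as-is (whole vector);
    otherwise a single lane is selected through a pointer cast.
    """
    if index is None:
        return rhs
    return "((" + scalar_type + "*)&" + rhs + ")[" + index + "]"


# Hypothetical SSA name "v_" holding a 4-lane texel; pick lane 2 as a float.
expr = emit_texture_element("v_", "float", "2")
```

Because the read always goes through the SSA variable, a single channel can be selected before a store or a cast, which are the two failing cases described in the commit message.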

[tvm] branch main updated (970cd1def8 -> 6eb4b873e1)

2023-03-14 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 970cd1def8 [TIR][Hexagon] Enhancement of NarrowDataType pass for 
binary ops (#14298)
 add 6eb4b873e1 [DOCS][ADRENO] Improved Adreno documentation (#13867)

No new revisions were added by this update.

Summary of changes:
 docs/how_to/deploy/adreno.rst  | 718 +++--
 .../how_to/deploy_models/deploy_model_on_adreno.py | 309 +
 .../deploy_models/deploy_model_on_adreno_tvmc.py   | 198 ++
 tests/python/relay/opencl_texture/test_network.py  |  40 +-
 tests/scripts/setup-adreno-env.sh  | 113 
 tests/scripts/task_build_adreno_bins.sh|   1 -
 tests/scripts/task_config_build_adreno.sh  |   1 -
 7 files changed, 1063 insertions(+), 317 deletions(-)
 create mode 100644 gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py
 create mode 100755 tests/scripts/setup-adreno-env.sh

[tvm] branch main updated: [COMMUNITY] Cheng Wen -> Reviewer (#14153)

2023-02-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 428400c6e0 [COMMUNITY] Cheng Wen -> Reviewer (#14153)
428400c6e0 is described below

commit 428400c6e0574f5bf43619ca2684397a2e14ba21
Author: Siyuan Feng 
AuthorDate: Wed Mar 1 12:41:12 2023 +0800

[COMMUNITY] Cheng Wen -> Reviewer (#14153)

Please join me in welcoming @chengven027-intellif as a new Reviewer in TVM.
Cheng has contributed to ONNX/PyTorch frontend and Relay passes, making TVM 
support more input models.

- [Commits 
History](https://github.com/apache/tvm/pulls?q=author%3Achengven027-intellif+)
- [Code 
Review](https://github.com/apache/tvm/pulls?q=reviewed-by%3Achengven027-intellif+)
---
 CONTRIBUTORS.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md
index a0ff1bc57e..4304d6c647 100644
--- a/CONTRIBUTORS.md
+++ b/CONTRIBUTORS.md
@@ -196,6 +196,7 @@ We do encourage everyone to work anything they are 
interested in.
 - [Alex Weaver](https://github.com/alex-weaver): @alex-weaver
 - [Logan Weber](https://github.com/weberlo): @weberlo
 - [Matt Welsh](https://github.com/mdw-octoml): @mdw-octoml
+- [Cheng Wen](https://github.com/chengven027-intellif): @chengven027-intellif
 - [Jian Weng](https://github.com/were): @were
 - [wrongtest](https://github.com/wrongtest-intellif): @wrongtest-intellif
 - [Yong Wu](https://github.com/yongwww): @yongwww

[tvm] branch main updated: [DOCKER] Configurable NDK version support (#14000)

2023-02-26 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d9b0a80e1b [DOCKER] Configurable NDK version support (#14000)
d9b0a80e1b is described below

commit d9b0a80e1b97fae695d54eec6627c6b7fe840d3b
Author: Siva 
AuthorDate: Mon Feb 27 09:36:27 2023 +0530

[DOCKER] Configurable NDK version support (#14000)

Make the Android NDK version configurable via a command-line argument
---
 docker/Dockerfile.ci_adreno |  5 +---
 docker/Dockerfile.ci_cpu|  3 --
 docker/Dockerfile.ci_hexagon|  2 --
 docker/Dockerfile.demo_android  |  2 --
 docker/install/ubuntu_install_androidsdk.sh | 43 -
 5 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/docker/Dockerfile.ci_adreno b/docker/Dockerfile.ci_adreno
index 8f4ede3a5e..11be0a8baa 100644
--- a/docker/Dockerfile.ci_adreno
+++ b/docker/Dockerfile.ci_adreno
@@ -22,10 +22,7 @@ COPY utils/apt-install-and-clear.sh 
/usr/local/bin/apt-install-and-clear
 
 # Android SDK
 COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh
-RUN bash /install/ubuntu_install_androidsdk.sh
-ENV ANDROID_HOME=/opt/android-sdk-linux
-ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147
-ENV ANDROID_NDK_MAJOR=21
+RUN bash /install/ubuntu_install_androidsdk.sh 25.2.9519653 3.22.1 33.0.2 33
 ENV PATH /opt/android-sdk-linux/platform-tools:$PATH
 
 # Clang tool for CLML source codegen
diff --git a/docker/Dockerfile.ci_cpu b/docker/Dockerfile.ci_cpu
index 74e90670ab..c75bf31361 100644
--- a/docker/Dockerfile.ci_cpu
+++ b/docker/Dockerfile.ci_cpu
@@ -128,9 +128,6 @@ RUN bash /install/ubuntu_install_vitis_ai_packages_ci.sh
 # Android SDK
 COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh
 RUN bash /install/ubuntu_install_androidsdk.sh
-ENV ANDROID_HOME=/opt/android-sdk-linux/
-ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147/
-ENV ANDROID_NDK_MAJOR=21
 
 # PaddlePaddle deps
 COPY install/ubuntu_install_paddle.sh /install/ubuntu_install_paddle.sh
diff --git a/docker/Dockerfile.ci_hexagon b/docker/Dockerfile.ci_hexagon
index bf4e33fc7c..3b4c58ef43 100644
--- a/docker/Dockerfile.ci_hexagon
+++ b/docker/Dockerfile.ci_hexagon
@@ -62,8 +62,6 @@ RUN bash /install/ubuntu_install_java.sh
 COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh
 RUN bash /install/ubuntu_install_androidsdk.sh
 ENV ANDROID_HOME=/opt/android-sdk-linux
-ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147
-ENV ANDROID_NDK_MAJOR=21
 ENV PATH /opt/android-sdk-linux/platform-tools:$PATH
 
 # Hexagon
diff --git a/docker/Dockerfile.demo_android b/docker/Dockerfile.demo_android
index ebfacabb4e..42bf705106 100644
--- a/docker/Dockerfile.demo_android
+++ b/docker/Dockerfile.demo_android
@@ -81,5 +81,3 @@ RUN cd /usr && \
 # Environment variables
 ENV PYTHONPATH=/usr/tvm/python:/usr/tvm/vta/python:${PYTHONPATH}
 ENV ANDROID_HOME=/opt/android-sdk-linux/
-ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147/
-ENV ANDROID_NDK_MAJOR=21
diff --git a/docker/install/ubuntu_install_androidsdk.sh 
b/docker/install/ubuntu_install_androidsdk.sh
index bf2d80cd28..5e7278c5d6 100755
--- a/docker/install/ubuntu_install_androidsdk.sh
+++ b/docker/install/ubuntu_install_androidsdk.sh
@@ -26,6 +26,32 @@ ASDKTOOLS_HOME=/opt/android-sdk-tools
 ASDKTOOLS_VERSION=3859397
 
ASDKTOOLS_SHA256=444e22ce8ca0f67353bda4b85175ed3731cae3ffa695ca18119cbacef1c1bea0
 
+ANDROID_NDK_VERSION=21.3.6528147
+CMAKE_VERSION=3.6.4111459
+BUILD_TOOLS_VERSION=27.0.3
+ANDROID_PLATFORM=27
+ANDROID_NDK_MAJOR=21
+
+if [ $# -gt 0 ] ; then
+  ANDROID_NDK_VERSION=$1
+  ANDROID_NDK_MAJOR=`echo $1 | cut -d "." -f1`
+fi
+if [ $# -gt 1 ] ; then
+  CMAKE_VERSION=$2
+fi
+if [ $# -gt 2 ] ; then
+  BUILD_TOOLS_VERSION=$3
+fi
+if [ $# -gt 3 ] ; then
+  ANDROID_PLATFORM=$4
+fi
+
+echo "NDK Version: ${ANDROID_NDK_VERSION}"
+echo "NDK Major  : ${ANDROID_NDK_MAJOR}"
+echo "Cmake Version: ${CMAKE_VERSION}"
+echo "Build Tools: ${BUILD_TOOLS_VERSION}"
+echo "Android Platform: ${ANDROID_PLATFORM}"
+
 wget -q 
http://dl.google.com/android/repository/sdk-tools-linux-${ASDKTOOLS_VERSION}.zip
 -O sdk-tools-linux.zip
 echo "${ASDKTOOLS_SHA256} *sdk-tools-linux.zip" | sha256sum --check -
 unzip sdk-tools-linux.zip
@@ -52,8 +78,8 @@ done
 
 cat >/install/package-list-minimal.txt </dev/null || true
@@ -77,9 +103,12 @@ touch /root/.android/repositories.cfg
 # NOTE: sdkmanager returns exit code 141
 (yes || true) | sdkmanager --licenses --sdk_root="$ANDROID_HOME" || true
 sdkmanager --verbose --package_file=/install/package-list-minimal.txt 
--sdk_root="$ANDROID_HOME"
-test -d "${ANDROID_HOME}/build-tools/2
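
Note on the positional arguments added to ubuntu_install_androidsdk.sh above: a Python sketch of the same defaulting logic, for illustration only. The default values and the argument order (NDK version, CMake version, build tools version, Android platform) are taken from the diff; the function name and the dictionary keys are hypothetical.

```python
DEFAULTS = ("21.3.6528147", "3.6.4111459", "27.0.3", "27")


def resolve_versions(args):
    """Override the defaults with up to four positional arguments."""
    values = list(DEFAULTS)
    for position, arg in enumerate(args[:4]):
        values[position] = arg
    ndk, cmake, build_tools, platform = values
    return {
        "ndk": ndk,
        # Same effect as `echo $1 | cut -d "." -f1` in the script.
        "ndk_major": ndk.split(".")[0],
        "cmake": cmake,
        "build_tools": build_tools,
        "platform": platform,
    }


# Arguments used by Dockerfile.ci_adreno in the diff.
adreno = resolve_versions(["25.2.9519653", "3.22.1", "33.0.2", "33"])
```

With no arguments the NDK major version stays at 21; with the Dockerfile.ci_adreno arguments it becomes 25.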

[tvm] branch main updated (aa927df6b3 -> e7ad4bc071)

2023-02-16 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from aa927df6b3 [Metaschedule] EvolutionarySearchNode::State constructor 
typo fix (#14002)
 add e7ad4bc071 [CLML] Changes corresponding to OpenCL workspace 
refactorization (#13972)

No new revisions were added by this update.

Summary of changes:
 src/runtime/contrib/clml/clml_runtime.cc | 127 +--
 src/runtime/opencl/opencl_common.h   |   2 +
 2 files changed, 70 insertions(+), 59 deletions(-)

[tvm] branch main updated (bea4919a75 -> 266ff51d2a)

2023-02-13 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from bea4919a75 [TVMScript] Encourage using T.Buffer directly (#13971)
 add 266ff51d2a [ONNX] Extend reduction types supported by ScatterND 
(#13946)

No new revisions were added by this update.

Summary of changes:
 include/tvm/relay/attrs/transform.h   |  5 ++-
 python/tvm/relay/frontend/onnx.py | 53 ++-
 python/tvm/relay/op/transform.py  |  8 +++-
 python/tvm/topi/cuda/scatter.py   | 16 +++-
 python/tvm/topi/scatter.py| 28 +-
 tests/python/frontend/onnx/test_forward.py|  2 -
 tests/python/relay/test_op_level3.py  | 19 --
 tests/python/topi/python/test_topi_scatter.py | 18 +++--
 8 files changed, 123 insertions(+), 26 deletions(-)

[tvm] branch main updated: [ONNX] Support Bitwise operations (#13888)

2023-02-03 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 583f5ae37c [ONNX] Support Bitwise operations (#13888)
583f5ae37c is described below

commit 583f5ae37c3b34ac8e1ce36fccf22c1e7eb89ab1
Author: Valery Chernov 
AuthorDate: Fri Feb 3 14:30:55 2023 +0400

[ONNX] Support Bitwise operations (#13888)

* add base class for bitwise operations. BitwiseAnd, BitwiseNot, BitwiseOr 
and BitwiseXor were implemented

* add test for BitwiseAnd, BitwiseNot, BitwiseOr, BitwiseXor operations to 
ONNX front-end

* add test of BitShift for ONNX front-end

* fix dtype for test

* skip test due to old version of ORT

-

Co-authored-by: Valery Chernov 
---
 python/tvm/relay/frontend/onnx.py  |  83 ++--
 tests/python/frontend/onnx/test_forward.py | 150 +
 2 files changed, 225 insertions(+), 8 deletions(-)

diff --git a/python/tvm/relay/frontend/onnx.py 
b/python/tvm/relay/frontend/onnx.py
index 8b4a0cc5e8..8de5e0e08b 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -5578,13 +5578,31 @@ class ConvInteger(OnnxOpConverter):
 )
 
 
-class BitShift(OnnxOpConverter):
-"""Operator converter for NonZero"""
+class BitwiseBase(OnnxOpConverter):
+"""Base class of operator converter for Bitwise operations"""
+
+name = ""
+
+@classmethod
+def check_inputs(cls, inputs, num=2, use_int=True):
+assert len(inputs) == num, "{} takes {} inputs, {} 
given".format(cls.name, num, len(inputs))
+
+valid_types = ["uint8", "uint16", "uint32", "uint64"]
+if use_int:
+valid_types += ["int8", "int16", "int32", "int64"]
+for i in range(num):
+in_dtype = infer_type(inputs[i]).checked_type.dtype
+assert in_dtype in valid_types, "Wrong dtype of the {}-th input: 
{}".format(i, in_dtype)
+
+
+class BitShift(BitwiseBase):
+"""Operator converter for BitShift"""
+
+name = "BitShift"
 
 @classmethod
 def _impl_v11(cls, inputs, attr, params):
-if len(inputs) != 2:
-raise ValueError("Bitshift expects 2 inputs")
+cls.check_inputs(inputs, use_int=False)
 
 direction = attr.get("direction", "LEFT").decode("ascii")
 if direction == "LEFT":
@@ -5596,6 +5614,54 @@ class BitShift(OnnxOpConverter):
 return out
 
 
+class BitwiseAnd(BitwiseBase):
+"""Operator converter for BitwiseAnd"""
+
+name = "BitwiseAnd"
+
+@classmethod
+def _impl_v18(cls, inputs, attr, params):
+cls.check_inputs(inputs)
+
+return _op.bitwise_and(*inputs)
+
+
+class BitwiseNot(BitwiseBase):
+"""Operator converter for BitwiseNot"""
+
+name = "BitwiseNot"
+
+@classmethod
+def _impl_v18(cls, inputs, attr, params):
+cls.check_inputs(inputs, num=1)
+
+return _op.bitwise_not(*inputs)
+
+
+class BitwiseOr(BitwiseBase):
+"""Operator converter for BitwiseOr"""
+
+name = "BitwiseOr"
+
+@classmethod
+def _impl_v18(cls, inputs, attr, params):
+cls.check_inputs(inputs)
+
+return _op.bitwise_or(*inputs)
+
+
+class BitwiseXor(BitwiseBase):
+"""Operator converter for BitwiseXor"""
+
+name = "BitwiseXor"
+
+@classmethod
+def _impl_v18(cls, inputs, attr, params):
+cls.check_inputs(inputs)
+
+return _op.bitwise_xor(*inputs)
+
+
 class Unique(OnnxOpConverter):
 """Operator converter for unique"""
 
@@ -6319,7 +6385,12 @@ def _get_convert_map(opset):
 "OptionalHasElement": OptionalHasElement.get_converter(opset),
 "OptionalGetElement": OptionalGetElement.get_converter(opset),
 "Affine": Affine.get_converter(opset),
+# Bitwise operators
 "BitShift": BitShift.get_converter(opset),
+"BitwiseAnd": BitwiseAnd.get_converter(opset),
+"BitwiseNot": BitwiseNot.get_converter(opset),
+"BitwiseOr": BitwiseOr.get_converter(opset),
+"BitwiseXor": BitwiseXor.get_converter(opset),
 "ThresholdedRelu": ThresholdedRelu.get_converter(opset),
 "ScaledTanh": ScaledTanh.get_converter(opset),
 "ParametricSoftplus": ParametricSoftPlus.get_converter(opset),
@@ -6337,10 +6408
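
Note on `BitwiseBase.check_inputs` in the diff above: a standalone sketch of the same validation that works on plain dtype strings instead of Relay expressions, so it runs without TVM. The function below is a hypothetical stand-in; it raises exceptions where the original uses `assert`, and it skips the `infer_type` call.

```python
UNSIGNED_TYPES = ["uint8", "uint16", "uint32", "uint64"]
SIGNED_TYPES = ["int8", "int16", "int32", "int64"]


def check_inputs(name, dtypes, num=2, use_int=True):
    """Validate the input count and dtypes of a bitwise operator."""
    if len(dtypes) != num:
        raise ValueError(
            "{} takes {} inputs, {} given".format(name, num, len(dtypes))
        )
    valid_types = list(UNSIGNED_TYPES)
    if use_int:
        valid_types += SIGNED_TYPES
    for i, dtype in enumerate(dtypes):
        if dtype not in valid_types:
            raise TypeError(
                "Wrong dtype of the {}-th input: {}".format(i, dtype)
            )
```

In the diff, BitShift passes `use_int=False` and therefore accepts only unsigned inputs, BitwiseNot passes `num=1`, and the other bitwise operators use the defaults.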

[tvm] branch main updated: [CLML][CODEGEN] CLML native codegen utility (#13837)

2023-02-02 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d35a8ab135 [CLML][CODEGEN] CLML native codegen utility (#13837)
d35a8ab135 is described below

commit d35a8ab1353afc40317396b2ddfda8f35a99ba8a
Author: Siva 
AuthorDate: Fri Feb 3 11:35:55 2023 +0530

[CLML][CODEGEN] CLML native codegen utility (#13837)

* [CLML][CODEGEN] CLML native codegen utility

This util generates native CLML code given a DNN model.
It imports the model via tvmc, extracts clml_modules, gets the JSON source and
finally generates clml_models.cc, which holds the source for the various sub graphs.
The cpp_clml tool has additional infrastructure to compile it as a standalone
binary that runs these models.

This PR adds the symbol name to the generated JSON graph.
It also extends the const_loader interface to get constant params.

* * review comments

* * review

* * review
---
 apps/cpp_clml/CMakeLists.txt   |  61 ++
 apps/cpp_clml/README.md| 145 
 apps/cpp_clml/clml_runner.cc   | 818 +
 apps/cpp_clml/clml_runner.h| 262 +++
 apps/cpp_clml/main.cc  | 243 ++
 apps/cpp_clml/scripts/clml_codegen.py  |  64 ++
 cmake/modules/contrib/CLML.cmake   |   2 +-
 docker/Dockerfile.ci_adreno|   3 +
 python/tvm/relay/op/contrib/clml.py| 772 +++
 .../backend/contrib/codegen_json/codegen_json.h|   1 +
 src/runtime/const_loader_module.cc |  10 +
 src/runtime/contrib/json/json_runtime.h|   3 +
 12 files changed, 2383 insertions(+), 1 deletion(-)

diff --git a/apps/cpp_clml/CMakeLists.txt b/apps/cpp_clml/CMakeLists.txt
new file mode 100644
index 00..8c0fd53bf9
--- /dev/null
+++ b/apps/cpp_clml/CMakeLists.txt
@@ -0,0 +1,61 @@
+cmake_minimum_required(VERSION 3.13)
+
+project(clml_run VERSION 2.0)
+
+if(NOT DEFINED CMAKE_TOOLCHAIN_FILE)
+  message( FATAL_ERROR "CMAKE_TOOLCHAIN_FILE Not set, forcing exit. Suggested 
value: {ANDROID_NDK_PATH}/build/cmake/android.toolchain.cmake." )
+endif(NOT DEFINED CMAKE_TOOLCHAIN_FILE)
+
+if(NOT DEFINED ANDROID_ABI)
+  message( FATAL_ERROR "ANDROID_ABI Not set, forcing exit. Suggested value(s): 
arm64-v8a (64), armeabi-v7a (32)" )
+endif(NOT DEFINED ANDROID_ABI)
+
+if(NOT DEFINED CLML_SDK)
+  message( FATAL_ERROR "CLML_SDK Not set, forcing exit." )
+endif(NOT DEFINED CLML_SDK)
+
+if (CMAKE_FIND_ROOT_PATH_MODE_LIBRARY STREQUAL "ONLY")
+  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY BOTH)
+endif()
+
+find_library(CLML_LIBRARIES NAMES libOpenCL.so NO_DEFAULT_PATH PATHS 
${CLML_SDK}/lib ${CLML_SDK}/lib64)
+
+# CMake/Android variables
+set( ANDROID_STL  c++_static CACHE STRING "Target Android STL") # default
+
+# Source variables
+set( OPENCL_INCLUDE_DIRS  ${CLML_SDK} CACHE PATH "filepath to OpenCL headers")
+
+set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED True)
+
+#we do not want to pass -fno-exceptions
+if(${CMAKE_CXX_FLAGS} MATCHES "-fno-exceptions")
+  message ( WARNING "Disabling -fno-exceptions")
+  string(REGEX REPLACE "-fno-exceptions" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS})
+endif()
+
+#we do not want to pass -fno-rtti
+if(${CMAKE_CXX_FLAGS} MATCHES "-fno-rtti")
+  message ( WARNING "Disabling -fno-rtti")
+  string(REGEX REPLACE "-fno-rtti" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS})
+endif()
+
+set(COMMON_SOURCE_FILES
+clml_models.cc
+clml_runner.cc
+clml_runner.h
+main.cc
+../../3rdparty/cnpy/cnpy.cpp
+)
+
+include_directories(
+src
+${OPENCL_INCLUDE_DIRS}
+"../../3rdparty/dmlc-core/include"
+"../../3rdparty/cnpy/"
+)
+
+add_executable(clml_run ${COMMON_SOURCE_FILES})
+target_link_options(clml_run PRIVATE 
-Wl,--unresolved-symbols=ignore-in-shared-libs)
+target_link_libraries(clml_run ${CLML_LIBRARIES} z)
diff --git a/apps/cpp_clml/README.md b/apps/cpp_clml/README.md
new file mode 100644
index 00..3200492122
--- /dev/null
+++ b/apps/cpp_clml/README.md
@@ -0,0 +1,145 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+# OpenCLML Debug Tool
+
+Tool to generate an OpenCLML source file from a model in any framework and compile it as a native application that runs on an Android target.
+This tool helps to debug or triage OpenCLML offloaded subgraphs as a standalone application.
+
+### Codegen
+
+Models can be downloaded from well-known frameworks such as TensorFlow, PyTorch, TFLite, ONNX, etc.
+Assuming  ```resnet50.h5``` is a Keras ResNet50 model file, use the below

[tvm] branch main updated: [TOPHUB] use keys as a keyword for searching of existing statistics (#13874)

2023-02-02 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new ea34e6eb0b [TOPHUB] use keys as a keyword for searching of existing 
statistics (#13874)
ea34e6eb0b is described below

commit ea34e6eb0bd47b397a6c29b18b5ff23ef88f4998
Author: Andrey Malyshev 
AuthorDate: Thu Feb 2 16:43:05 2023 +0200

[TOPHUB] use keys as a keyword for searching of existing statistics (#13874)

* [TOPHUB] use keys as a keyword for searching of existing statistics

For ARM, -device may be omitted, in which case llvm is used as the
target. Even then we can determine the proper statistics file name,
because the target keys carry the architecture. The same should hold
for x86.

* Add test on target not having arm_cpu device

* minor fix, add comment

* Fix pylint

* Fix comment
---
 python/tvm/autotvm/tophub.py   | 10 ++
 tests/python/unittest/test_autotvm_dispatch_context.py | 16 
 2 files changed, 26 insertions(+)

diff --git a/python/tvm/autotvm/tophub.py b/python/tvm/autotvm/tophub.py
index f705d591e6..99dd312d87 100644
--- a/python/tvm/autotvm/tophub.py
+++ b/python/tvm/autotvm/tophub.py
@@ -106,10 +106,20 @@ def context(target, extra_files=None):
 if isinstance(tgt, str):
 tgt = Target(tgt)
 
+# The TOPHUB file names rely on the Target's device or kind. Both kinds of
+# information exist in Target.keys, but because the rules for filling this field
+# are not explicitly defined, we are reluctant to rely on Target.keys alone.
+# At the same time, Target.device is filled only if the device was given explicitly
+# in the target string, which is not mandatory, so in some cases we need to get
+# the device information from Target.keys.
+# In priority order we verify:
+# 1) Target.device
+# 2) Target.keys
+# 3) Target.kind
 possible_names = []
 device = tgt.attrs.get("device", "")
 if device != "":
 possible_names.append(_alias(device))
+possible_names.extend(tgt.keys)
 possible_names.append(tgt.kind.name)
 
 all_packages = list(PACKAGE_VERSION.keys())
diff --git a/tests/python/unittest/test_autotvm_dispatch_context.py 
b/tests/python/unittest/test_autotvm_dispatch_context.py
index 6ca062047f..ba75992128 100644
--- a/tests/python/unittest/test_autotvm_dispatch_context.py
+++ b/tests/python/unittest/test_autotvm_dispatch_context.py
@@ -19,6 +19,7 @@ The dispatcher can choose which template to use according
 to the parameters of workload"""
 
 from tvm import autotvm
+import tvm
 
 
 @autotvm.template("testing/dispatch_fallback")
@@ -31,5 +32,20 @@ def test_fallback():
 simple_template(2, 3)
 
 
+def test_tophub_kinds_match():
+def verify_arm_cpu(target):
+best_by_targetkey = autotvm.tophub.context(target).best_by_targetkey
+assert len(best_by_targetkey)
+found_arm_cpu = False
+for a, _ in best_by_targetkey:
+if "arm_cpu" in a:
+found_arm_cpu = True
+break
+assert found_arm_cpu
+
+verify_arm_cpu("llvm -device=arm_cpu -mtriple=aarch64-linux-gnu -mattr=+neon,+v8.2a,+dotprod")
+verify_arm_cpu("llvm -model=snapdragon835 -mtriple=arm64-linux-android -mattr=+neon")
+
+
 if __name__ == "__main__":
 test_fallback()
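
The priority order listed in the tophub.py comment of this commit (Target.device, then Target.keys, then Target.kind) can be sketched without TVM as follows. This is a minimal illustration and not the tophub.py code: `candidate_names` and its `aliases` mapping are hypothetical names, and the key lists are assumed example values.

```python
def candidate_names(device, keys, kind, aliases=None):
    """Return candidate statistics-package names in priority order.

    device  -- explicit device string, or "" when the target string has none
    keys    -- target keys (assumed to include the architecture)
    kind    -- target kind name, used as the last resort
    aliases -- hypothetical device-name alias table
    """
    aliases = aliases or {}
    names = []
    if device:
        # 1) explicit device, after alias resolution
        names.append(aliases.get(device, device))
    # 2) every target key
    names.extend(keys)
    # 3) the target kind
    names.append(kind)
    return names


# Target with an explicit device.
print(candidate_names("arm_cpu", ["arm_cpu", "cpu"], "llvm"))
# Target without a device: the keys still supply "arm_cpu".
print(candidate_names("", ["arm_cpu", "cpu"], "llvm"))
```

In the second call the device is empty, yet "arm_cpu" is still the first candidate because it comes from the keys, which is the case the added test exercises.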

[tvm] branch main updated: [QNN][Relay][Topi] Add qnn.dense with weight layout (#13854)

2023-02-02 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 37e1a6862c [QNN][Relay][Topi] Add qnn.dense with weight layout (#13854)
37e1a6862c is described below

commit 37e1a6862ca1bb77e33ca9c03e1365d50f468bd9
Author: ibsidorenko <98739392+ibsidore...@users.noreply.github.com>
AuthorDate: Thu Feb 2 17:21:00 2023 +0300

[QNN][Relay][Topi] Add qnn.dense with weight layout (#13854)

* [Hexagon][QNN] Improve performance of qnn.mul

This commit improves performance of the qnn.mul operation without QNN
canonicalization.

* [QNN][Relay][Topi] Add qnn.dense with weight layout

This commit adds a new Relay operation "qnn.contrib_dense_pack" that supports
different weight layouts (nn.dense and qnn.dense do not support this
attribute). This new operation is a full analog of the "nn.contrib_dense_pack"
operation, but in QNN space.
---
 python/tvm/relay/qnn/op/_qnn.py|  11 +-
 python/tvm/relay/qnn/op/legalizations.py   | 134 -
 python/tvm/relay/qnn/op/qnn.py |  64 ++
 python/tvm/relay/qnn/strategy/generic.py   |   6 +
 python/tvm/relay/qnn/strategy/hexagon.py   |  18 ++
 python/tvm/topi/hexagon/qnn/__init__.py|   1 +
 .../hexagon/qnn/{__init__.py => dense_alter_op.py} |  26 +--
 python/tvm/topi/hexagon/qnn/nn.py  | 216 +
 python/tvm/topi/nn/qnn.py  |  19 ++
 src/relay/backend/te_compiler_cache.cc |  20 +-
 src/relay/op/nn/nn.h   |   5 +
 src/relay/qnn/op/dense.cc  | 105 +-
 .../contrib/test_arm_compute_lib/test_dense.py |   6 +-
 .../test_hexagon/test_wo_qnn_canonicalization.py   | 172 +++-
 tests/python/relay/test_pass_qnn_legalize.py   |  92 +
 15 files changed, 779 insertions(+), 116 deletions(-)

diff --git a/python/tvm/relay/qnn/op/_qnn.py b/python/tvm/relay/qnn/op/_qnn.py
index c9c4c86e8b..e2157a051a 100644
--- a/python/tvm/relay/qnn/op/_qnn.py
+++ b/python/tvm/relay/qnn/op/_qnn.py
@@ -93,7 +93,16 @@ def alter_op_layout_qnn_conv2d(attrs, inputs, tinfos, 
out_type):
 
 # qnn.dense
 register_strategy("qnn.dense", strategy.qnn_dense_strategy)
-register_pattern("qnn.dense", OpPattern.OUT_ELEMWISE_FUSABLE)
+
+
+@register_alter_op_layout("qnn.dense")
+def alter_op_layout_qnn_dense(attrs, inputs, tinfos, out_type):
+"""Alternate the layout of qnn.dense"""
+return topi.nn.qnn_dense_alter_layout(attrs, inputs, tinfos, out_type)
+
+
+# qnn.contrib_dense_pack
+register_strategy("qnn.contrib_dense_pack", strategy.qnn_dense_pack_strategy)
 
 # qnn.batch_matmul
 register_strategy("qnn.batch_matmul", strategy.qnn_batch_matmul_strategy)
diff --git a/python/tvm/relay/qnn/op/legalizations.py 
b/python/tvm/relay/qnn/op/legalizations.py
index ef368a016e..53cb41c2fb 100644
--- a/python/tvm/relay/qnn/op/legalizations.py
+++ b/python/tvm/relay/qnn/op/legalizations.py
@@ -340,6 +340,62 @@ def helper_change_dtypes_to_int8(attrs, inputs, types, 
relay_op):
 )
 
 
+def helper_change_dtypes_to_uint8(attrs, inputs, types, relay_op):
+"""Helper function to change dtypes to uint8 x uint8.
+Legalizes QNN dense op for Hexagon DSP. It supports fast u8 x u8 vrmpy instruction.
+
+Converting from int8 to uint8 can be done in the following manner:
+
+Original equation
+  scale * (QA - zp_a)
+  scale * (QA + 128 - 128 - zp_a)
+  scale * ( (QA + 128) - (zp_a + 128))
+
+Replacing QA + 128 with QA' and (zp_a + 128) with zp_a'
+We get our new quantized uint8 tensor - scale * (QA' - zp_a')
+
+Parameters
+--
+attrs : tvm.ir.Attrs
+Attributes of current convolution
+inputs : list of tvm.relay.Expr
+The args of the Relay expr to be legalized
+types : list of types
+List of input and output types
+
+Returns
+---
+result : tvm.relay.Expr
+The legalized expr
+"""
+# Collect the dtypes.
+data_dtype = types[0].dtype
+kernel_dtype = types[1].dtype
+
+# Do nothing since it is already uint8.
+if data_dtype == "uint8" and kernel_dtype == "uint8":
+return None
+
+# Collect the input exprs.
+data, kernel, input_zero_point, kernel_zero_point, input_scale, kernel_scale = inputs
+
+# Shift input if necessary.
+if data_dtype == "int8":
+# Compute (QA + 128) and (zp_a + 128)
+data, input_zero_point = _shift(data, input_zero_point, "uint8")
+
+# Shift kernel if necessary.
+if kernel_dtype == "int8":
+# Compute (QA + 128) an
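
The int8 to uint8 identity in the docstring above, scale * (QA - zp_a) = scale * ((QA + 128) - (zp_a + 128)), can be checked numerically with plain Python. This is only a worked example of the arithmetic: `dequantize` and `shift_to_uint8` are hypothetical helpers, not the `_shift` function used in legalizations.py, and the scale and zero point are assumed values.

```python
def dequantize(q, zero_point, scale):
    """Real value represented by quantized value q."""
    return scale * (q - zero_point)


def shift_to_uint8(q_int8, zp_int8):
    """Shift an int8 value and its zero point into the uint8 range."""
    return q_int8 + 128, zp_int8 + 128


scale, zp = 0.5, -3  # assumed example quantization parameters
for q in (-128, -3, 0, 127):
    q_u8, zp_u8 = shift_to_uint8(q, zp)
    # The shifted value fits in uint8 ...
    assert 0 <= q_u8 <= 255
    # ... and represents exactly the same real number.
    assert dequantize(q, zp, scale) == dequantize(q_u8, zp_u8, scale)
    print(q, "->", q_u8, "value", dequantize(q_u8, zp_u8, scale))
```

Because both the value and the zero point move by the same offset, their difference is unchanged, so the represented real value is preserved.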

[tvm] branch main updated: [RUNTIME] Fix the manual determination of cores in FillDataForMeasure (#13849)

2023-02-01 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new f0ea9e461a [RUNTIME] Fix the manual determination of cores in 
FillDataForMeasure (#13849)
f0ea9e461a is described below

commit f0ea9e461a4bc056d320dafe651135c3a8357a34
Author: dsbarinov1 <71228944+dsbarin...@users.noreply.github.com>
AuthorDate: Thu Feb 2 08:14:49 2023 +0300

[RUNTIME] Fix the manual determination of cores in FillDataForMeasure 
(#13849)

* Assertion failed during tuning

* Cleanup

* Do not commit

* Do not commit

* Undo fix + provide test for multithread random filling

* Random fill test with fix enabled

* Isolate the effect of this test on the other tests

* Correct the typo in the function name

* Import threading + lint
---
 src/runtime/contrib/random/mt_random_engine.cc | 10 -
 tests/python/contrib/test_random.py| 28 ++
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/src/runtime/contrib/random/mt_random_engine.cc 
b/src/runtime/contrib/random/mt_random_engine.cc
index ac52594360..dc01114af0 100644
--- a/src/runtime/contrib/random/mt_random_engine.cc
+++ b/src/runtime/contrib/random/mt_random_engine.cc
@@ -192,12 +192,12 @@ class RandomEngine {
 struct ParallelTask {
   static int RunTask(int task_id, TVMParallelGroupEnv* penv, void* cdata) {
 ParallelTask* task = static_cast<ParallelTask*>(cdata);
-task->Run(task_id);
+task->Run(task_id, penv->num_task);
 return 0;
   }
 
-  void Run(int i) {
-int64_t chunk_size = size / num_threads;
+  void Run(int i, int num_tasks) {
+int64_t chunk_size = size / num_tasks;
 int64_t st = i * chunk_size;
 int64_t ed = std::min(st + chunk_size, size);
 self->FillDataImpl(data, st, ed, dtype);
@@ -205,7 +205,6 @@ class RandomEngine {
 
   RandomEngine* self;
   void* data;
-  int num_threads;
   int64_t size;
   DLDataType dtype;
 };
@@ -220,8 +219,7 @@ class RandomEngine {
 }
 if (dtype.bits == 1 || dtype.bits == 4 || dtype.bits == 8 || dtype.bits == 16 ||
 dtype.bits == 32 || dtype.bits == 64) {
-  int num_threads = task.num_threads = runtime::threading::MaxConcurrency();
-  int res = TVMBackendParallelLaunch(ParallelTask::RunTask, &task, num_threads);
+  int res = TVMBackendParallelLaunch(ParallelTask::RunTask, &task, 0);
   ICHECK_EQ(res, 0) << "RandomFillForMeasure: TVMBackendParallelLaunch failed";
 } else {
   LOG(FATAL) << "Doesn't support dtype code " << dtype.code << " dtype bits " << dtype.bits;
diff --git a/tests/python/contrib/test_random.py 
b/tests/python/contrib/test_random.py
index 7a52c0dbf1..ddc06b0711 100644
--- a/tests/python/contrib/test_random.py
+++ b/tests/python/contrib/test_random.py
@@ -20,6 +20,7 @@ import numpy as np
 from tvm.contrib import random
 from tvm import rpc
 import tvm.testing
+import threading
 
 
 def test_randint():
@@ -155,8 +156,35 @@ def test_random_fill():
 test_rpc(dtype)
 
 
+def test_random_fill_mt():
+"""Check that random fill works with a nontrivial thread pool configuration,
+in particular when MaxConcurrency != num_workers_used_, which is relevant for big.LITTLE systems.
+"""
+no_exception_happened = True
+
+def test_body():
+try:
+num_thread_used = 1
+configure_threads = tvm.get_global_func("runtime.config_threadpool")
+configure_threads(1, num_thread_used)
+
+test_input = tvm.runtime.ndarray.empty((10, 10))
+random_fill = tvm.get_global_func("tvm.contrib.random.random_fill_for_measure")
+random_fill(test_input)
+except:
+nonlocal no_exception_happened
+no_exception_happened = False
+
+# The ThreadPool object is thread local. Run the body in a separate thread so that
+# the reconfigured pool does not affect other test cases.
+x = threading.Thread(target=test_body)
+x.start()
+x.join()
+assert no_exception_happened
+
+
 if __name__ == "__main__":
 test_randint()
 test_uniform()
 test_normal()
 test_random_fill()
+test_random_fill_mt()
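
The chunking arithmetic in `ParallelTask::Run` (chunk_size = size / divisor, st = i * chunk_size, ed = min(st + chunk_size, size)) can be reproduced in plain Python to see why the divisor should match the number of tasks that actually run. This is a sketch under assumptions: `chunk_bounds` and `covered` are hypothetical helpers, and the sizes and task counts are made-up values. The commit message does not state that the original failure looked exactly like this.

```python
def chunk_bounds(size, divisor, launched_tasks):
    """Half-open [start, end) ranges, one per launched task."""
    chunk = size // divisor
    return [(i * chunk, min(i * chunk + chunk, size)) for i in range(launched_tasks)]


def covered(size, divisor, launched_tasks):
    """Number of elements touched by all launched tasks together."""
    return sum(end - start for start, end in chunk_bounds(size, divisor, launched_tasks))


# Divisor taken from a separate concurrency estimate (8) while only 4 tasks run:
print(covered(100, 8, 4))  # 48
# Divisor equal to the number of tasks that actually run:
print(covered(100, 4, 4))  # 100
```

Taking the divisor from the task count reported at launch time, as the patch does with `penv->num_task`, keeps the two numbers in agreement by construction.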

[tvm] branch main updated (c3fe08fb20 -> 7db77ada3d)

2023-02-01 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from c3fe08fb20 [ARITH] support floordiv in deduce bound (#13880)
 add 7db77ada3d [Build] Fix misleading error messages (#13887)

No new revisions were added by this update.

Summary of changes:
 cmake/utils/FindLLVM.cmake | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

[tvm] branch main updated (0d5baacc02 -> d8833bdbbe)

2023-01-31 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 0d5baacc02 [ONNX] Support SequenceErase op (#13865)
 add d8833bdbbe [ONNX] Support SequenceEmpty op (#13866)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/onnx.py  | 10 ++
 tests/python/frontend/onnx/test_forward.py | 32 ++
 2 files changed, 42 insertions(+)

[tvm] branch main updated: [ONNX] Support SequenceErase op (#13865)

2023-01-30 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 0d5baacc02 [ONNX] Support SequenceErase op (#13865)
0d5baacc02 is described below

commit 0d5baacc0241253547fe5235f95416a6467e712c
Author: Valery Chernov 
AuthorDate: Tue Jan 31 11:06:02 2023 +0400

[ONNX] Support SequenceErase op (#13865)

* SequenceErase was implemented in ONNX front-end

* add SequenceErase node to Sequence test

* remark from reviewer. fix negative position recalculation

* add assert

-

Co-authored-by: Valery Chernov 
---
 python/tvm/relay/frontend/onnx.py  | 42 ++
 tests/python/frontend/onnx/test_forward.py | 10 ++-
 2 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/python/tvm/relay/frontend/onnx.py 
b/python/tvm/relay/frontend/onnx.py
index 6e0c7cc2dd..93429a8638 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -6148,13 +6148,35 @@ class SequenceConstruct(OnnxOpConverter):
 return _expr.Tuple(inputs)
 
 
-class SequenceLength(OnnxOpConverter):
-"""Operator converter for sequence length op."""
+class SequenceErase(OnnxOpConverter):
+"""Operator converter for sequence erase op."""
 
 @classmethod
 def _impl_v11(cls, inputs, attr, params):
-# Get length of input sequence
-return _expr.const(len(inputs[0]), dtype="int64")
+# Erase tensor from sequence on specified position
+input_sequence = inputs[0]
+
+if len(inputs) == 2:
+position = inputs[1]
+# Non constant position is not supported.
+if isinstance(position, _expr.Constant):
+position = position.data.numpy()
+elif position.name_hint in params:
+position = params[position.name_hint].numpy()
+else:
+raise NotImplementedError("Position must be a constant.")
+else:
+position = -1
+
+seq_len = len(input_sequence)
+assert -seq_len <= position < seq_len, "Position is out of bounds"
+
+if position < 0:
+position = seq_len + position
+# Convert sequence to a list without the erased tensor, and repackage as Tuple.
+tensor_list = [input_sequence[i] for i in range(seq_len) if i != position]
+# Create new tuple and return.
+return _expr.Tuple(tensor_list)
 
 
 class SequenceInsert(OnnxOpConverter):
@@ -6188,6 +6210,15 @@ class SequenceInsert(OnnxOpConverter):
 return _expr.Tuple(tensor_list)
 
 
+class SequenceLength(OnnxOpConverter):
+"""Operator converter for sequence length op."""
+
+@classmethod
+def _impl_v11(cls, inputs, attr, params):
+# Get length of input sequence
+return _expr.const(len(inputs[0]), dtype="int64")
+
+
 class ConcatFromSequence(OnnxOpConverter):
 """Operator converter for sequence concatenation op."""
 
@@ -6492,8 +6523,9 @@ def _get_convert_map(opset):
 "LinearRegressor": LinearRegressor.get_converter(opset),
 # Sequence operators
 "SequenceConstruct": SequenceConstruct.get_converter(opset),
-"SequenceLength": SequenceLength.get_converter(opset),
+"SequenceErase": SequenceErase.get_converter(opset),
 "SequenceInsert": SequenceInsert.get_converter(opset),
+"SequenceLength": SequenceLength.get_converter(opset),
 "ConcatFromSequence": ConcatFromSequence.get_converter(opset),
 "SplitToSequence": SplitToSequence.get_converter(opset),
 "SequenceAt": SequenceAt.get_converter(opset),
diff --git a/tests/python/frontend/onnx/test_forward.py 
b/tests/python/frontend/onnx/test_forward.py
index 6a780a632f..3e1af40867 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -7747,10 +7747,17 @@ def test_sequence(target, dev):
 outputs=["inserted_sequence"],
 )
 
+# Test sequence erase.
+erase_node = helper.make_node(
+"SequenceErase",
+inputs=["inserted_sequence", "position"],
+outputs=["erased_sequence"],
+)
+
 # Test sequence concatenation.
 concat_node = helper.make_node(
 "ConcatFromSequence",
-inputs=["inserted_sequence"],
+inputs=["erased_sequence"],
 outputs=["concat_sequence"],
 axis=axis,
 )
@@ -7796,6 +7803,7 @@ def test_sequence(target, dev):
 position_node,
 construct_node,
 insert_node,
+erase_node,
 concat_node,
 split_node,
 at_node,
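
The position handling in the SequenceErase converter above (default position of -1, bounds check, negative index normalization, then rebuilding the sequence without one element) can be tried on ordinary Python tuples. This is a minimal sketch, not the Relay converter: `sequence_erase` is a hypothetical function, and it raises IndexError where the converter uses an assert.

```python
def sequence_erase(seq, position=-1):
    """Return a new tuple with the element at `position` removed.

    Omitting `position` removes the last element, matching the
    converter's default of -1.
    """
    seq_len = len(seq)
    if not -seq_len <= position < seq_len:
        raise IndexError("Position is out of bounds")
    if position < 0:
        # Normalize a negative index to its non-negative equivalent.
        position += seq_len
    return tuple(x for i, x in enumerate(seq) if i != position)


print(sequence_erase(("a", "b", "c")))      # ('a', 'b')
print(sequence_erase(("a", "b", "c"), -3))  # ('b', 'c')
print(sequence_erase(("a", "b", "c"), 1))   # ('a', 'c')
```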

[tvm] branch main updated: [ONNX] Support SequenceLength op (#13863)

2023-01-30 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new e6262bf985 [ONNX] Support SequenceLength op (#13863)
e6262bf985 is described below

commit e6262bf9855a0c6f546f097910b48f955f2749cf
Author: Valery Chernov 
AuthorDate: Mon Jan 30 13:29:04 2023 +0400

[ONNX] Support SequenceLength op (#13863)

* add SequenceLength op

* add SequenceLength test

* graph fix

-

Co-authored-by: Valery Chernov 
---
 python/tvm/relay/frontend/onnx.py  | 10 ++
 tests/python/frontend/onnx/test_forward.py | 21 +++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/onnx.py 
b/python/tvm/relay/frontend/onnx.py
index 7b35d4a481..6e0c7cc2dd 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -6148,6 +6148,15 @@ class SequenceConstruct(OnnxOpConverter):
 return _expr.Tuple(inputs)
 
 
+class SequenceLength(OnnxOpConverter):
+"""Operator converter for sequence length op."""
+
+@classmethod
+def _impl_v11(cls, inputs, attr, params):
+# Get length of input sequence
+return _expr.const(len(inputs[0]), dtype="int64")
+
+
 class SequenceInsert(OnnxOpConverter):
 """Operator converter for sequence insert op."""
 
@@ -6483,6 +6492,7 @@ def _get_convert_map(opset):
 "LinearRegressor": LinearRegressor.get_converter(opset),
 # Sequence operators
 "SequenceConstruct": SequenceConstruct.get_converter(opset),
+"SequenceLength": SequenceLength.get_converter(opset),
 "SequenceInsert": SequenceInsert.get_converter(opset),
 "ConcatFromSequence": ConcatFromSequence.get_converter(opset),
 "SplitToSequence": SplitToSequence.get_converter(opset),
diff --git a/tests/python/frontend/onnx/test_forward.py 
b/tests/python/frontend/onnx/test_forward.py
index 4b17cfbbb3..6a780a632f 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -7760,10 +7760,16 @@ def test_sequence(target, dev):
 "SplitToSequence", inputs=["concat_sequence"], outputs=["split_sequence"], axis=axis
 )
 
+# Test tensor extraction from sequence
 at_node = helper.make_node(
 "SequenceAt", inputs=["split_sequence", "position"], outputs=["output"]
 )
 
+# Test sequence length
+length_node = helper.make_node(
+"SequenceLength", inputs=["split_sequence"], outputs=["output_2"]
+)
+
 if new_axis is not None:
 new_axis_attr = helper.make_attribute("new_axis", new_axis)
 concat_node.attribute.append(new_axis_attr)
@@ -7781,9 +7787,20 @@ def test_sequence(target, dev):
 output_shape[axis] = num_tensors + 1
 else:
 output_shape[axis] = (num_tensors + 1) * output_shape[axis]
-graph_outputs = [helper.make_tensor_value_info("output", TensorProto.FLOAT, output_shape)]
+graph_outputs = [
+helper.make_tensor_value_info("output", TensorProto.FLOAT, output_shape),
+helper.make_tensor_value_info("output_2", TensorProto.INT64, []),
+]
 
-graph_nodes = [position_node, construct_node, insert_node, concat_node, split_node, at_node]
+graph_nodes = [
+position_node,
+construct_node,
+insert_node,
+concat_node,
+split_node,
+at_node,
+length_node,
+]
 
 graph = helper.make_graph(
 graph_nodes,

[tvm] branch main updated (10d6c17238 -> 3c81d9bbac)

2023-01-29 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 10d6c17238 [WASM][FIX] test tests/node/websock_rpc_test.py (#13862)
 add 3c81d9bbac [RUNTIME][CLML] OpenCLML tuning and profiling enhanced 
(#13843)

No new revisions were added by this update.

Summary of changes:
 src/runtime/contrib/clml/clml_runtime.cc | 143 ---
 src/runtime/opencl/opencl_common.h   |  40 -
 2 files changed, 111 insertions(+), 72 deletions(-)

[tvm] branch main updated (1d89071863 -> 1f40b925a5)

2023-01-24 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 1d89071863 [TVMScript] More concise `T.allocate` syntax printing 
(#13830)
 add 1f40b925a5 [TOOL][NATIVE] Android native application for deploy and 
run (#13791)

No new revisions were added by this update.

Summary of changes:
 .gitmodules|   3 +
 3rdparty/cnpy  |   1 +
 CMakeLists.txt |   4 +
 LICENSE|   1 +
 apps/cpp_rtvm/CMakeLists.txt   |  98 ++
 apps/cpp_rtvm/README.md| 354 +
 apps/cpp_rtvm/main.cc  | 264 +++
 .../cpp_rtvm/scripts/download_models.py|  28 +-
 apps/cpp_rtvm/tvm_runner.cc| 320 +++
 apps/cpp_rtvm/tvm_runner.h |  93 ++
 cmake/config.cmake |   3 +
 cmake/modules/LibInfo.cmake|   1 +
 python/tvm/driver/tvmc/composite_target.py |   5 +
 python/tvm/relay/op/contrib/clml.py|   2 +-
 src/auto_scheduler/search_task.cc  |   8 +
 src/runtime/graph_executor/graph_executor.cc   |  44 ++-
 src/runtime/graph_executor/graph_executor.h|   6 +
 src/support/libinfo.cc |   5 +
 tests/scripts/task_build_adreno_bins.sh|   3 +-
 19 files changed, 1227 insertions(+), 16 deletions(-)
 create mode 16 3rdparty/cnpy
 create mode 100644 apps/cpp_rtvm/CMakeLists.txt
 create mode 100644 apps/cpp_rtvm/README.md
 create mode 100644 apps/cpp_rtvm/main.cc
 copy ci/scripts/jenkins/http_utils.py => 
apps/cpp_rtvm/scripts/download_models.py (60%)
 create mode 100644 apps/cpp_rtvm/tvm_runner.cc
 create mode 100644 apps/cpp_rtvm/tvm_runner.h

[tvm] branch main updated: [Adreno] Update interface of AnnotateMemoryScope pass (#13779)

2023-01-19 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 6e01f3d855 [Adreno] Update interface of AnnotateMemoryScope pass 
(#13779)
6e01f3d855 is described below

commit 6e01f3d85581043593dd9b65cb5718c3e7386a81
Author: Egor Churaev 
AuthorDate: Thu Jan 19 17:38:33 2023 +0300

[Adreno] Update interface of AnnotateMemoryScope pass (#13779)

Removed redundant and unused parameter from AnnotateMemoryScope pass.
---
 include/tvm/relay/transform.h| 2 +-
 src/relay/backend/build_module.cc| 2 +-
 src/relay/transforms/annotate_texture_storage.cc | 8 
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/tvm/relay/transform.h b/include/tvm/relay/transform.h
index 3227f7979d..43a0f89d95 100644
--- a/include/tvm/relay/transform.h
+++ b/include/tvm/relay/transform.h
@@ -584,7 +584,7 @@ TVM_DLL Pass CapturePostDfsIndexInSpans();
  * \brief Calls device dependent memory scope analysis pass, collects mapping of desirable
  * expr->memory_scope and annotates expressions by VirtualDevice with required memory_scope
  */
-TVM_DLL Pass AnnotateMemoryScope(CompilationConfig config);
+TVM_DLL Pass AnnotateMemoryScope();
 
 /*!
  * \brief Removes non-fused reshapes after lowering the graph.
diff --git a/src/relay/backend/build_module.cc 
b/src/relay/backend/build_module.cc
index bca524794a..0642c0c672 100644
--- a/src/relay/backend/build_module.cc
+++ b/src/relay/backend/build_module.cc
@@ -396,7 +396,7 @@ class RelayBuildModule : public runtime::ModuleNode {
 relay_module = transform::Inline()(relay_module);
 relay_module = transform::InferType()(relay_module);
 relay_module = transform::LabelOps()(relay_module);
-relay_module = transform::AnnotateMemoryScope(config_)(relay_module);
+relay_module = transform::AnnotateMemoryScope()(relay_module);
 
 ICHECK(relay_module.defined());
 
diff --git a/src/relay/transforms/annotate_texture_storage.cc 
b/src/relay/transforms/annotate_texture_storage.cc
index 9dbd631ad3..39f065ea8c 100644
--- a/src/relay/transforms/annotate_texture_storage.cc
+++ b/src/relay/transforms/annotate_texture_storage.cc
@@ -645,7 +645,7 @@ Map>> 
CollectStorageInfo(const Expr& expr) {
   return storage_info;
 }
 
-Expr AnnotateMemoryScopeExpr(const Expr& expr, const IRModule& mod, CompilationConfig config) {
+Expr AnnotateMemoryScopeExpr(const Expr& expr, const IRModule& mod) {
   auto storage_scope = CollectStorageInfo(expr);
   if (storage_scope.size()) {
 return RewriteVDStorageScopes(storage_scope).Rewrite(expr);
@@ -655,10 +655,10 @@ Expr AnnotateMemoryScopeExpr(const Expr& expr, const 
IRModule& mod, CompilationC
 }
 
 namespace transform {
-tvm::transform::Pass AnnotateMemoryScope(CompilationConfig config) {
+tvm::transform::Pass AnnotateMemoryScope() {
   runtime::TypedPackedFunc<Function(Function, IRModule, PassContext)> pass_func =
-  [config = std::move(config)](Function f, IRModule m, PassContext pc) {
-return Downcast<Function>(AnnotateMemoryScopeExpr(f, m, config));
+  [](Function f, IRModule m, PassContext pc) {
+return Downcast<Function>(AnnotateMemoryScopeExpr(f, m));
   };
   return CreateFunctionPass(pass_func, 2, "AnnotateMemoryScope", {});
 }

[tvm] branch main updated: [FIX] Minor Compilation Warning Fixes (#13794)

2023-01-17 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 3b49f7973d [FIX] Minor Compilation Warning Fixes (#13794)
3b49f7973d is described below

commit 3b49f7973d497a3087004c3487c3e89b8d9e6c2f
Author: Siyuan Feng 
AuthorDate: Tue Jan 17 20:24:30 2023 +0800

[FIX] Minor Compilation Warning Fixes (#13794)

This PR fixes some warnings from the latest clang compiler.
---
 include/tvm/tir/function.h| 8 
 src/tir/analysis/control_flow_graph.h | 7 ---
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/tvm/tir/function.h b/include/tvm/tir/function.h
index cf92f97360..9f7c0fa16b 100644
--- a/include/tvm/tir/function.h
+++ b/include/tvm/tir/function.h
@@ -224,14 +224,14 @@ class TensorIntrin : public ObjectRef {
   TVM_DEFINE_OBJECT_REF_METHODS(TensorIntrin, ObjectRef, TensorIntrinNode)
 };
 
-/*
+/*!
  * \brief Specialize parameters of PrimFunc.
  * \param func The PrimFunc to be specialized.
  * \param param_map The mapping from function params to the instance.
  * \return The new function with parameter specialized.
  * \note We can define a Meta TIR function with symbolic shape:
  *
- * \code
+ * \code{.py}
  *  @T.prim_func
  *  def mem_copy(a: T.handle, b: T.handle, m: T.int32, n: T.int32) -> None:
  *  A = T.match_buffer(a, (m, n), "float32")
@@ -244,14 +244,14 @@ class TensorIntrin : public ObjectRef {
  *
  * Then we can make it specialized with given shapes or buffers.
  *
- * \code
+ * \code{.py}
  *  a, _, m, n = mem_copy.params
  *  func = mem_copy.specialize({a: tir.decl_buffer((16, 16))})
  *  # or
  *  func = mem_copy.specialize({n: 16, m: 16})
  * \endcode
  *
- * \code {.language-id}
+ * \code{.py}
  *  @T.prim_func
  *  def mem_copy_16_16(a: T.handle, b: T.handle) -> None:
  *  A = T.match_buffer(a, (16, 16), "float32")
diff --git a/src/tir/analysis/control_flow_graph.h 
b/src/tir/analysis/control_flow_graph.h
index 590392cf65..00a6b68ff9 100644
--- a/src/tir/analysis/control_flow_graph.h
+++ b/src/tir/analysis/control_flow_graph.h
@@ -292,7 +292,8 @@ class BufferState {
   std::vector constraints_;
 };
 
-/*! \brief Represents the flow of control through a `tir::Stmt`
+/*!
+ * \brief Represents the flow of control through a `tir::Stmt`
  *
  * This class contains an internal representation of the possible
  * control flow that may occur during execution of a `tir::Stmt`.  It
@@ -312,7 +313,7 @@ class BufferState {
  *
  * For example, consider the following PrimFunc
  *
- * ```python
+ * \code{.py}
  * @T.prim_func
  * def func(T.Buffer[16, "float32"]):
  * for i in T.serial(16):
@@ -320,7 +321,7 @@ class BufferState {
  *  B[i] = i
  * else:
  *  B[i] = i-8
- * ```
+ * \endcode
  *
  * The control flow graph would have eight control blocks.
  *

[tvm] branch main updated (287597b45d -> f9759920e0)

2023-01-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 287597b45d [CI] Update ci_minimal docker image to cross-compile TVM to 
aarch64 (#13776)
 add f9759920e0 [UnitTest] Parametrized 
test_arith_iter_affine_map::test_padding (#13774)

No new revisions were added by this update.

Summary of changes:
 .../python/unittest/test_arith_iter_affine_map.py  | 160 +
 1 file changed, 70 insertions(+), 90 deletions(-)

[tvm] branch main updated (d6507b256f -> 8551a5c71f)

2022-12-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from d6507b256f [TIR] Create Layout with specified axis dtype (#13663)
 add 8551a5c71f [MetaSchedule] Add "disabled_pass" option in tuning API 
(#13659)

No new revisions were added by this update.

Summary of changes:
 python/tvm/meta_schedule/relay_integration.py  | 17 +++-
 .../test_meta_schedule_relay_integration.py| 45 ++
 2 files changed, 60 insertions(+), 2 deletions(-)

[tvm] branch main updated: [RPC] Add fail-guard for termination time exception (#13651)

2022-12-22 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 209845fb91 [RPC] Add fail-guard for termination time exception (#13651)
209845fb91 is described below

commit 209845fb910aac76f1388d868bb4fe4a46d9170f
Author: Tianqi Chen 
AuthorDate: Thu Dec 22 14:59:45 2022 -0500

[RPC] Add fail-guard for termination time exception (#13651)

This PR adds a fail-guard to reduce the error messages thrown at
process termination time. Such errors do not cause test failures,
but they print extra messages during exit.
---
 python/tvm/rpc/proxy.py   | 5 -
 python/tvm/rpc/server.py  | 5 -
 python/tvm/rpc/server_ios_launcher.py | 5 -
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/python/tvm/rpc/proxy.py b/python/tvm/rpc/proxy.py
index 4c3144e7b5..d7027c88a4 100644
--- a/python/tvm/rpc/proxy.py
+++ b/python/tvm/rpc/proxy.py
@@ -643,7 +643,10 @@ class Proxy(object):
 self.proc = None
 
     def __del__(self):
-        self.terminate()
+        try:
+            self.terminate()
+        except ImportError:
+            pass
 
 
 def websocket_proxy_server(url, key=""):
diff --git a/python/tvm/rpc/server.py b/python/tvm/rpc/server.py
index 7932e98aa2..a144356f2e 100644
--- a/python/tvm/rpc/server.py
+++ b/python/tvm/rpc/server.py
@@ -499,4 +499,7 @@ class Server(object):
 self.proc = None
 
     def __del__(self):
-        self.terminate()
+        try:
+            self.terminate()
+        except ImportError:
+            pass
diff --git a/python/tvm/rpc/server_ios_launcher.py b/python/tvm/rpc/server_ios_launcher.py
index 2e31586f64..06ead1c3d1 100644
--- a/python/tvm/rpc/server_ios_launcher.py
+++ b/python/tvm/rpc/server_ios_launcher.py
@@ -335,7 +335,10 @@ class ServerIOSLauncher:
 print(e)
 
     def __del__(self):
-        self.terminate()
+        try:
+            self.terminate()
+        except ImportError:
+            pass
 
 @staticmethod
 def is_compatible_environment():
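
The fail-guard in the three hunks above can be tried in isolation. The following is a minimal sketch, not TVM code: `GuardedServer` and `failing_terminate` are hypothetical names, and the `ImportError` is raised by hand to imitate what can happen when a destructor runs while the interpreter is shutting down.

```python
class GuardedServer:
    """Hypothetical stand-in for an RPC server wrapper (not TVM's class)."""

    def __init__(self, terminate_fn):
        # terminate_fn is an assumption of this sketch: any callable that
        # performs the real cleanup and may raise ImportError at exit time.
        self._terminate_fn = terminate_fn
        self.terminated = False

    def terminate(self):
        self._terminate_fn()
        self.terminated = True

    def __del__(self):
        # Same shape as the patch: swallow ImportError only, so that other
        # failures in terminate() are still reported.
        try:
            self.terminate()
        except ImportError:
            pass


def failing_terminate():
    # Imitates a cleanup path that needs an import during shutdown.
    raise ImportError("import machinery unavailable during shutdown")


server = GuardedServer(failing_terminate)
server.__del__()  # does not raise; the guard absorbs the ImportError

ok_server = GuardedServer(lambda: None)
ok_server.__del__()  # cleanup succeeds and is recorded
```

Catching only `ImportError` keeps the guard narrow: an unrelated exception in `terminate()` is not silenced by it.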

[tvm] branch main updated: [LLVM] Fix get tm allow_missing check pos (#13591)

2022-12-12 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new ae07437a32 [LLVM] Fix get tm allow_missing check pos (#13591)
ae07437a32 is described below

commit ae07437a32c8addadfec4002426e03a6b2bc8781
Author: wrongtest 
AuthorDate: Mon Dec 12 18:53:15 2022 +0800

[LLVM] Fix get tm allow_missing check pos (#13591)

Fix get tm allow_missing check pos
---
 src/target/llvm/llvm_instance.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/target/llvm/llvm_instance.cc b/src/target/llvm/llvm_instance.cc
index 44454fc6b9..2aa190ad70 100644
--- a/src/target/llvm/llvm_instance.cc
+++ b/src/target/llvm/llvm_instance.cc
@@ -297,9 +297,9 @@ llvm::TargetMachine* LLVMTargetInfo::GetOrCreateTargetMachine(bool allow_missing
         llvm_instance->createTargetMachine(triple_, cpu_, GetTargetFeatureString(), target_options_,
                                            reloc_model_, code_model_, opt_level_);
     target_machine_ = std::unique_ptr<llvm::TargetMachine>(tm);
-    if (!allow_missing) {
-      ICHECK(target_machine_ != nullptr) << error;
-    }
+  }
+  if (!allow_missing) {
+    ICHECK(target_machine_ != nullptr) << error;
   }
   return target_machine_.get();
 }
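
The effect of moving the check can be sketched in Python. This is an illustrative analogue only: `TargetInfo`, `lookup` and the error text are hypothetical, and it assumes the enclosing block in the C++ code is the branch taken when the target lookup succeeds, which the hunk alone does not show.

```python
class TargetInfo:
    """Hypothetical Python analogue; names do not mirror TVM's C++ API."""

    def __init__(self, lookup):
        # lookup() returns a zero-argument factory, or None when the
        # target is unknown (an assumption of this sketch).
        self._lookup = lookup
        self._machine = None

    def get_or_create(self, allow_missing=False):
        if self._machine is not None:
            return self._machine
        factory = self._lookup()
        if factory is not None:
            self._machine = factory()
        # The check sits outside the creation branch, as in the patched
        # code, so a failed lookup is also reported.
        if not allow_missing and self._machine is None:
            raise RuntimeError("target machine could not be created")
        return self._machine


missing = TargetInfo(lambda: None)
tolerated = missing.get_or_create(allow_missing=True)  # returns None quietly

try:
    missing.get_or_create(allow_missing=False)
    strict_failed = False
except RuntimeError:
    strict_failed = True

present = TargetInfo(lambda: (lambda: "machine"))
created = present.get_or_create()
```

With the check inside the creation branch, the failed-lookup case would have returned `None` even when `allow_missing` was false.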

[tvm] branch main updated: [Adreno] Add global pooling schedule (#13573)

2022-12-08 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new f4cfcafba5 [Adreno] Add global pooling schedule (#13573)
f4cfcafba5 is described below

commit f4cfcafba5024d9bbef1b8bf422c6a25368837f3
Author: Andrey Malyshev 
AuthorDate: Thu Dec 8 12:29:00 2022 +0200

[Adreno] Add global pooling schedule (#13573)

* [Adreno] Add global pooling schedule

The parallelism opportunities in case of global pooling are
limited by the number of channels, so the schedule needs to change to
parallelize over the reduction axis using rfactor

* address pylint hits

* address PR comments

* switch spatial axis to blk binding
---
 python/tvm/relay/op/strategy/adreno.py |   7 ++
 python/tvm/topi/adreno/pooling.py  | 107 
 .../relay/opencl_texture/test_pool_texture.py  | 135 +
 3 files changed, 249 insertions(+)

diff --git a/python/tvm/relay/op/strategy/adreno.py b/python/tvm/relay/op/strategy/adreno.py
index 21252215fc..b606ab05d7 100644
--- a/python/tvm/relay/op/strategy/adreno.py
+++ b/python/tvm/relay/op/strategy/adreno.py
@@ -215,6 +215,13 @@ def schedule_reduce_adreno(attrs, outs, target):
 return topi.adreno.schedule_reduce(outs)
 
 
+@schedule_adaptive_pool.register(["adreno"])
+def schedule_adaptive_pool_adreno(attrs, outs, target):
+    """schedule adaptive pooling ops for adreno"""
+    with target:
+        return topi.adreno.schedule_adaptive_pool(outs, attrs.layout)
+
+
 @concatenate_strategy.register(["adreno"])
 def concatenate_strategy_adreno(attrs, inputs, out_type, target):
 strategy = _op.OpStrategy()
diff --git a/python/tvm/topi/adreno/pooling.py b/python/tvm/topi/adreno/pooling.py
index 49f103c04a..f02af0c01f 100644
--- a/python/tvm/topi/adreno/pooling.py
+++ b/python/tvm/topi/adreno/pooling.py
@@ -19,6 +19,113 @@
 import tvm
 from tvm import te
 from .. import tag
+from .utils import get_div
+
+
+def schedule_adaptive_pool(outs, layout="NCHW"):
+    """Schedule for adaptive_pool.
+
+    Parameters
+    ----------
+    outs: Array of Tensor
+        The computation graph description of adaptive_pool
+        in the format of an array of tensors.
+
+    Returns
+    -------
+    s: Schedule
+        The computation schedule for adaptive_pool.
+    """
+    outs = [outs] if isinstance(outs, te.tensor.Tensor) else outs
+    s = te.create_schedule([x.op for x in outs])
+
+    def _schedule_global(Pool, layout):
+        # examples of latest pool op is global max pool and non latest is global avg pooling
+        # OL - an Expr will be used for rfactor
+        # Out - programming of the parallelizm on the global level
+        # shared is not required, local could be enough but shared scope gives quite significant
+        # perf boost
+        if Pool.op in s.outputs:
+            Out = Pool
+            OL = s.cache_write(Pool, "shared")
+        else:
+            Out = outs[0].op.output(0)
+            s[Pool].set_scope("shared")
+            OL = Pool
+
+        PaddedInput = Pool.op.input_tensors[0]
+
+        # detect axis for later reorder and binding of batch/channel to blocks and
+        # spatial to threads
+        if layout in ("NCHW", "NCHW4c"):
+            channel_index = 1
+            height_index = 2
+            width_index = 3
+        else:
+            channel_index = 3
+            height_index = 1
+            width_index = 2
+
+        if isinstance(PaddedInput.op, tvm.te.ComputeOp):
+            s[PaddedInput].compute_inline()
+
+        fused_reduce = s[OL].fuse(*s[OL].op.reduce_axis)
+
+        spatial = PaddedInput.shape[height_index].value * PaddedInput.shape[width_index].value
+        # below values were selected empirically assuming that we should have some work in each
+        # thread (currently from 25-49) and number of threads not exceeding some threshold that
+        # was selected as 256 from performance point of view after experiments on Adreno 660
+        max_threads = spatial // 25 if spatial > 25 else 1
+        max_threads = 256 if max_threads > 256 else max_threads
+        num_thread = get_div(spatial, max_threads)
+
+        thread_y = te.thread_axis((0, num_thread), "threadIdx.y")
+
+        _, ki = s[OL].split(fused_reduce, factor=num_thread)
+        data_out_rf = s.rfactor(OL, ki)
+        s[data_out_rf].compute_at(s[OL], s[OL].op.reduce_axis[0])
+        s[OL].bind(s[OL].op.reduce_axis[0], thread_y)
+
+        naxis = s[Out].op.axis[0]
+        caxis = s[Out].op.axis[channel_index]
+        haxis = s[Out].op.axis[height_index]
+        waxis = s[Out].op.axis[width_index]
+
+        if l
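
The thread-count heuristic in the hunk above (about 25 elements of work per thread, capped at 256 threads) can be checked with plain arithmetic. This is a sketch under a stated assumption: `get_div` is not shown in this email, so `largest_divisor_at_most` below is a hypothetical stand-in that assumes it returns the largest divisor of `value` that does not exceed `start`. `pick_num_threads` is likewise an illustrative name.

```python
def largest_divisor_at_most(value, start):
    """Assumed behaviour of get_div: largest divisor of value that is <= start."""
    for candidate in range(start, 0, -1):
        if value % candidate == 0:
            return candidate
    return 1


def pick_num_threads(height, width, work_per_thread=25, thread_cap=256):
    """Mirror the max_threads / num_thread computation from the diff."""
    spatial = height * width
    max_threads = spatial // work_per_thread if spatial > work_per_thread else 1
    max_threads = min(max_threads, thread_cap)
    return largest_divisor_at_most(spatial, max_threads)


small = pick_num_threads(7, 7)       # 49 elements: a single thread
medium = pick_num_threads(56, 56)    # 3136 elements: limit is 125 threads
large = pick_num_threads(112, 112)   # 12544 elements: limit hits the cap
```

Picking a divisor of the spatial size means the fused reduction axis splits evenly, so every thread gets the same amount of work.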

[tvm] branch main updated (22ff38dff8 -> 6069292ff9)

2022-12-08 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 22ff38dff8 [docs] Make building the cpu-only docs build explicit (#13315)
 add 6069292ff9 [OpenCL]  Introduction of weights on buffers (#13563)

No new revisions were added by this update.

Summary of changes:
 python/tvm/topi/adreno/conv2d_nchw.py  |  5 +-
 python/tvm/topi/adreno/conv2d_nhwc.py  |  5 +-
 python/tvm/topi/adreno/conv2d_winograd_common.py   |  2 +
 python/tvm/topi/adreno/depthwise_conv2d_nchw.py|  5 +-
 python/tvm/topi/adreno/depthwise_conv2d_nhwc.py|  5 +-
 src/relay/transforms/annotate_texture_storage.cc   | 78 +-
 .../opencl_texture/test_conv2d_nchw_texture.py | 20 +++---
 7 files changed, 86 insertions(+), 34 deletions(-)

[tvm] branch main updated (95d2e9fa35 -> 57de9e7f3d)

2022-11-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 95d2e9fa35 [ARM] Add dynamic matvec support (#13502)
 add 57de9e7f3d Revert "[microTVM] enable building microTVM components by default" (#13503)

No new revisions were added by this update.

Summary of changes:
 CMakeLists.txt   | 7 +--
 cmake/config.cmake   | 3 +++
 conda/recipe/build.sh| 3 ---
 conda/recipe/meta.yaml   | 3 ---
 tests/scripts/task_config_build_arm.sh   | 1 +
 tests/scripts/task_config_build_cortexm.sh   | 1 +
 tests/scripts/task_config_build_cpu.sh   | 1 +
 tests/scripts/task_config_build_gpu.sh   | 1 +
 tests/scripts/task_config_build_gpu_other.sh | 1 +
 tests/scripts/task_config_build_hexagon.sh   | 2 ++
 tests/scripts/task_config_build_i386.sh  | 1 +
 tests/scripts/task_config_build_minimal.sh   | 1 +
 tests/scripts/task_config_build_riscv.sh | 1 +
 tests/scripts/task_config_build_wasm.sh  | 1 +
 14 files changed, 15 insertions(+), 12 deletions(-)

[tvm] 01/01: [docs][Adreno] Remove unnecessary compilation flag

2022-11-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch echuraev/remove_use_micro_from_doc
in repository https://gitbox.apache.org/repos/asf/tvm.git

commit 6f0ed41828c47f15b79b4a9461f45c14d76b4189
Author: Egor Churaev 
AuthorDate: Tue Nov 29 08:17:32 2022 +0300

[docs][Adreno] Remove unnecessary compilation flag

Flag `-DUSE_MICRO=OFF` was added to avoid issue #13482. These changes were reverted in #13503.
Removed the now-unnecessary compilation flag to avoid confusing users.
---
 docs/how_to/deploy/adreno.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst
index af613aa5cb..7f4616fbf7 100644
--- a/docs/how_to/deploy/adreno.rst
+++ b/docs/how_to/deploy/adreno.rst
@@ -94,7 +94,7 @@ folder of TVM:
 
mkdir build_android
cd build_android
-   cmake .. -DUSE_OPENCL=ON -DUSE_MICRO=OFF -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_HOME}/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_NATIVE_API_LEVEL=android-28 -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ON -DANDROID_STL=c++_static -DUSE_CPP_RPC=ON
+   cmake .. -DUSE_OPENCL=ON -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_HOME}/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_NATIVE_API_LEVEL=android-28 -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ON -DANDROID_STL=c++_static -DUSE_CPP_RPC=ON
make -jN tvm_runtime tvm_rpc
 
 where **N** is the number of cores available on your *CPU*.

[tvm] branch echuraev/remove_use_micro_from_doc created (now 6f0ed41828)

2022-11-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch echuraev/remove_use_micro_from_doc
in repository https://gitbox.apache.org/repos/asf/tvm.git


  at 6f0ed41828 [docs][Adreno] Remove unnecessary compilation flag

This branch includes the following new commits:

 new 6f0ed41828 [docs][Adreno] Remove unnecessary compilation flag

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

[tvm] branch main updated: [Adreno] Add documentation for Adreno deployment (#13393)

2022-11-28 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new c38a0c50a7 [Adreno] Add documentation for Adreno deployment (#13393)
c38a0c50a7 is described below

commit c38a0c50a7cde09d548f570f7aafa8e293ef1485
Author: dsbarinov1 <71228944+dsbarin...@users.noreply.github.com>
AuthorDate: Mon Nov 28 11:25:55 2022 +0300

[Adreno] Add documentation for Adreno deployment (#13393)

* [Adreno] Add documentation for Adreno deployment
Purpose:
Help TVM users compile and deploy on Adreno by expanding our documentation and providing sample scripts in TVM.

Information about PR:
This PR consists of three parts.

The first part is an introductory article on compilation and deployment of neural networks on Adreno, covering such topics as: «Build TVM for Android/Adreno», «Advantages of textures» and «Differences in compilation and deployment of models for Adreno devices».

The second part is a straightforward example script for compiling models and running inference at different precisions on Adreno devices.

The third part is auxiliary files, images, etc.

* Add correct links to images + small fixes

* Remove images (.png)

* Add request_hook in deploy_model_on_adreno.py

* Fix trailing newline + add license

* No newline at the EOF + blanks

* Fix request hook placing

* Fix style

* Fix trailing

* Fix whitespaces

* Fix whitespaces v2

* Add newline at adreno.rst EOF

* Add license to adreno.rst

* Remove sphinx 'autosectionlabel' extension + modify cross-references in docs to work without this extension

* Set default values to tracker_host and tracker_port

* Add local_demo to be able to autogenerate docs

* Fix quotes

* Fix benchmark

* .
---
 docs/how_to/deploy/adreno.rst  | 336 
 docs/how_to/deploy/index.rst   |   1 +
 .../how_to/deploy_models/deploy_model_on_adreno.py | 351 +
 3 files changed, 688 insertions(+)

diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst
new file mode 100644
index 00..af613aa5cb
--- /dev/null
+++ b/docs/how_to/deploy/adreno.rst
@@ -0,0 +1,336 @@
+..  Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+..  Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Deploy to Adreno GPU
+====================
+
+**Authors**: Daniil Barinov, Egor Churaev, Andrey Malyshev
+
+Introduction
+------------
+
+Adreno is a series of graphics processing unit (GPU) semiconductor
+intellectual property cores developed by Qualcomm and used in many of
+their SoCs.
+
+The Adreno GPU accelerates the rendering of complex geometries to
+deliver high-performance graphics and a rich user experience with low
+power consumption.
+
+This guide will demonstrate :ref:`the benefits of using textures with Adreno`,
+how to :ref:`build TVM with OpenCL` (needed by Adreno devices) and TVM RPC
+enabled. It will also provide :ref:`example code` to better understand the differences in compiling and deploying models
+for Adreno devices.
+
+.. _advantages_of_the_textures:
+
+Advantages of the Textures
+--------------------------
+
+One of the Adreno's advantages is the clever handling of textures. At
+the moment, TVM is able to benefit from this by having texture support
+for Adreno. The graph below shows the Adreno A5x architecture.
+
+|High-level overview of the Adreno A5x architecture for OpenCL|
+
+*Fig. 1 High-level overview of the Adreno A5x architecture for OpenCL*
+
+*source:* `OpenCL Optimization and Best Practices for Qualcomm Adreno GPUs <https://dl.acm.org/doi/10.1145/3204919.3204935>`_
+
+Reasons of using textures:
+
+-  Texture processor (TP) has a dedicated L1 cache, which is read-only cache and stores data
+   fetched from level-2 (L2) cache for texture operations (primary
+   reason)
+
+-  The handling of image boundaries is built-in.
+
+-  Supports num

[tvm] branch main updated (7cfa62e255 -> 3ad425fbdf)

2022-11-21 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 7cfa62e255 [Meta Schedule] Patch ICHECK for `target_has_vnni` to avoid seg fault (#13441)
 add 3ad425fbdf [Docker]Add privileged option for hardware with USB access testing (#13460)

No new revisions were added by this update.

Summary of changes:
 docker/bash.sh | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

[tvm] branch main updated (77f9c49b4e -> 1b3d77a897)

2022-11-21 Thread echuraev

This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from 77f9c49b4e [Fix][MetaSchedule] Param for rule AutoBind on Python side (#13454)
 add 1b3d77a897 [RPC] Fix tracker connection termination (#13420)

No new revisions were added by this update.

Summary of changes:
 python/tvm/rpc/base.py | 36 ++-
 python/tvm/rpc/proxy.py|  8 ++---
 python/tvm/rpc/server.py   |  4 +--
 python/tvm/rpc/tracker.py  |  2 +-
 ...mscript_ir_builder_base.py => test_rpc_base.py} | 42 +++---
 tests/python/unittest/test_runtime_rpc.py  | 13 ---
 6 files changed, 61 insertions(+), 44 deletions(-)
 copy tests/python/unittest/{test_tvmscript_ir_builder_base.py => test_rpc_base.py} (56%)
