(tvm) branch main updated: Restore "pytest.mark.gpu" for RELAX tests (#16741)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 11f2253b9c Restore "pytest.mark.gpu" for RELAX tests (#16741)
11f2253b9c is described below

commit 11f2253b9cc22ff354e7f13df2d5a55feae01259
Author: apeskov
AuthorDate: Tue Apr 23 11:22:55 2024 +0300

    Restore "pytest.mark.gpu" for RELAX tests (#16741)

    * [TEST] Mark RELAX GPU tests with pytest.mark.gpu

    A missing pytest.mark.gpu prevents tests from launching in CI.

    Signed-off-by: Alexander Peskov

    * fix

    Signed-off-by: Alexander Peskov

    * Check fp8 compute capability

    Signed-off-by: Alexander Peskov

    * fix func signature

    Signed-off-by: Alexander Peskov

    * lint

    Signed-off-by: Alexander Peskov

    -

    Signed-off-by: Alexander Peskov
    Co-authored-by: Alexander Peskov
---
 tests/python/relax/test_codegen_cublas.py | 10 ++
 tests/python/relax/test_codegen_cudnn.py | 9 +
 tests/python/relax/test_codegen_cutlass.py | 9 +
 tests/python/relax/test_codegen_tensorrt.py | 13 +++--
 tests/python/relax/test_contrib_vllm.py | 2 +-
 tests/python/relax/test_transform_codegen_pass.py | 10 ++
 6 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/tests/python/relax/test_codegen_cublas.py b/tests/python/relax/test_codegen_cublas.py
index 4f357626b8..ea0861467f 100644
--- a/tests/python/relax/test_codegen_cublas.py
+++ b/tests/python/relax/test_codegen_cublas.py
@@ -36,14 +36,7 @@ def reset_seed():
     np.random.seed(0)


-has_cublas = tvm.get_global_func("relax.ext.cublas", True)
-
-cublas_enabled = pytest.mark.skipif(
-    not has_cublas,
-    reason="CUBLAS not enabled.",
-)
-
-pytestmark = [cublas_enabled]
+pytestmark = tvm.testing.requires_cublas.marks()


 def build_and_run(mod, inputs_np, target, legalize=False, cuda_graph=False):
@@ -231,6 +224,7 @@ def test_matmul_igemm_offload(
     tvm.testing.assert_allclose(out, ref, rtol=1e-2, atol=1e-2)


+@tvm.testing.requires_cuda_compute_version(9)
 @pytest.mark.skipif(ml_dtypes is None, reason="requires ml_dtypes to be installed")
 @pytest.mark.parametrize(
     "x_shape, y_shape, transpose_y, out_dtype",
diff --git a/tests/python/relax/test_codegen_cudnn.py b/tests/python/relax/test_codegen_cudnn.py
index c913559232..f342705878 100644
--- a/tests/python/relax/test_codegen_cudnn.py
+++ b/tests/python/relax/test_codegen_cudnn.py
@@ -34,14 +34,7 @@ def reset_seed():
     np.random.seed(0)


-has_cudnn = tvm.get_global_func("relax.ext.cudnn", True)
-
-cudnn_enabled = pytest.mark.skipif(
-    not has_cudnn,
-    reason="cuDNN not enabled.",
-)
-
-pytestmark = [cudnn_enabled]
+pytestmark = tvm.testing.requires_cudnn.marks()


 _activation_table = {
diff --git a/tests/python/relax/test_codegen_cutlass.py b/tests/python/relax/test_codegen_cutlass.py
index fced7a84a8..57f47ca6e6 100644
--- a/tests/python/relax/test_codegen_cutlass.py
+++ b/tests/python/relax/test_codegen_cutlass.py
@@ -75,14 +75,7 @@ class Conv2dx2:
         return conv2


-has_cutlass = tvm.get_global_func("relax.ext.cutlass", True)
-
-cutlass_enabled = pytest.mark.skipif(
-    not has_cutlass,
-    reason="CUTLASS not enabled.",
-)
-
-pytestmark = [cutlass_enabled]
+pytestmark = tvm.testing.requires_cutlass.marks()


 def build_and_run(mod, inputs_np, target, legalize=True, cuda_graph=False):
diff --git a/tests/python/relax/test_codegen_tensorrt.py b/tests/python/relax/test_codegen_tensorrt.py
index 23dc7d887f..009bb24c63 100644
--- a/tests/python/relax/test_codegen_tensorrt.py
+++ b/tests/python/relax/test_codegen_tensorrt.py
@@ -43,13 +43,22 @@ class Conv2dResidualBlock:


 has_tensorrt = tvm.get_global_func("relax.ext.tensorrt", True)
+env_checker_runtime = tvm.get_global_func("relax.is_tensorrt_runtime_enabled", True)

-tensorrt_enabled = pytest.mark.skipif(
+requires_tensorrt_codegen = pytest.mark.skipif(
     not has_tensorrt,
     reason="TENSORRT not enabled.",
 )

-pytestmark = [tensorrt_enabled]
+requires_tensorrt_runtime = pytest.mark.skipif(
+    not env_checker_runtime or not env_checker_runtime(),
+    reason="TensorRT runtime not available",
+)
+
+pytestmark = [
+    requires_tensorrt_codegen,
+    requires_tensorrt_runtime,
+] + tvm.testing.requires_cuda.marks()


 def build_and_run(mod, inputs_np, target, legalize=False):
diff --git a/tests/python/relax/test_contrib_vllm.py b/tests/python/relax/test_contrib_vllm.py
index dd2149e572..f3c4839133 100644
--- a/tests/python/relax/test_contrib_vllm.py
+++ b/tests/python/relax/test_contrib_vllm.py
@@ -32,
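
Illustration (not part of the commit): the TensorRT hunk above skips tests when the runtime checker is either unregistered or reports "disabled", and then appends the CUDA marks to the module-level list. The sketch below mimics only that logic in plain Python. It assumes the lookup yields None when the function is not registered (the second argument to tvm.get_global_func in the diff suggests an allow-missing lookup); `runtime_missing`, `combine_marks`, and the string mark names are hypothetical stand-ins, not TVM or pytest APIs.

```python
def runtime_missing(checker):
    """Return True when tests should be skipped.

    `checker` stands in for the result of an allow-missing lookup:
    either None (function not registered) or a zero-argument callable
    that reports whether the runtime is enabled.
    """
    # Short-circuit: a missing checker is never called.
    return not checker or not checker()


def combine_marks(local_marks, shared_marks_fn):
    """Concatenate module-specific marks with a shared list of marks,
    mirroring `[...] + tvm.testing.requires_cuda.marks()` in the diff."""
    return list(local_marks) + list(shared_marks_fn())


if __name__ == "__main__":
    print(runtime_missing(None))           # checker not registered
    print(runtime_missing(lambda: False))  # registered, runtime disabled
    print(runtime_missing(lambda: True))   # registered, runtime enabled
    print(combine_marks(["codegen", "runtime"], lambda: ["cuda", "gpu"]))
```

Keeping the two conditions in one expression means the skip reason stays the same whether the checker is absent or simply returns false.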
(tvm) branch main updated: [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new c93f0bae9b [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846)
c93f0bae9b is described below

commit c93f0bae9bf9aa3bd42f3239d4e4a0f2da37ee84
Author: Egor Churaev
AuthorDate: Fri Apr 5 09:52:41 2024 +0300

    [Meta-Schedule][OpenCL] Enable MS tuning for Android OpenCL (#16846)

    Added OpenCL as a GPU target for Meta-Scheduler. Implemented export
    function for Android which can be used when MS builder is configured.
    Added an integration test which checks that MS tuning on Android GPU
    works fine.
---
 python/tvm/contrib/ndk.py | 12
 src/meta_schedule/utils.h | 3 +-
 tests/python/contrib/test_android/__init__.py | 18 ++
 .../python/contrib/test_android/infrastructure.py | 57 +
 .../contrib/test_android/test_meta_schedule.py | 71 ++
 5 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/python/tvm/contrib/ndk.py b/python/tvm/contrib/ndk.py
index 2a1105ed2b..14820c0ca8 100644
--- a/python/tvm/contrib/ndk.py
+++ b/python/tvm/contrib/ndk.py
@@ -22,7 +22,10 @@ import subprocess
 import os
 import shutil
 from typing import Dict
+import tempfile
+from pathlib import Path

+from .._ffi import register_func
 from .._ffi.base import py_str
 from . import utils as _utils, tar as _tar, cc as _cc
 from .cc import get_target_by_dump_machine
@@ -152,3 +155,12 @@ def get_global_symbol_section_map(path, *, nm=None) -> Dict[str, str]:
         base_path = os.path.dirname(compiler)
         nm = os.path.join(base_path, "llvm-nm")
     return _cc.get_global_symbol_section_map(path, nm=nm)
+
+
+@register_func("meta_schedule.builder.export_ndk")
+def _ndk_export(mod):
+    tmp_dir = tempfile.mkdtemp()
+    binary_name = "tmp_binary.so"
+    binary_path = Path(tmp_dir) / binary_name
+    mod.export_library(binary_path, fcompile=create_shared)
+    return str(binary_path)
diff --git a/src/meta_schedule/utils.h b/src/meta_schedule/utils.h
index 60840ca163..ceb0356cbc 100644
--- a/src/meta_schedule/utils.h
+++ b/src/meta_schedule/utils.h
@@ -513,7 +513,8 @@ inline void CloneRules(const SpaceGeneratorNode* src, SpaceGeneratorNode* dst) {

 /*! \brief Returns true if the given target is one of the supported gpu targets. */
 inline bool IsGPUTarget(const std::string& target_name) {
-  static const std::unordered_set<std::string> gpu_targets{"cuda", "rocm", "vulkan", "metal"};
+  static const std::unordered_set<std::string> gpu_targets{"cuda", "rocm", "vulkan", "metal",
+                                                           "opencl"};
   return gpu_targets.count(target_name);
 }

diff --git a/tests/python/contrib/test_android/__init__.py b/tests/python/contrib/test_android/__init__.py
new file mode 100644
index 0000000000..9669578bb7
--- /dev/null
+++ b/tests/python/contrib/test_android/__init__.py
@@ -0,0 +1,18 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+""" Testing infrastructure for Android """
diff --git a/tests/python/contrib/test_android/infrastructure.py b/tests/python/contrib/test_android/infrastructure.py
new file mode 100644
index 0000000000..b78d0bb40e
--- /dev/null
+++ b/tests/python/contrib/test_android/infrastructure.py
@@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the Lic
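
Illustration (not part of the commit): the `_ndk_export` hook in the ndk.py hunk above writes the built module into a fresh temporary directory and returns the path as a string. The sketch below reproduces that path handling without TVM. `FakeModule`, `export_to_temp`, and the `fcompile` argument being unused are assumptions made for the example; only `tempfile.mkdtemp` and `pathlib.Path` are real library calls.

```python
import tempfile
from pathlib import Path


class FakeModule:
    """Hypothetical stand-in for a built module; it only records the
    export by writing an empty file at the requested path."""

    def export_library(self, path, fcompile=None):
        Path(path).write_bytes(b"")


def export_to_temp(mod, fcompile=None, binary_name="tmp_binary.so"):
    """Export `mod` into a new temporary directory and return the
    resulting path as a string, following the shape of the hook above."""
    tmp_dir = tempfile.mkdtemp()               # unique directory per call
    binary_path = Path(tmp_dir) / binary_name  # <tmp_dir>/tmp_binary.so
    mod.export_library(binary_path, fcompile=fcompile)
    return str(binary_path)


if __name__ == "__main__":
    first = export_to_temp(FakeModule())
    second = export_to_temp(FakeModule())
    print(Path(first).name, first != second)
```

Because each call creates its own directory, two exports never collide on the fixed file name. Nothing in this sketch removes the directory afterwards, so a caller would have to clean it up.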
(tvm) branch main updated: [RUNTIME][OPENCL] Bugfix for ciImage create with host ptr (#16768)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 77a7b01081 [RUNTIME][OPENCL] Bugfix for ciImage create with host ptr (#16768)
77a7b01081 is described below

commit 77a7b010817d2d8fbdf89223bb814e9c38f68365
Author: Siva
AuthorDate: Sat Mar 23 20:23:36 2024 +0530

    [RUNTIME][OPENCL] Bugfix for ciImage create with host ptr (#16768)

    Added couple more tests for host ptr data validation
---
 src/runtime/opencl/opencl_device_api.cc | 2 +-
 tests/cpp-runtime/opencl/opencl_nativeptr.cc | 40 +++-
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/src/runtime/opencl/opencl_device_api.cc b/src/runtime/opencl/opencl_device_api.cc
index 96ec8ed69f..ab553052bb 100644
--- a/src/runtime/opencl/opencl_device_api.cc
+++ b/src/runtime/opencl/opencl_device_api.cc
@@ -294,7 +294,7 @@ cl_mem OpenCLWorkspace::AllocTexture(Device dev, size_t width, size_t height,
   cl_channel_type cl_type = DTypeToOpenCLChannelType(type_hint);
   cl_image_format format = {CL_RGBA, cl_type};
   cl_image_desc descriptor = {CL_MEM_OBJECT_IMAGE2D, width, height, 0, 0, 0, 0, 0, 0};
-  cl_mem mptr = clCreateImage(this->contexts[platform], CL_MEM_CREATE_FLAGS, &format, &descriptor,
+  cl_mem mptr = clCreateImage(this->contexts[platform], CL_MEM_READ_WRITE, &format, &descriptor,
                               nullptr, &err_code);
   OPENCL_CHECK_ERROR(err_code);
   return mptr;
diff --git a/tests/cpp-runtime/opencl/opencl_nativeptr.cc b/tests/cpp-runtime/opencl/opencl_nativeptr.cc
index ebfb62e920..8f894c4bff 100644
--- a/tests/cpp-runtime/opencl/opencl_nativeptr.cc
+++ b/tests/cpp-runtime/opencl/opencl_nativeptr.cc
@@ -20,17 +20,55 @@
 #include
 #include

+#include
+#include
+
 #include "../src/runtime/opencl/opencl_common.h"

 using namespace tvm::runtime;
 using namespace tvm::runtime::cl;

 #if defined(OPENCL_ENABLE_HOST_PTR)
-TEST(OpenCLNDArray, native_ptr) {
+TEST(OpenCLNativePtr, access_memory) {
   OpenCLWorkspace* workspace = OpenCLWorkspace::Global();

   auto A = tvm::runtime::NDArray::Empty({128, 128}, {kDLFloat, 32, 1}, {kDLOpenCL, 0});
   void* nptr = workspace->GetNativePtr(A);
   memset(nptr, 0x0, 128 * 128 * 4);
 }
+
+TEST(OpenCLNatvePtr, data_loop) {
+  OpenCLWorkspace* workspace = OpenCLWorkspace::Global();
+
+  auto cl_arr = tvm::runtime::NDArray::Empty({1024}, {kDLFloat, 32, 1}, {kDLOpenCL, 0});
+  auto cpu_arr = tvm::runtime::NDArray::Empty({1024}, {kDLFloat, 32, 1}, {kDLCPU, 0});
+
+  std::random_device rdev;
+  std::mt19937 mt(rdev());
+  std::uniform_real_distribution<> random(-10.0, 10.0);
+
+  // Random initialize host ndarray
+  for (size_t i = 0; i < 1024; i++) {
+    static_cast<float*>(cpu_arr->data)[i] = random(mt);
+  }
+  // Do a roundtrip from cpu arr to opencl array and native ptr.
+  cpu_arr.CopyTo(cl_arr);
+  void* nptr = workspace->GetNativePtr(cl_arr);
+  for (size_t i = 0; i < 1024; ++i) {
+    ICHECK_LT(std::fabs(static_cast<float*>(cpu_arr->data)[i] - static_cast<float*>(nptr)[i]),
+              1e-5);
+  }
+
+  // Random initialize cl ndarray
+  for (size_t i = 0; i < 1024; i++) {
+    static_cast<float*>(nptr)[i] = random(mt);
+  }
+  // Do a roundtrip from native ptr to cl arr to cpu array.
+  cl_arr.CopyTo(cpu_arr);
+  for (size_t i = 0; i < 1024; ++i) {
+    ICHECK_LT(std::fabs(static_cast<float*>(cpu_arr->data)[i] - static_cast<float*>(nptr)[i]),
+              1e-5);
+  }
+}
+
 #endif
(tvm) branch main updated: [Relay][ONNX] Fix the attribute mode parse of operator Upsample (#16622)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 72ce7013e4 [Relay][ONNX] Fix the attribute mode parse of operator Upsample (#16622)
72ce7013e4 is described below

commit 72ce7013e46c432dd1f8c3e1ec862a1e72b9798e
Author: Qingchao Shen
AuthorDate: Fri Feb 23 15:54:50 2024 +0800

    [Relay][ONNX] Fix the attribute mode parse of operator Upsample (#16622)

    * add the default value for mode attribute of Upsample

    * Update test_forward.py

    * Update onnx.py

    * Update test_forward.py
---
 python/tvm/relay/frontend/onnx.py | 2 +-
 tests/python/frontend/onnx/test_forward.py | 22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/frontend/onnx.py b/python/tvm/relay/frontend/onnx.py
index 3023cd039c..b95afae1d1 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -2392,7 +2392,7 @@ class Upsample(OnnxOpConverter):
         if not isinstance(scales, _expr.Expr):
             assert scales[0] == 1.0 and scales[1] == 1.0

-        mode = attr.get("mode")
+        mode = attr.get("mode", b"nearest")
         if mode == b"nearest":
             method = "nearest_neighbor"
         elif mode == b"linear":
diff --git a/tests/python/frontend/onnx/test_forward.py b/tests/python/frontend/onnx/test_forward.py
index cfa30ad346..543aa7f518 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -1726,6 +1726,27 @@ def test_upsample_nearest(target, dev):
     verify_with_ort_with_inputs(model, [in_array], [out_shape], opset=7, target=target, dev=dev)


+@tvm.testing.parametrize_targets
+def test_upsample_nearest_default(target, dev):
+    """test_upsample_nearest_default"""
+    scale = 2
+    in_shape = (1, 1, 3, 3)
+    out_shape = (1, 1, 3 * scale, 3 * scale)
+    y = helper.make_node("Upsample", ["in"], ["out"], scales=[1.0, 1.0, 2.0, 2.0])
+
+    in_array = np.random.uniform(size=in_shape).astype(np.float32)
+
+    graph = helper.make_graph(
+        [y],
+        "upsample_nearest_test",
+        inputs=[helper.make_tensor_value_info("in", TensorProto.FLOAT, list(in_shape))],
+        outputs=[helper.make_tensor_value_info("out", TensorProto.FLOAT, list(out_shape))],
+    )
+
+    model = helper.make_model(graph, producer_name="upsample_nearest_test")
+    verify_with_ort_with_inputs(model, [in_array], [out_shape], opset=7, target=target, dev=dev)
+
+
 @tvm.testing.parametrize_targets
 def test_upsample3d_nearest(target, dev):
     """test_upsample3d_nearest"""
@@ -5708,6 +5729,7 @@ unsupported_onnx_tests = [
     "test_unique_sorted_with_axis_3d",
     "test_unique_sorted_with_negative_axis",
     "test_upsample_nearest",
+    "test_upsample_nearest_default",
 ]
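
Illustration (not part of the commit): the one-line Upsample fix above relies on `dict.get` with a default. Without the default, a node that omits `mode` yields None, which matches neither the `b"nearest"` nor the `b"linear"` branch. The sketch below shows that behaviour on a plain dict. `pick_method` is a hypothetical helper, and the value returned for `b"linear"` is a placeholder label because the diff does not show what the converter assigns there.

```python
def pick_method(attr, default=None):
    """Map an ONNX-style `mode` attribute (bytes) to a method name.

    With default=None this mirrors the old lookup; passing
    default=b"nearest" mirrors the fixed lookup.
    """
    mode = attr.get("mode", default)
    if mode == b"nearest":
        return "nearest_neighbor"
    if mode == b"linear":
        # Placeholder label: the converter's actual value for this
        # branch is not visible in the diff above.
        return "linear"
    return None  # no branch matched, e.g. the attribute was absent


if __name__ == "__main__":
    node_without_mode = {}  # attribute omitted, as in the new test
    print(pick_method(node_without_mode))
    print(pick_method(node_without_mode, default=b"nearest"))
    print(pick_method({"mode": b"linear"}, default=b"nearest"))
```

An explicit `mode` still wins over the default, so nodes that set the attribute behave as before.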
(tvm) branch main updated: [Relay][ONNX] Fix the Resize operator in ONNX frontend (#16626)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new aa55528714 [Relay][ONNX] Fix the Resize operator in ONNX frontend (#16626)
aa55528714 is described below

commit aa5552871415409d2696bc5864535c910ee12018
Author: Qingchao Shen
AuthorDate: Fri Feb 23 15:54:04 2024 +0800

    [Relay][ONNX] Fix the Resize operator in ONNX frontend (#16626)

    * Update onnx.py

    * Update test_forward.py
---
 python/tvm/relay/frontend/onnx.py | 4 ++--
 tests/python/frontend/onnx/test_forward.py | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/onnx.py b/python/tvm/relay/frontend/onnx.py
index ddd0d34c5c..3023cd039c 100644
--- a/python/tvm/relay/frontend/onnx.py
+++ b/python/tvm/relay/frontend/onnx.py
@@ -3932,7 +3932,7 @@ class Resize(OnnxOpConverter):

     @classmethod
     def _impl_v10(cls, inputs, attr, params):
-        mode = attr.get("mode").decode("ascii")
+        mode = attr.get("mode", b"nearest").decode("ascii")
         if mode == "nearest":
             method = "nearest_neighbor"
         elif mode == "linear":
@@ -4007,7 +4007,7 @@ class Resize(OnnxOpConverter):
         if roi is not None and infer_shape(roi)[0] == 0:
             roi = None
         ndims = len(infer_shape(inputs[0]))
-        mode = attr.get("mode").decode("ascii")
+        mode = attr.get("mode", b"nearest").decode("ascii")
         if mode == "nearest":
             method = "nearest_neighbor"
         elif mode == "linear":
diff --git a/tests/python/frontend/onnx/test_forward.py b/tests/python/frontend/onnx/test_forward.py
index 51748462d0..cfa30ad346 100644
--- a/tests/python/frontend/onnx/test_forward.py
+++ b/tests/python/frontend/onnx/test_forward.py
@@ -4503,6 +4503,7 @@ def test_resize(target, dev):
             # scales are specified instead of sizes
             verify([1, 16] + [32] * ndim, [], [1, 1] + [0.5] * ndim, method, coord_trans)
             verify([1, 16] + [32] * ndim, [], [1, 1] + [2] * ndim, method, coord_trans)
+            verify([1, 16] + [32] * ndim, [], [1, 1] + [2] * ndim, None, coord_trans)

         method = "linear"
         # upsampling
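
Illustration (not part of the commit): in the Resize hunks above the old code chained `.decode("ascii")` directly onto `attr.get("mode")`, so a node without a `mode` attribute would call `.decode` on None. The sketch below reproduces that failure and the fixed lookup on a plain dict; `decode_mode` is a hypothetical helper written for this example only.

```python
def decode_mode(attr, default=None):
    """Look up the bytes-valued `mode` attribute and decode it to str.

    default=None mirrors the old code path; default=b"nearest"
    mirrors the fixed one.
    """
    return attr.get("mode", default).decode("ascii")


if __name__ == "__main__":
    node_without_mode = {}

    # Old behaviour: None has no .decode(), so the lookup fails outright.
    try:
        decode_mode(node_without_mode)
    except AttributeError as err:
        print("old lookup failed:", type(err).__name__)

    # Fixed behaviour: the bytes default is decoded like any other value.
    print(decode_mode(node_without_mode, default=b"nearest"))
    print(decode_mode({"mode": b"linear"}, default=b"nearest"))
```

The default has to be a bytes literal, not a str, because the result is decoded immediately afterwards.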
(tvm) branch main updated: [OpenCL][CMake] Fix OpenCL tests compilation (#16394)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new fe9814c73e [OpenCL][CMake] Fix OpenCL tests compilation (#16394)
fe9814c73e is described below

commit fe9814c73e24df3ecd031b216492fb555a1ab95a
Author: Egor Churaev
AuthorDate: Mon Jan 15 09:46:47 2024 +0300

    [OpenCL][CMake] Fix OpenCL tests compilation (#16394)

    [OpenCL] Fix OpenCL tests compilation

    Found a problem: when TVM is built with OpenCL tests from inside a
    different CMake project (not TVM), `CMAKE_SOURCE_DIR` points to the
    directory of that project's `CMakeLists.txt` (not TVM's), and the build
    fails with the following error:
    `No SOURCES given to target: opencl-cpptest`.

    To be consistent with the code style in `OpenCL.cmake`, I removed the
    usage of the `CMAKE_SOURCE_DIR` variable. This also fixes the issue when
    TVM's cmake is called from a directory with another cmake project.
---
 cmake/modules/OpenCL.cmake | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake
index 2dc1fc18f3..ddcd1e4190 100644
--- a/cmake/modules/OpenCL.cmake
+++ b/cmake/modules/OpenCL.cmake
@@ -81,7 +81,7 @@ if(USE_OPENCL)
   if(Build_OpenCL_GTests)
     message(STATUS "Building OpenCL-Gtests")
     tvm_file_glob(GLOB_RECURSE OPENCL_TEST_SRCS
-      "${CMAKE_SOURCE_DIR}/tests/cpp-runtime/opencl/*.cc"
+      "tests/cpp-runtime/opencl/*.cc"
     )
     add_executable(opencl-cpptest ${OPENCL_TEST_SRCS})
     target_link_libraries(opencl-cpptest PRIVATE gtest_main tvm_runtime)
(tvm) branch echuraev/fix_cmake_issue_for_opencl_tests created (now 4eb9d880fd)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch echuraev/fix_cmake_issue_for_opencl_tests
in repository https://gitbox.apache.org/repos/asf/tvm.git

      at 4eb9d880fd [OpenCL] Fix OpenCL tests compilation

This branch includes the following new commits:

     new 4eb9d880fd [OpenCL] Fix OpenCL tests compilation

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
(tvm) 01/01: [OpenCL] Fix OpenCL tests compilation
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch echuraev/fix_cmake_issue_for_opencl_tests
in repository https://gitbox.apache.org/repos/asf/tvm.git

commit 4eb9d880fd049730f452a651419bf73b7abdab1b
Author: Egor Churaev
AuthorDate: Fri Jan 12 14:56:10 2024 +0300

    [OpenCL] Fix OpenCL tests compilation

    Found a problem: when TVM is built with OpenCL tests from inside a
    different CMake project (not TVM), `CMAKE_SOURCE_DIR` points to the
    directory of that project's `CMakeLists.txt` (not TVM's), and the build
    fails with the following error:
    `No SOURCES given to target: opencl-cpptest`.

    To be consistent with the code style in `OpenCL.cmake`, I removed the
    usage of the `CMAKE_SOURCE_DIR` variable. This also fixes the issue when
    TVM's cmake is called from a directory with another cmake project.
---
 cmake/modules/OpenCL.cmake | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake
index 2dc1fc18f3..ddcd1e4190 100644
--- a/cmake/modules/OpenCL.cmake
+++ b/cmake/modules/OpenCL.cmake
@@ -81,7 +81,7 @@ if(USE_OPENCL)
   if(Build_OpenCL_GTests)
     message(STATUS "Building OpenCL-Gtests")
     tvm_file_glob(GLOB_RECURSE OPENCL_TEST_SRCS
-      "${CMAKE_SOURCE_DIR}/tests/cpp-runtime/opencl/*.cc"
+      "tests/cpp-runtime/opencl/*.cc"
     )
     add_executable(opencl-cpptest ${OPENCL_TEST_SRCS})
     target_link_libraries(opencl-cpptest PRIVATE gtest_main tvm_runtime)
(tvm) branch main updated: [TOPI][ADRENO] Add conv2d transpose nchw texture schedule (#15786)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 015da7c185 [TOPI][ADRENO] Add conv2d transpose nchw texture schedule (#15786)
015da7c185 is described below

commit 015da7c185199add4af943970b6ee3d0a0661ec4
Author: krishnaraj36
AuthorDate: Wed Nov 15 11:37:40 2023 +0530

    [TOPI][ADRENO] Add conv2d transpose nchw texture schedule (#15786)

    * [TOPI][ADRENO] Add conv2d transpose nchw texture schedule

    Added the conv2d transpose strategy for adreno target and enable the
    optimized schedule.

    * Fix the whitespace lint error

    * Fix lint errors

    * Fix whitespace lint error

    * Removed unused variables

    * Add more conv2dTranspose testcases

    * empty update

    empty update for retrigger ci

    * Update test_conv2d_transpose_nchw_texture.py

    * Added more testcase to check memory scopes

    * Device specific alter_op_layout for conv2d_transpose

    * Fix in virtual device setup and added test case with scope check

    * Add the comment conv2d algo

    * Add the comment conv2d algo

    * Removed fp16 test case from texture

    It is failing for few gpu devices.

    * remove opencl config change for mainline conflict

    * Add the test case for 3 channel input which run with cuda schedule

    * Fix in op strategy for out channel 3

    * Comment in test case for memory scope

    -

    Co-authored-by: Siva
---
 python/tvm/relay/op/nn/_nn.py | 6 +
 python/tvm/relay/op/strategy/adreno.py | 52 +++
 python/tvm/topi/adreno/__init__.py | 2 +
 .../tvm/topi/adreno/conv2d_transpose_alter_op.py | 121 ++
 python/tvm/topi/adreno/conv2d_transpose_nchw.py | 412 +
 python/tvm/topi/adreno/utils.py | 23 ++
 python/tvm/topi/nn/conv2d.py | 23 ++
 src/relay/transforms/annotate_texture_storage.cc | 4 +
 .../test_conv2d_transpose_nchw_texture.py | 325
 .../relay/opencl_texture/utils/adreno_utils.py | 5 +-
 10 files changed, 972 insertions(+), 1 deletion(-)

diff --git a/python/tvm/relay/op/nn/_nn.py b/python/tvm/relay/op/nn/_nn.py
index c68685f0ae..6acaf43fe7 100644
--- a/python/tvm/relay/op/nn/_nn.py
+++ b/python/tvm/relay/op/nn/_nn.py
@@ -335,6 +335,12 @@ def legalize_conv2d_transpose(attrs, inputs, types):
     return topi.nn.conv2d_transpose_legalize(attrs, inputs, types)


+@reg.register_alter_op_layout("nn.conv2d_transpose")
+def alter_op_layout_conv2d_transpose(attrs, inputs, tinfos, out_type):
+    """Alternate the layout of conv2d_transpose"""
+    return topi.nn.conv2d_transpose_alter_layout(attrs, inputs, tinfos, out_type)
+
+
 @reg.register_convert_op_layout("nn.conv2d_transpose")
 def convert_conv2d_transpose(attrs, inputs, tinfos, desired_layouts):
     """Convert Layout pass registration for conv2d_transpose op.
diff --git a/python/tvm/relay/op/strategy/adreno.py b/python/tvm/relay/op/strategy/adreno.py
index c180eeec74..bacace9ad4 100644
--- a/python/tvm/relay/op/strategy/adreno.py
+++ b/python/tvm/relay/op/strategy/adreno.py
@@ -215,6 +215,58 @@ def conv2d_winograd_without_weight_transform_strategy_adreno(attrs, inputs, out_
     return strategy


+@conv2d_transpose_strategy.register("adreno")
+def conv2d_transpose_strategy_adreno(attrs, inputs, out_type, target):
+    """conv2d_transpose adreno strategy"""
+    strategy = _op.OpStrategy()
+    _, kernel = inputs
+    dilation = attrs.get_int_tuple("dilation")
+    groups = attrs.groups
+    data_layout = attrs.data_layout
+    kernel_layout = attrs.kernel_layout
+    assert dilation == (1, 1), "not support dilate now"
+
+    if (groups == 1) and (
+        (data_layout == "NCHW" and kernel_layout == "IOHW")
+        or (data_layout == "NCHW4c" and kernel_layout == "IOHW4o")
+        or (data_layout == "NCHW" and kernel_layout == "IOHW4o")
+    ):
+        if len(kernel.shape) == 4:
+            _, oc, _, _ = get_const_tuple(kernel.shape)
+        else:
+            _, oc, _, _, _ = get_const_tuple(kernel.shape)
+        # We cannot use textures for case than number of channels is less than 4.
+        # So, we use compute functions from cuda.
+        if len(kernel.shape) == 4 and oc < 4:
+            strategy.add_implementation(
+                wrap_compute_conv2d_transpose(topi.cuda.conv2d_transpose_nchw),
+                wrap_topi_schedule(topi.cuda.schedule_conv2d_transpose_nchw),
+                name="conv2d_transpose_nchw.cuda"
[tvm] branch main updated: [CI][BUGFIX] Better to pass the build folder (#15965)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 6a8cb323a0 [CI][BUGFIX] Better to pass the build folder (#15965)
6a8cb323a0 is described below

commit 6a8cb323a0a35fa1fd6b185b268ca2e9efb8d712
Author: Siva
AuthorDate: Mon Oct 23 12:22:43 2023 +0530

    [CI][BUGFIX] Better to pass the build folder (#15965)

    build_dir is not passed to allow custom build folders
---
 tests/scripts/ci.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/scripts/ci.py b/tests/scripts/ci.py
index 1c9981bcb2..1add6bcde3 100755
--- a/tests/scripts/ci.py
+++ b/tests/scripts/ci.py
@@ -593,7 +593,7 @@ generated = [
             "run unit tests",
             [
                 "./tests/scripts/task_java_unittest.sh",
-                "./tests/scripts/task_opencl_cpp_unittest.sh",
+                "./tests/scripts/task_opencl_cpp_unittest.sh {build_dir}",
                 "./tests/scripts/task_python_unittest_gpuonly.sh",
                 "./tests/scripts/task_python_integration_gpuonly.sh",
             ],
[tvm] branch main updated: [Fix] Remove duplicated words from comments, NFC (#15873)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 958c27123a [Fix] Remove duplicated words from comments, NFC (#15873)
958c27123a is described below

commit 958c27123a45a9629e57cee20dbca28263c836bd
Author: Krzysztof Parzyszek
AuthorDate: Thu Oct 5 00:37:35 2023 -0500

    [Fix] Remove duplicated words from comments, NFC (#15873)

    Removed instances of accidentally repeated words from comments. There
    are cases where duplicated words appear legitimately, those cases remain
    unmodified.
---
 docs/Doxyfile | 2 +-
 docs/conf.py | 4 ++--
 include/tvm/runtime/logging.h | 2 +-
 include/tvm/runtime/ndarray.h | 2 +-
 include/tvm/runtime/packed_func.h | 2 +-
 include/tvm/tir/expr.h | 2 +-
 include/tvm/tir/stmt.h | 2 +-
 python/tvm/relay/op/contrib/clml.py | 2 +-
 python/tvm/relay/transform/memory_plan.py | 2 +-
 python/tvm/runtime/ndarray.py | 2 +-
 python/tvm/te/hybrid/__init__.py | 2 +-
 python/tvm/te/schedule.py | 2 +-
 python/tvm/topi/arm_cpu/qnn.py | 2 +-
 src/arith/const_fold.h | 4 ++--
 src/arith/product_normal_form.h | 4 ++--
 src/relay/collage/partition_rule.h | 2 +-
 src/relay/transforms/combine_parallel_op_batch.h | 2 +-
 src/runtime/c_runtime_api.cc | 2 +-
 src/runtime/crt/aot_executor/aot_executor.c | 2 +-
 src/target/source/ptx.h | 2 +-
 src/tir/analysis/control_flow_graph.h | 2 +-
 src/tir/schedule/error.h | 2 +-
 src/tir/transforms/unroll_loop.cc | 2 +-
 tests/python/contrib/test_arm_compute_lib/infrastructure.py | 2 +-
 tests/python/contrib/test_hexagon/conv2d/test_conv2d_conv2d.md | 2 +-
 tests/python/contrib/test_hexagon/test_benchmark_elemwise_add.py | 2 +-
 tests/python/relay/test_pass_plan_devices.py | 2 +-
 tests/python/unittest/test_tir_transform_remove_no_op.py | 2 +-
 web/src/environment.ts | 2 +-
 29 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/docs/Doxyfile b/docs/Doxyfile
index 8615f5b2ae..d1ea82c842 100644
--- a/docs/Doxyfile
+++ b/docs/Doxyfile
@@ -306,7 +306,7 @@ EXTENSION_MAPPING =

 # When enabled doxygen tries to link words that correspond to documented
 # classes, or namespaces to their corresponding documentation. Such a link can
-# be prevented in individual cases by by putting a % sign in front of the word
+# be prevented in individual cases by putting a % sign in front of the word
 # or globally by setting AUTOLINK_SUPPORT to NO.
 # The default value is: YES.

diff --git a/docs/conf.py b/docs/conf.py
index 330ae919a6..b1ecb37c97 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -244,7 +244,7 @@ INSTALL_TVM_CUDA_DEV = f"""\
 # Installs the latest dev build of TVM from PyPI, with CUDA enabled. To use this,
 # you must request a Google Colab instance with a GPU by going to Runtime ->
 # Change runtime type -> Hardware accelerator -> GPU. If you wish to build from
-# source, see see https://tvm.apache.org/docs/install/from_source.html
+# source, see https://tvm.apache.org/docs/install/from_source.html
 pip install tlcpack-nightly-cu113 --pre -f https://tlcpack.ai/wheels"""

 INSTALL_TVM_CUDA_FIXED = f"""\
@@ -252,7 +252,7 @@ INSTALL_TVM_CUDA_FIXED = f"""\
 # Installs TVM version {version} from PyPI, with CUDA enabled. To use this,
 # you must request a Google Colab instance with a GPU by going to Runtime ->
 # Change runtime type -> Hardware accelerator -> GPU. If you wish to build from
-# source, see see https://tvm.apache.org/docs/install/from_source.html
+# source, see https://tvm.apache.org/docs/install/from_source.html
 pip install apache-tvm-cu113=={version} -f https://tlcpack.ai/wheels"""


diff --git a/include/tvm/runtime/logging.h b/include/tvm/runtime/logging.h
index 16a176a801..45c390df1d 100644
--- a/include/tvm/runtime/logging.h
+++ b/include/tvm/runtime/logging.h
@@ -113,7 +113,7 @@
  * in a function, or 'continue' or 'break' in a loop)
  * The default behavior when quit_on_assertion is false,
[tvm] branch main updated: [Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 28908998e0 [Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829) 28908998e0 is described below commit 28908998e0c55025a89e8e2bd26a3fe3e6c84356 Author: Qingchao Shen AuthorDate: Fri Sep 29 15:54:23 2023 +0800 [Relay][Keras][Bugfix] fix the converters of GRU and SimpleRNN about the go_backwards attribute (#15829) * fix bug in gru and simpleRNN about go_backwards * Update test_forward.py * Update keras.py --- python/tvm/relay/frontend/keras.py | 4 tests/python/frontend/keras/test_forward.py | 12 2 files changed, 16 insertions(+) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index 9e09cb400a..6c82ebb427 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -1062,6 +1062,8 @@ def _convert_simple_rnn( in_bias = etab.new_const(weightList[2]) assert len(in_data.type_annotation.shape) == 3 timeDim = in_data.type_annotation.shape[1].value +if keras_layer.go_backwards: +in_data = _op.reverse(in_data, axis=1) in_data_split = _op.split(in_data, indices_or_sections=timeDim, axis=1) for i in range(len(in_data_split)): in_data_split_i = _op.nn.batch_flatten(in_data_split[i]) @@ -1090,6 +1092,8 @@ def _convert_gru( recurrent_weight = etab.new_const(weightList[1].transpose([1, 0])) if keras_layer.use_bias: in_bias = etab.new_const(weightList[2]) +if keras_layer.go_backwards: +in_data = _op.reverse(in_data, axis=1) units = list(weightList[0].shape)[1] assert units > 0, "The value of units must be a positive integer" in_data = _op.nn.batch_flatten(in_data) diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index ba3880e186..8c5b578060 100644 --- 
a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -568,12 +568,23 @@ class TestKeras: keras_mod.layers.SimpleRNN( units=16, return_state=False, activation="tanh", use_bias=False ), +keras_mod.layers.SimpleRNN( +units=16, return_state=False, activation="tanh", go_backwards=True +), +keras_mod.layers.GRU( +units=16, +return_state=False, +recurrent_activation="sigmoid", +activation="tanh", +reset_after=False, +), keras_mod.layers.GRU( units=16, return_state=False, recurrent_activation="sigmoid", activation="tanh", reset_after=False, +use_bias=False, ), keras_mod.layers.GRU( units=16, @@ -582,6 +593,7 @@ class TestKeras: activation="tanh", reset_after=False, use_bias=False, +go_backwards=True, ), ] for rnn_func in rnn_funcs:
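Editor's note on the go_backwards change above: the converters now reverse the input along axis 1 (the time axis) before the recurrence runs. Below is a minimal pure-Python sketch of why that matters. The names `reverse_time` and `running_decay` are hypothetical illustrations, not TVM or Keras functions, and the toy recurrence only stands in for a real RNN cell.

```python
def reverse_time(batch):
    """Reverse the time axis (axis=1) of a nested list shaped [batch][time][features]."""
    return [list(reversed(sequence)) for sequence in batch]


def running_decay(sequence, decay=0.5):
    """Toy order-sensitive recurrence: h_t = decay * h_{t-1} + x_t[0]."""
    h = 0.0
    for x_t in sequence:
        h = decay * h + x_t[0]
    return h


batch = [[[1.0], [2.0]]]              # one sample, two time steps, one feature
forward = running_decay(batch[0])
backward = running_decay(reverse_time(batch)[0])
print(forward, backward)              # the two results differ because order matters
```

Because a recurrence is order-sensitive, a converter that ignores go_backwards produces a different final state than Keras does, which is the mismatch the commit addresses.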
[tvm] branch main updated: [ADRENO] Minor changes for Adreno docs and help scripts (#15830)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 9d8e6fda50 [ADRENO] Minor changes for Adreno docs and help scripts (#15830) 9d8e6fda50 is described below commit 9d8e6fda50bce14bf597de1f87711230e6001e4e Author: Siva AuthorDate: Thu Sep 28 15:38:30 2023 +0530 [ADRENO] Minor changes for Adreno docs and help scripts (#15830) [ADRENO] Minor changes for Adreno docs and help scripts NCHW is mandatory layout for CLML offload. Updated the docs. CI scripts will keep OpenCL enbaled as fallback always. Enable configurable device bind ports. Helps in multi user environments. --- docs/how_to/deploy/adreno.rst | 2 +- tests/scripts/setup-adreno-env.sh | 18 +- tests/scripts/task_config_build_adreno.sh | 2 ++ 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst index ed016a3ff7..f0b8c6f757 100644 --- a/docs/how_to/deploy/adreno.rst +++ b/docs/how_to/deploy/adreno.rst @@ -432,7 +432,7 @@ as the OpenCL path is fallback option for any operator didn't go through OpenCLM python3 -m tvm.driver.tvmc compile \ --cross-compiler ${ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang \ - --target="opencl, clml, llvm" --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \ + --target="opencl, clml, llvm" --desired-layout NCHW --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \ --tuning-records keras-resnet50.log -o keras-resnet50.tar resnet50.h5 On successful compilation, above command produce ``keras-resnet50.tar``. 
diff --git a/tests/scripts/setup-adreno-env.sh b/tests/scripts/setup-adreno-env.sh index 55a92c5f61..15c124a0f0 100755 --- a/tests/scripts/setup-adreno-env.sh +++ b/tests/scripts/setup-adreno-env.sh @@ -20,12 +20,13 @@ ENVIRONMENT="" RPC_PORT="" ADB_SERIAL="" +LISTEN_PORT=5000 function usage() { echo "Helper script to setup the environment for Tracker, RPC Device and for application" echo "Usage (Help) : source setup-adreno-env.sh -h" echo "Usage (Tracker): source setup-adreno-env.sh -e tracker -p " -echo "Usage (Device): source setup-adreno-env.sh -e device -p -d " +echo "Usage (Device): source setup-adreno-env.sh -e device -p -d [-l ]" echo "Usage (Query): source setup-adreno-env.sh -e query -p " } @@ -46,6 +47,11 @@ while [[ $# -gt 0 ]]; do shift # past argument shift # past value ;; +-l|--listen-port) + LISTEN_PORT="$2" + shift # past argument + shift # past value + ;; -h|--help) usage return 0 @@ -62,6 +68,7 @@ done echo "ENVIRONMENT = ${ENVIRONMENT}" echo "RPC_PORT = ${RPC_PORT}" echo "ADB_SERIAL= ${ADB_SERIAL}" +echo "DEVICE LISTEN POPRT= ${LISTEN_PORT}" function def_environment() { @@ -100,10 +107,11 @@ case ${ENVIRONMENT} in fi adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT} -adb forward tcp:5000 tcp:5000 -adb forward tcp:5001 tcp:5001 -adb forward tcp:5002 tcp:5002 -adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" +adb forward tcp:${LISTEN_PORT} tcp:${LISTEN_PORT} +adb forward tcp:$((LISTEN_PORT + 1)) tcp:$((LISTEN_PORT + 1)) +adb forward tcp:$((LISTEN_PORT + 2)) tcp:$((LISTEN_PORT + 2)) +adb forward tcp:$((LISTEN_PORT + 3)) tcp:$((LISTEN_PORT + 3)) +adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=${LISTEN_PORT} --port-end=$((LISTEN_PORT + 10)) 
--tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" ;; "query") diff --git a/tests/scripts/task_config_build_adreno.sh b/tests/scripts/task_config_build_adreno.sh index 62e6ffecbc..1b6750f165 100755 --- a/tests/scripts/task_config_build_adreno.sh +++ b/tests/scripts/task_config_build_adreno.sh @@ -25,6 +25,8 @@ cp ../cmake/config.cmake . if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake +else +echo set\(USE_OPENCL ON\) >> config.cmake fi echo set\(USE_RPC ON\) >> config.cmake echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
[tvm] branch main updated: fix _convert_simple_rnn (#15723)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 7fd4704003 fix _convert_simple_rnn (#15723) 7fd4704003 is described below commit 7fd4704003dec853abbfc15a47a0d07d941b7a8a Author: Haoyang AuthorDate: Wed Sep 13 12:53:41 2023 +0800 fix _convert_simple_rnn (#15723) * fix _convert_simple_rnn * fix _convert_simple_rnn * fix errors in the last pr --- python/tvm/relay/frontend/keras.py | 25 ++--- tests/python/frontend/keras/test_forward.py | 11 +++ 2 files changed, 25 insertions(+), 11 deletions(-) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index 205b2be490..9e09cb400a 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -1052,23 +1052,26 @@ def _convert_simple_rnn( inexpr = [inexpr, prev_op] in_data = inexpr[0] prev_op = inexpr[1] +prev_op = _op.nn.batch_flatten(prev_op) weightList = keras_layer.get_weights() kernel_weight = etab.new_const(weightList[0].transpose([1, 0])) recurrent_weight = etab.new_const(weightList[1].transpose([1, 0])) -if keras_layer.use_bias: -in_bias = etab.new_const(weightList[2]) units = list(weightList[0].shape)[1] assert units > 0, "The value of units must be a positive integer" -in_data = _op.nn.batch_flatten(in_data) -ixh = _op.nn.dense(in_data, kernel_weight, units=units) if keras_layer.use_bias: -ixh = _op.nn.bias_add(ixh, bias=in_bias) -prev_op = _op.nn.batch_flatten(prev_op) -ixh2 = _op.nn.dense(prev_op, recurrent_weight, units=units) -output = ixh + ixh2 -output = _convert_activation(output, keras_layer, etab, data_layout) -out_shape = tuple(dim if dim else 1 for dim in _as_list(keras_layer.output_shape)[0]) -output = _op.reshape(output, newshape=out_shape) +in_bias = etab.new_const(weightList[2]) +assert len(in_data.type_annotation.shape) == 3 +timeDim = 
in_data.type_annotation.shape[1].value +in_data_split = _op.split(in_data, indices_or_sections=timeDim, axis=1) +for i in range(len(in_data_split)): +in_data_split_i = _op.nn.batch_flatten(in_data_split[i]) +ixh = _op.nn.dense(in_data_split_i, kernel_weight, units=units) +if keras_layer.use_bias: +ixh = _op.nn.bias_add(ixh, bias=in_bias) +ixh2 = _op.nn.dense(prev_op, recurrent_weight, units=units) +output = ixh + ixh2 +output = _convert_activation(output, keras_layer, etab, data_layout) +prev_op = output return [output, output] diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index 80460f6063..9d33b15a91 100644 --- a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -825,6 +825,16 @@ class TestKeras: ) verify_keras_frontend(dense_model, need_transpose=False) +def test_simplernn_with_infertype(self, keras_mod): +"""This test case is from https://github.com/apache/tvm/issues/14868""; +input_shape = (2, 2, 2) +x = keras_mod.layers.Input(shape=input_shape[1:], dtype="float32") +layer = keras_mod.layers.SimpleRNN(units=4) +y = layer(x) +model = keras_mod.models.Model(x, y) +mod, _ = relay.frontend.from_keras(model, {model.input_names[0]: input_shape}) +relay.transform.InferType()(mod) + if __name__ == "__main__": for k in [keras, tf_keras]: @@ -867,3 +877,4 @@ if __name__ == "__main__": sut.test_forward_repeat_vector(keras_mod=k) sut.test_forward_l2_normalize(keras_mod=k) sut.test_forward_time_distributed(keras_mod=k) +sut.test_simplernn_with_infertype(keras_mod=k)
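Editor's note on the _convert_simple_rnn change above: the new code splits the input along the time axis and applies the cell once per time step, feeding each output back in as the previous state. A minimal pure-Python sketch of that unrolled recurrence follows, assuming a tanh activation and weights stored as (units, in_dim) rows, as nn.dense expects. The helper names `dense` and `simple_rnn` are hypothetical and not part of TVM.

```python
import math


def dense(x, weight):
    """Multiply a vector x (in_dim) by weight rows shaped (units, in_dim)."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]


def simple_rnn(sequence, kernel, recurrent, bias=None, state=None):
    """Unrolled SimpleRNN over one sample: h_t = tanh(W x_t + b + U h_{t-1})."""
    units = len(kernel)
    h = list(state) if state is not None else [0.0] * units
    for x_t in sequence:                       # one iteration per time step
        ixh = dense(x_t, kernel)               # input contribution
        if bias is not None:
            ixh = [a + b for a, b in zip(ixh, bias)]
        ixh2 = dense(h, recurrent)             # recurrent contribution
        h = [math.tanh(a + b) for a, b in zip(ixh, ixh2)]
    return h


final_state = simple_rnn([[1.0], [2.0]], kernel=[[1.0]], recurrent=[[1.0]])
print(final_state)
```

The version removed by the diff flattened the whole input and applied a single dense layer, so the recurrent term was only ever applied to the initial state.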
[tvm] branch main updated: [Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d8136fbaa1 [Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715) d8136fbaa1 is described below commit d8136fbaa158da4fb8759b88eb6eda35e55e9e53 Author: Qingchao Shen AuthorDate: Tue Sep 12 14:58:23 2023 +0800 [Relay][Bugfix] Fix the wrong implementation about operator Threshold in oneflow (#15715) * [Relay][BugFix] fix the wrong implementation of Threshold in OneFlow * Update test_forward.py * Update oneflow.py * Update test_forward.py * Update oneflow.py * Update test_forward.py add version checking --- python/tvm/relay/frontend/oneflow.py | 15 +-- tests/python/frontend/oneflow/test_forward.py | 15 +++ 2 files changed, 24 insertions(+), 6 deletions(-) diff --git a/python/tvm/relay/frontend/oneflow.py b/python/tvm/relay/frontend/oneflow.py index 7a713e5e15..ddfa043caf 100644 --- a/python/tvm/relay/frontend/oneflow.py +++ b/python/tvm/relay/frontend/oneflow.py @@ -1025,15 +1025,17 @@ class Dropout(OneFlowOpConverter): return out -class ThresholdedRelu(OneFlowOpConverter): -"""Operator converter for ThresholdedRelu.""" +class Threshold(OneFlowOpConverter): +"""Operator converter for Threshold.""" @classmethod def _impl_v1(cls, inputs, attrs, params): -alpha = float(attrs.get("alpha", 1.0)) -alpha_tensor = _op.full_like(inputs[0], fill_value=_expr.const(alpha)) -mask = _op.greater(inputs[0], alpha_tensor).astype("float32") -return inputs[0] * mask +threshold = float(attrs.get("threshold_val", 1.0)) +threshold_tensor = _op.full_like(inputs[0], fill_value=_expr.const(threshold)) +value = float(attrs.get("value")) +value_tensor = _op.full_like(inputs[0], fill_value=_expr.const(value)) +mask = _op.greater(inputs[0], threshold_tensor) +return _op.where(mask, inputs[0], value_tensor) class 
Elu(OneFlowOpConverter): @@ -1425,6 +1427,7 @@ def get_convert_map(): "relu": Renamer("relu"), "leaky_relu": Renamer("leaky_relu"), "prelu": PReLU.get_converter(), +"threshold": Threshold.get_converter(), "selu": Selu.get_converter(), "silu": Silu.get_converter(), "gelu": Gelu.get_converter(), diff --git a/tests/python/frontend/oneflow/test_forward.py b/tests/python/frontend/oneflow/test_forward.py index 17583b3c25..7ddc347e86 100644 --- a/tests/python/frontend/oneflow/test_forward.py +++ b/tests/python/frontend/oneflow/test_forward.py @@ -24,6 +24,7 @@ import tvm import tvm.testing import tvm.topi.testing from tvm import relay +from packaging import version as package_version MODEL_HOME = "test_model" @@ -702,6 +703,15 @@ def test_activation(): x = x.softmax(dim=-1) return x +class Threshold(flow.nn.Module): +def __init__(self): +super().__init__() +self.active = flow.nn.Threshold(0.5, 0.2) + +def forward(self, x): +x = self.active(x) +return x + if os.path.exists(MODEL_HOME): rmdir(MODEL_HOME) @@ -738,6 +748,11 @@ def test_activation(): inputs=flow.tensor(np.random.rand(1, 12, 197, 197).astype(np.float32)), ) +# Threshold was introduced in the version 0.8.0 of oneflow +if package_version.parse(flow.__version__) >= package_version.parse("0.8.0"): +model14 = Threshold().eval() +verify_activation(model14, device="llvm") + @tvm.testing.uses_gpu def test_math():
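Editor's note on the Threshold fix above: the old converter multiplied the input by a 0/1 mask, so values at or below the threshold became 0. The new one uses where(mask, x, value), so they become the configured replacement value. A minimal pure-Python sketch of the two behaviours follows; the function names are hypothetical and the lists stand in for tensors.

```python
def thresholded_relu(xs, alpha=1.0):
    """Old behaviour: x * (x > alpha), so values at or below alpha become 0."""
    return [x * (1.0 if x > alpha else 0.0) for x in xs]


def threshold(xs, threshold_val, value):
    """New behaviour: keep x where x > threshold_val, otherwise emit `value`."""
    return [x if x > threshold_val else value for x in xs]


data = [0.1, 0.5, 0.9]
print(thresholded_relu(data, alpha=0.5))   # below-threshold entries are zeroed
print(threshold(data, 0.5, 0.2))           # below-threshold entries are replaced by 0.2
```

The test added by the commit uses flow.nn.Threshold(0.5, 0.2), which corresponds to the second call here.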
[tvm] branch main updated: Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new cd7d64e914 Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683) cd7d64e914 is described below commit cd7d64e914f763f1f3abc9105142a3f60580c31b Author: Haoyang AuthorDate: Fri Sep 8 15:28:00 2023 +0800 Fix a bug caused by PyTorch instance_norm when the input shape is [1,1,1,2] (#15683) * Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * Add a TODO * Add a TODO * Add a TODO * Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2] * Add a unit test * Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2] * Fix the Bug caused by torch's instance_norm when input data is [1, 1, 1, 2] * simplify the last fix --- python/tvm/relay/frontend/pytorch.py | 2 +- tests/python/frontend/pytorch/test_forward.py | 14 ++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index 683b94dd92..9ddd04b5b4 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -4424,7 +4424,7 @@ def _create_typed_const(data, dtype): dtype should be a TVM dtype""" if dtype == "float64": -typed_data = 
_expr.const(np.float64(data), dtype=dtype) +typed_data = _expr.const(np.asarray(data, dtype="float64"), dtype=dtype) elif dtype == "float32": typed_data = _expr.const(np.float32(data), dtype=dtype) elif dtype == "float16": diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index 8c1cdbb0cf..9ee03512e7 100644 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -3449,6 +3449,20 @@ def test_forward_adaptive_max_pool1d(): verify_model(m.float().eval(), input_data=input_data) +@tvm.testing.uses_gpu +def test_forward_instance_norm(): +"""test_forward_instance_norm""" + +class instance_norm(Module): +def forward(self, *args): +return torch.nn.functional.instance_norm(args[0], use_input_stats=True) + +m = instance_norm().float().eval() +input_data = torch.randn([1, 1, 1, 2], dtype=torch.float64) + +verify_model(m.float().eval(), input_data=input_data) + + @tvm.testing.uses_gpu def test_forward_full_like(): """test_forward_full_like"""
[tvm] branch main updated: [Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d75083cd97 [Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386) d75083cd97 is described below commit d75083cd97ede706338ab413dbc964009456d01b Author: Haoyang AuthorDate: Mon Sep 4 15:26:00 2023 +0800 [Relay] Fix an adaptive_max_pool1d operator conversion bug (#15386) * Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * add tests for Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * Fix an adaptive_max_pool1d operator conversion bug * Add a TODO * Add a TODO * Add a TODO --- python/tvm/relay/frontend/pytorch.py | 10 +- tests/python/frontend/pytorch/test_forward.py | 10 ++ 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py index 42765d2fb0..683b94dd92 100644 --- a/python/tvm/relay/frontend/pytorch.py +++ b/python/tvm/relay/frontend/pytorch.py @@ -4291,7 +4291,15 @@ class PyTorchOpConverter: self.current_op.pop() -return [_wrap_const(outputs[ret_name]) for ret_name in ret_names] +# TODO(@haoyang9804): outputs[ret_name] could be None and cause some issue +# revealed by https://github.com/apache/tvm/issues/15004 +# Now only adaptive_max_pool1d is considered. Maybe other ops could also +# trigger this problem. 
+return [ +_wrap_const(outputs[ret_name]) +for ret_name in ret_names +if ret_name != "aten::adaptive_max_pool1d_0_1" +] def _set_parameter_source_name(self, op_node, outputs): """A helper function to rewrite source_name of parameter.""" diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py index cb49e837fe..8c1cdbb0cf 100644 --- a/tests/python/frontend/pytorch/test_forward.py +++ b/tests/python/frontend/pytorch/test_forward.py @@ -3439,6 +3439,16 @@ def test_forward_full(): verify_model(Full2().float().eval(), input_data=[]) +@tvm.testing.uses_gpu +def test_forward_adaptive_max_pool1d(): +"""test_forward_adaptive_max_pool1d""" +torch.set_grad_enabled(False) +input_data = [torch.randn([2, 2, 4], dtype=torch.float32)] +m = torch.nn.AdaptiveMaxPool1d(3) + +verify_model(m.float().eval(), input_data=input_data) + + @tvm.testing.uses_gpu def test_forward_full_like(): """test_forward_full_like"""
[tvm] branch main updated: Fixed search task comment (#15535)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 2b718e5d00 Fixed search task comment (#15535)
2b718e5d00 is described below

commit 2b718e5d00948f56b880d14e13ab9fb3bad26a54
Author: Thais Camacho
AuthorDate: Mon Aug 14 11:13:45 2023 -0300

    Fixed search task comment (#15535)

    fixed search task comment
---
 python/tvm/auto_scheduler/search_task.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/tvm/auto_scheduler/search_task.py b/python/tvm/auto_scheduler/search_task.py
index 51a74dae93..767baf916d 100644
--- a/python/tvm/auto_scheduler/search_task.py
+++ b/python/tvm/auto_scheduler/search_task.py
@@ -576,7 +576,7 @@ class SearchTask(Object):
 # workload[0] is either the compute function name or the ComputeDAG hash.
 # The compute functions are already registered when importing TVM, so here
 # we only register the ComputeDAG workloads. If the same workload has
-# already been registered, the later registration overrides the prvious one.
+# already been registered, the later registration overrides the previous one.
 if workload[0] not in WORKLOAD_FUNC_REGISTRY:
 register_workload_tensors(state["workload_key"], state["compute_dag"].tensors)
[tvm] branch main updated: [Fix] Fix the typo in compile flag (#15542)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 482b3c2c99 [Fix] Fix the typo in compile flag (#15542)
482b3c2c99 is described below

commit 482b3c2c998e345f7daffa7bfff099898dc7b4c6
Author: Haolin Zhang <44371223+chrisz...@users.noreply.github.com>
AuthorDate: Mon Aug 14 07:12:39 2023 -0700

    [Fix] Fix the typo in compile flag (#15542)

    Fix the compile flag typo from --shared to -shared in function _linux_compile
---
 python/tvm/contrib/cc.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/tvm/contrib/cc.py b/python/tvm/contrib/cc.py
index ce02a132e2..b69c925411 100644
--- a/python/tvm/contrib/cc.py
+++ b/python/tvm/contrib/cc.py
@@ -280,7 +280,7 @@ def _linux_compile(output, objects, options, compile_cmd, compile_shared=False):
 cmd += ["-c"]
 else:
 if compile_shared or output.endswith(".so") or output.endswith(".dylib"):
-cmd += ["--shared"]
+cmd += ["-shared"]
 cmd += ["-o", output]
 if isinstance(objects, str):
 cmd += [objects]
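Editor's note on the flag fix above: a simplified sketch of how a link command list like the one in _linux_compile might be assembled, with -shared added when a shared library is requested. This is not the TVM function: `link_command` and its `compiler` default are hypothetical, and the branch that emits -c (whose condition is not visible in the hunk) is left out.

```python
def link_command(output, objects, compiler="g++", compile_shared=False):
    """Assemble a linker command list; add -shared for shared-library outputs."""
    cmd = [compiler]
    if compile_shared or output.endswith(".so") or output.endswith(".dylib"):
        cmd += ["-shared"]          # single-dash spelling, as in the fixed code
    cmd += ["-o", output]
    if isinstance(objects, str):    # accept one object file or a list of them
        cmd += [objects]
    else:
        cmd += list(objects)
    return cmd


print(link_command("libdemo.so", ["a.o", "b.o"]))
print(link_command("demo", "main.o"))
```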
[tvm] branch main updated: [Relay] Disable exception for ADT in mixed precision pass (#15533)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 927df59662 [Relay] Disable exception for ADT in mixed precision pass (#15533) 927df59662 is described below commit 927df5966237f10978319044716d93c90bf8843c Author: Egor Churaev AuthorDate: Mon Aug 14 10:32:50 2023 +0300 [Relay] Disable exception for ADT in mixed precision pass (#15533) If topology contains while loop and we want to transform it to mixed precision then we get an exception that "ADT are not supported for mixed precision pass". It happens, because while loop implemented as a lambda which is assigned to a VarNode. In this commit I changed the behavior of ToMixedPrecision pass and instead of generating exception, it just do nothing. Correspondent regression test is added. --- src/relay/transforms/to_mixed_precision.cc| 9 --- tests/python/relay/test_to_mixed_precision.py | 35 ++- 2 files changed, 39 insertions(+), 5 deletions(-) diff --git a/src/relay/transforms/to_mixed_precision.cc b/src/relay/transforms/to_mixed_precision.cc index 4638ee5477..5026b1bcba 100644 --- a/src/relay/transforms/to_mixed_precision.cc +++ b/src/relay/transforms/to_mixed_precision.cc @@ -350,10 +350,11 @@ class MixedPrecisionPass : public MixedModeMutator { // TODO(AndrewZhaoLuo): Support ADTs // Relay's algebraic data types are not supported yet. 
-ICHECK(!cur_op.as() // used to declare functions for recursion - && !cur_op.as() // constructing ADT types - && !cur_op.as()) // used for calling recursive functions -<< "Algebraic Data Types (ADT) are not supported yet for mixed precision pass."; +bool isADT = (cur_op.as() // used to declare functions for recursion + || cur_op.as() // constructing ADT types + || cur_op.as() // used for binding lambdas + || cur_op.as());// used for calling recursive functions +if (isADT) return post; // Get info on the operation being called: // conversion category (int), accumulation dtype (str), output dtype (str) diff --git a/tests/python/relay/test_to_mixed_precision.py b/tests/python/relay/test_to_mixed_precision.py index a802eee6d6..4c97642498 100644 --- a/tests/python/relay/test_to_mixed_precision.py +++ b/tests/python/relay/test_to_mixed_precision.py @@ -49,7 +49,6 @@ def verify_mixed_precision_output_close( atol: float = 0, keep_orig_output_dtype=False, ) -> tvm.runtime.Module: - mod = InferType()(mod) result_fp32 = run_module(mod, mod_params) @@ -586,5 +585,39 @@ def test_clip_with_pre_op(target_precision): assert tvm.ir.structural_equal(expected_mod, output_mod) +def test_loop(target_precision): +i = relay.var("i", shape=(), dtype="int32") +st = relay.var("st", shape=(relay.Any(), 1), dtype="int32") + +def int32(val): +return relay.const(val, "int32") + +def _cond(i, st): +return relay.op.min(relay.op.less(i, int32(10))) + +def _body(i, st): +i_vec = relay.op.reshape(i, (1, 1)) +ret = relay.op.concatenate([st, i_vec], axis=0) +return i + int32(1), ret + +loop = relay.loops.while_loop(_cond, [i, st], _body) +start = relay.var("start", shape=(), dtype="int32") +body = loop(start, relay.op.reshape(relay.const(0), newshape=(1, 1))) +func = relay.Function([start], relay.TupleGetItem(body, 1)) +mod = tvm.IRModule() +mod["main"] = func + +mod_params = { +"start": np.random.uniform(-1, 1, size=()).astype("int32"), +} +output_mod = verify_mixed_precision_output_close( +mod, 
mod_params, mixed_precision_dtype=target_precision, atol=0.01, rtol=0.01 +) + +# Create expected module +expected_mod = InferType()(mod) +assert tvm.ir.structural_equal(expected_mod, output_mod) + + if __name__ == "__main__": tvm.testing.main()
[tvm] branch main updated: [Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 0e905aa4d7 [Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461) 0e905aa4d7 is described below commit 0e905aa4d755cbaeb71cd1fe979b91434177b256 Author: Egor Churaev AuthorDate: Thu Aug 3 08:27:34 2023 +0300 [Relay] Stop ToMixedPrecision when constant is out of dtype range (#15461) * [Relay] Stop ToMixedPrecision when constant is out of dtype range In some layers, e.g. Clip, we might have a compilation error in the case when operation takes on the input a constant which is out of target data type range. To prevent such situation, a new method was introduced. It compares values of constant attributes with the range of the target data type. In case if the value is out of range then float32 will be used. * Fix lint --- src/relay/transforms/to_mixed_precision.cc| 41 -- tests/python/relay/test_to_mixed_precision.py | 49 +++ 2 files changed, 88 insertions(+), 2 deletions(-) diff --git a/src/relay/transforms/to_mixed_precision.cc b/src/relay/transforms/to_mixed_precision.cc index 820bc6e58e..4638ee5477 100644 --- a/src/relay/transforms/to_mixed_precision.cc +++ b/src/relay/transforms/to_mixed_precision.cc @@ -31,6 +31,7 @@ #include +#include "../../support/scalars.h" #include "pattern_utils.h" namespace tvm { @@ -110,6 +111,39 @@ class MixedPrecisionPass : public MixedModeMutator { std::vector original_dtype_; bool keep_orig_output_dtype_; + /*! \brief If some of the constant attributes are out of mixed_precision_type_ bounds, then + * computation cannot be performed in mixed precision. 
*/ + bool IsMixedPrecisionApplicableToAttrs(const Attrs& attrs) const { +if (attrs.get() != nullptr) { + double min_bound; + double max_bound; + if (mixed_precision_type_.is_float16()) { +min_bound = -support::kMaxFloat16; +max_bound = support::kMaxFloat16; + } else if (mixed_precision_type_.is_bfloat16()) { +min_bound = -support::kMaxBFloat16; +max_bound = support::kMaxBFloat16; + } else if (mixed_precision_type_.is_float8()) { +double bound = (mixed_precision_type_.code() == DataType::kE4M3Float) ? support::kMaxE4M3 + : support::kMaxE5M2; +min_bound = -bound; +max_bound = bound; + } else if (mixed_precision_type_.is_float()) { +min_bound = std::numeric_limits::lowest(); +max_bound = std::numeric_limits::max(); + } else { +return true; + } + + if (auto cur_attrs = attrs.as()) { +if (cur_attrs->a_min < min_bound || cur_attrs->a_max > max_bound) { + return false; +} + } +} +return true; + } + Attrs GetNewAttrs(const CallNode* call, const DataType& accumulation_dtype) const { /* If the accumulation dtype is in the attributes make a copy and mutate the field. */ Attrs cur_attrs = call->attrs; @@ -382,9 +416,12 @@ class MixedPrecisionPass : public MixedModeMutator { all_args_mixed_type_compatible ? MIXED_PRECISION_ALWAYS : MIXED_PRECISION_NEVER; } +bool is_mixed_precision_applicable = +static_cast(final_category == MIXED_PRECISION_ALWAYS && + IsMixedPrecisionApplicableToAttrs(pre_call_node->attrs)); // Create the new arguments to the call. DataType wanted_arg_dtypes = -final_category == MIXED_PRECISION_ALWAYS ? mixed_precision_type_ : DataType::Float(32); +is_mixed_precision_applicable ? mixed_precision_type_ : DataType::Float(32); auto call_args_and_types = CastAllArgs(post_call_node->args, cur_arg_types, wanted_arg_dtypes); Array new_args = call_args_and_types.first; Array new_arg_types; @@ -397,7 +434,7 @@ class MixedPrecisionPass : public MixedModeMutator { } // Finally create the new attributes. 
-if (final_category == MIXED_PRECISION_ALWAYS) { +if (is_mixed_precision_applicable) { Attrs new_attrs = GetNewAttrs(pre_call_node, accumulation_dtype); Expr output = Call(cur_op, new_args, new_attrs, new_arg_types, pre_call_node->span); if (accumulation_dtype != output_dtype) { diff --git a/tests/python/relay/test_to_mixed_precision.py b/tests/python/relay/test_to_mixed_precision.py index 771d366df0..a802eee6d6 100644 --- a/tests/python/relay/test_to_mixed_precision.py +++ b/tests/python/relay/test_to_mixed_precision.py @@ -537,5 +537,54 @@ def test_convert_follow_node_with_integer_arguments(target_precision): assert tvm.ir.structural_equal(expected_mod, output_mod) +def test
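Editor's note on the out-of-range check above: the pass now keeps an op in float32 when its constant attributes (for Clip, a_min and a_max) do not fit the target dtype. A minimal pure-Python sketch of that range test for float16 follows, assuming the bound is 65504, the largest finite IEEE half-precision value. The names `FLOAT16_MAX`, `clip_fits_dtype` and `pick_dtype` are hypothetical and not part of TVM.

```python
FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def clip_fits_dtype(a_min, a_max, max_bound=FLOAT16_MAX):
    """Return True when both clip bounds are representable within [-max_bound, max_bound]."""
    return not (a_min < -max_bound or a_max > max_bound)


def pick_dtype(a_min, a_max, mixed_dtype="float16"):
    """Fall back to float32 when the clip bounds exceed the mixed-precision range."""
    return mixed_dtype if clip_fits_dtype(a_min, a_max) else "float32"


print(pick_dtype(0.0, 6.0))        # bounds fit, so mixed precision can be used
print(pick_dtype(-1e5, 1e5))       # bounds exceed float16, so float32 is kept
```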
[tvm] branch main updated: [Codegen][Metal] Support metal warp-level primitive (#15401)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 22ec541a6f [Codegen][Metal] Support metal warp-level primitive (#15401) 22ec541a6f is described below commit 22ec541a6f12a6430246e4d3aed47c5cfc462b1e Author: Ruihang Lai AuthorDate: Wed Jul 26 14:03:40 2023 -0700 [Codegen][Metal] Support metal warp-level primitive (#15401) This PR introduces the warp-level shuffle primitives used in Metal Shading Language, and uses them in the implementation of allreduce lowering. The introduced primitives are: * `simd_shuffle`, * `simd_shuffle_up`, * `simd_shuffle_down`. See section 6.9.2 of https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf for details. The correctness are validated by `test_allreduce_cuda` with the backend changed to Metal. Given we do not have Metal CI tests, the correctness is checked only locally. Given the Metal shuffle primitives do not support (or need) masking, the pass LowerThreadAllreduce is updated to support such backend which does not have masks. One unit test for metal is added to ensure that no mask is used. 
--- src/target/source/intrin_rule_metal.cc | 53 +++ src/tir/transforms/lower_thread_allreduce.cc | 35 --- .../test_tir_transform_lower_thread_all_reduce.py | 103 + 3 files changed, 180 insertions(+), 11 deletions(-) diff --git a/src/target/source/intrin_rule_metal.cc b/src/target/source/intrin_rule_metal.cc index dd924b9255..cc83eb1462 100644 --- a/src/target/source/intrin_rule_metal.cc +++ b/src/target/source/intrin_rule_metal.cc @@ -30,6 +30,28 @@ namespace codegen { namespace intrin { using tir::FLowerIntrinsic; +struct MetalWarpIntrinsic { + const Op operator()(DataType t, const Op& orig_op) const { +if (orig_op.same_as(builtin::tvm_warp_shuffle())) { + return Op::Get("tir.metal.simd_shuffle"); +} else if (orig_op.same_as(builtin::tvm_warp_shuffle_up())) { + return Op::Get("tir.metal.simd_shuffle_up"); +} else { + ICHECK(orig_op.same_as(builtin::tvm_warp_shuffle_down())); + return Op::Get("tir.metal.simd_shuffle_down"); +} + } +}; + +template +static PrimExpr DispatchMetalShuffle(const PrimExpr& e) { + const CallNode* call = e.as(); + ICHECK(call != nullptr); + ICHECK_EQ(call->args.size(), 5); // mask, value, warp_id, width, warp_size + Array metal_args{{call->args[1], call->args[2]}}; + return Call(call->dtype, T()(call->dtype, Downcast(call->op)), metal_args); +} + TVM_REGISTER_OP("tir.floor") .set_attr("metal.FLowerIntrinsic", DispatchPureExtern); @@ -95,6 +117,37 @@ TVM_REGISTER_OP("tir.cosh") TVM_REGISTER_OP("tir.erf").set_attr("metal.FLowerIntrinsic", DispatchFastErf); +TVM_REGISTER_OP("tir.tvm_warp_shuffle") +.set_attr("metal.FLowerIntrinsic", DispatchMetalShuffle); + +TVM_REGISTER_OP("tir.tvm_warp_shuffle_up") +.set_attr("metal.FLowerIntrinsic", DispatchMetalShuffle); + +TVM_REGISTER_OP("tir.tvm_warp_shuffle_down") +.set_attr("metal.FLowerIntrinsic", DispatchMetalShuffle); + +// Register low-level builtin ops. 
+TVM_REGISTER_OP("tir.metal.simd_shuffle") +.set_num_inputs(2) +.add_argument("var", "Expr", "The variable to sync.") +.add_argument("lane", "Expr", "The source thread id.") +.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle") +.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque)); + +TVM_REGISTER_OP("tir.metal.simd_shuffle_up") +.set_num_inputs(2) +.add_argument("var", "Expr", "The variable to sync.") +.add_argument("delta", "Expr", "The source lane id offset to be added.") +.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle_up") +.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque)); + +TVM_REGISTER_OP("tir.metal.simd_shuffle_down") +.set_num_inputs(2) +.add_argument("var", "Expr", "The variable to sync.") +.add_argument("delta", "Expr", "The source lane id offset to be subtracted.") +.set_attr<TGlobalSymbol>("TGlobalSymbol", "simd_shuffle_down") +.set_attr<TCallEffectKind>("TCallEffectKind", Integer(CallEffectKind::kOpaque)); + } // namespace intrin } // namespace codegen } // namespace tvm diff --git a/src/tir/transforms/lower_thread_allreduce.cc b/src/tir/transforms/lower_thread_allreduce.cc index 438dccff0b..fb
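To illustrate how a mask-free shuffle-down primitive can drive an allreduce, here is a small pure-Python simulation. It is only a sketch and not the code TVM generates: `shuffle_down` and `warp_allreduce_sum` are hypothetical names, the lane count is assumed to be a power of two, and lanes whose source would fall outside the group are assumed to keep their own value.

```python
def shuffle_down(values, delta):
    """Simulate a shuffle-down across one SIMD group.

    Lane i reads the value held by lane i + delta. Assumption for this
    sketch: a lane whose source is out of range keeps its own value.
    """
    n = len(values)
    return [values[i + delta] if i + delta < n else values[i] for i in range(n)]


def warp_allreduce_sum(values):
    """Tree reduction using only shuffle-down; no lane mask is involved.

    Assumes len(values) is a power of two. Only lane 0 is guaranteed to
    hold the full sum at the end, so that is the value returned.
    """
    vals = list(values)
    offset = len(vals) // 2
    while offset > 0:
        shifted = shuffle_down(vals, offset)
        vals = [a + b for a, b in zip(vals, shifted)]
        offset //= 2
    return vals[0]


total = warp_allreduce_sum(list(range(32)))
```

Each step halves the offset, so a 32-lane group needs five shuffle steps. The simulated primitive takes only a value and an offset, which mirrors how the dispatch rule in the diff forwards just `args[1]` and `args[2]` and drops the mask.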
[tvm] branch main updated: [Bugfix][Frontend][Keras]Fix a corner case bug in softmax converter of keras frontend (#15337)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new e2d6511161 [Bugfix][Frontend][Keras]Fix a corner case bug in softmax converter of keras frontend (#15337) e2d6511161 is described below commit e2d65111616dfa95797c0dd7e082e4050b71701d Author: Qingchao Shen AuthorDate: Tue Jul 18 13:02:34 2023 +0800 [Bugfix][Frontend][Keras]Fix a corner case bug in softmax converter of keras frontend (#15337) * Fix softmax converter about keras * add new test cases to capture the bug * Update keras.py --- python/tvm/relay/frontend/keras.py | 6 -- tests/python/frontend/keras/test_forward.py | 7 +++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index 1913d4a268..aba4160695 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -131,11 +131,13 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, data_layout, input_s if act_type == "Softmax": axis = keras_layer.axis -dims = len(input_shape) +dims = len(input_shape) if input_shape else 0 if isinstance(axis, list): raise tvm.error.OpAttributeUnImplemented(f"Softmax with axes {axis} is not supported.") if data_layout == "NCHW": -if axis == -1: +if dims == 0: +axis = 0 +elif axis == -1: axis = 1 else: axis = axis + 1 if axis < dims - 1 else 1 diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index 50a0e98505..53e2ca8dbe 100644 --- a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -229,6 +229,13 @@ class TestKeras: keras_model = keras_mod.models.Model(data, x) verify_keras_frontend(keras_model) verify_keras_frontend(keras_model, need_transpose=False, layout="NHWC") +# Test the input dimension = 1 +data = 
keras_mod.layers.Input(shape=(11,)) +act_func = keras_mod.layers.Softmax() +x = act_func(data) +keras_model = keras_mod.models.Model(data, x) +verify_keras_frontend(keras_model) +verify_keras_frontend(keras_model, need_transpose=False, layout="NHWC") def test_forward_activations_except(self, keras_mod): """
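The axis handling in the patched Softmax converter can be restated as a standalone function. This is a sketch based only on the hunk above: `remap_softmax_axis` is a hypothetical name, `ValueError` stands in for `tvm.error.OpAttributeUnImplemented`, and returning the axis unchanged for layouts other than NCHW is an assumption, since that branch is not shown in the diff.

```python
def remap_softmax_axis(axis, input_shape, data_layout):
    """Mirror the axis remapping from the patched converter (sketch)."""
    # The fix: tolerate a missing or empty input_shape instead of calling len() on it.
    dims = len(input_shape) if input_shape else 0
    if isinstance(axis, list):
        # Stand-in for tvm.error.OpAttributeUnImplemented.
        raise ValueError(f"Softmax with axes {axis} is not supported.")
    if data_layout == "NCHW":
        if dims == 0:
            return 0
        if axis == -1:
            return 1
        return axis + 1 if axis < dims - 1 else 1
    # Assumption: other layouts keep the Keras axis as-is.
    return axis


no_shape = remap_softmax_axis(-1, None, "NCHW")
last_axis = remap_softmax_axis(-1, (32, 32, 3), "NCHW")
```

Before the patch, a missing `input_shape` would have failed at `len(input_shape)`; the new `dims == 0` branch gives that case a defined axis.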
[tvm] branch main updated: Fix keras version problem (#15265)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new b6502f4e27 Fix keras version problem (#15265) b6502f4e27 is described below commit b6502f4e278da391719155936aeefb6544115c1f Author: Shikamaru:) AuthorDate: Fri Jul 14 18:20:16 2023 +0800 Fix keras version problem (#15265) * Fix keras version problem * Fix keras version problem * Fix keras version problem * Fix keras version problem * Fix keras version problem * Fix keras version problem * Fix keras version problem --- python/tvm/relay/frontend/keras.py | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index 0d932cadcc..1913d4a268 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -1526,12 +1526,19 @@ def from_keras(model, shape=None, layout="NCHW"): raise ValueError("Keras frontend currently supports tensorflow backend only.") if keras.backend.image_data_format() != "channels_last": raise ValueError("Keras frontend currently supports data_format = channels_last only.") -expected_model_class = keras.engine.training.Model -if hasattr(keras.engine, "InputLayer"): -input_layer_class = keras.engine.InputLayer +try: +import keras.engine as E +except ImportError: +try: +import keras.src.engine as E +except ImportError: +raise ImportError("Cannot find Keras's engine") +expected_model_class = E.training.Model +if hasattr(E, "InputLayer"): +input_layer_class = E.InputLayer else: # TFlite >=2.6 -input_layer_class = keras.engine.input_layer.InputLayer +input_layer_class = E.input_layer.InputLayer else: # Importing from Tensorflow Keras (tf.keras) try:
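The nested try/except in this patch is an instance of a general "first importable module wins" pattern. A minimal sketch with the standard library follows; `import_first_available` is a hypothetical helper and is not part of TVM or Keras.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Try the next candidate location.
            continue
    raise ImportError(f"Cannot find any of: {', '.join(candidates)}")


# Demonstration with a module name that does not exist and a stdlib fallback.
mod = import_first_available(["no_such_module_for_demo", "json"])
```

With a helper like this, the lookup in the patch would presumably reduce to passing `["keras.engine", "keras.src.engine"]`, and a further module location would be one more list entry rather than another nesting level.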
[tvm] branch main updated: [CMAKE] Add Vulkan header for Android (#15229)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 23fb568521 [CMAKE] Add Vulkan header for Android (#15229) 23fb568521 is described below commit 23fb56852122919a2ea896000d87be163e2b1bb5 Author: Ruihang Lai AuthorDate: Tue Jul 4 22:41:42 2023 -0700 [CMAKE] Add Vulkan header for Android (#15229) This PR enhances the Vulkan search for Android NDK, so that we can find the Vulkan header for Android builds. Co-authored-by: tqchen --- cmake/utils/FindVulkan.cmake | 57 +--- 1 file changed, 32 insertions(+), 25 deletions(-) diff --git a/cmake/utils/FindVulkan.cmake b/cmake/utils/FindVulkan.cmake index a2a6fb6035..032ff1dffa 100644 --- a/cmake/utils/FindVulkan.cmake +++ b/cmake/utils/FindVulkan.cmake @@ -37,36 +37,43 @@ macro(find_vulkan use_vulkan use_khronos_spirv) if(IS_DIRECTORY ${__use_vulkan}) set(__vulkan_sdk ${__use_vulkan}) message(STATUS "Custom Vulkan SDK PATH=" ${__use_vulkan}) - elseif(IS_DIRECTORY $ENV{VULKAN_SDK}) - set(__vulkan_sdk $ENV{VULKAN_SDK}) - else() - set(__vulkan_sdk "") - endif() + elseif(IS_DIRECTORY $ENV{VULKAN_SDK}) +set(__vulkan_sdk $ENV{VULKAN_SDK}) + else() +set(__vulkan_sdk "") + endif() - if(IS_DIRECTORY ${use_khronos_spirv}) - set(__use_khronos_spirv ${use_khronos_spirv}) - message(STATUS "Custom khronos spirv PATH=" ${__use_khronos_spirv}) - else() - set(__use_khronos_spirv "") - endif() + if(IS_DIRECTORY ${use_khronos_spirv}) +set(__use_khronos_spirv ${use_khronos_spirv}) +message(STATUS "Custom khronos spirv PATH=" ${__use_khronos_spirv}) + else() +set(__use_khronos_spirv "") + endif() + if(CMAKE_SYSTEM_NAME STREQUAL "Android") +set(VULKAN_NDK_SRC ${CMAKE_ANDROID_NDK}/sources/third_party/vulkan/src) +set(Vulkan_INCLUDE_DIRS ${VULKAN_NDK_SRC}/include) +set(Vulkan_FOUND TRUE) +message(STATUS "Android Vulkan_INCLUDE_DIRS=" ${Vulkan_INCLUDE_DIRS}) 
+message(STATUS "Skip finding SPIRV in Android, make sure you only build tvm runtime.") +return() + endif() - if(__vulkan_sdk) - set(Vulkan_INCLUDE_DIRS ${__vulkan_sdk}/include) - find_library(Vulkan_LIBRARY NAMES vulkan vulkan-1 PATHS ${__vulkan_sdk}/lib) - if(Vulkan_LIBRARY) - set(Vulkan_FOUND TRUE) - endif() - endif(__vulkan_sdk) + if(__vulkan_sdk) +set(Vulkan_INCLUDE_DIRS ${__vulkan_sdk}/include) +find_library(Vulkan_LIBRARY NAMES vulkan vulkan-1 PATHS ${__vulkan_sdk}/lib) +if(Vulkan_LIBRARY) + set(Vulkan_FOUND TRUE) +endif() + endif(__vulkan_sdk) - # resort to find vulkan of option is on - if(NOT Vulkan_FOUND) - if(${__use_vulkan} MATCHES ${IS_TRUE_PATTERN}) - find_package(Vulkan QUIET) - endif() - endif() - # additional libraries + # resort to find vulkan of option is on + if(NOT Vulkan_FOUND) +if(${__use_vulkan} MATCHES ${IS_TRUE_PATTERN}) + find_package(Vulkan QUIET) +endif() + endif() if(Vulkan_FOUND) get_filename_component(VULKAN_LIBRARY_PATH ${Vulkan_LIBRARY} DIRECTORY)
[tvm] branch main updated: [Test] Improve check for TVMError exception in test_cast (#15138)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 2c48d7bf21 [Test] Improve check for TVMError exception in test_cast (#15138) 2c48d7bf21 is described below commit 2c48d7bf21cbea2792aebf58e8c9ac085b4cdd38 Author: Krzysztof Parzyszek AuthorDate: Thu Jun 22 06:32:31 2023 -0500 [Test] Improve check for TVMError exception in test_cast (#15138) The original code has ``` with pytest.raises(tvm.error.TVMError): s.astype("int") assert "Can't cast a handle to other types" in str(e) ``` The `astype` is expected to throw a TVMError exception, but if it does, the following assertion never gets executed. Add try/except to the test to verify that the expected exception is thrown. --- tests/python/unittest/test_tir_nodes.py | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tests/python/unittest/test_tir_nodes.py b/tests/python/unittest/test_tir_nodes.py index 7826b5960b..49816778f1 100644 --- a/tests/python/unittest/test_tir_nodes.py +++ b/tests/python/unittest/test_tir_nodes.py @@ -118,9 +118,12 @@ def test_cast(): assert z.lanes == 4 s = tvm.tir.StringImm("s") -with pytest.raises(tvm.error.TVMError) as cm: -s.astype("int") -assert "Can't cast a handle to other types" in str(cm.execption) +with pytest.raises(tvm.error.TVMError): +try: +s.astype("int") +except Exception as e: +assert "Can't cast a handle to other types" in str(e) +raise def test_attr():
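The pitfall described in this commit message can be reproduced without TVM or pytest. The sketch below uses `unittest.TestCase.assertRaises`, whose context-manager behaviour is analogous to `pytest.raises`; `HandleCastError` and `astype_stub` are hypothetical stand-ins for `tvm.error.TVMError` and `StringImm.astype`.

```python
import unittest


class HandleCastError(Exception):
    """Hypothetical stand-in for tvm.error.TVMError."""


def astype_stub(dtype):
    """Hypothetical stand-in for a cast that always fails."""
    raise HandleCastError("Can't cast a handle to other types")


def statements_reached_after_raise():
    """Show that code placed after the raising call inside the block is skipped."""
    reached = []
    case = unittest.TestCase()
    with case.assertRaises(HandleCastError):
        astype_stub("int")
        reached.append("after")  # never executed: the exception exits the block
    return reached


def captured_message():
    """Check the message after the block, using the captured exception."""
    case = unittest.TestCase()
    with case.assertRaises(HandleCastError) as cm:
        astype_stub("int")
    return str(cm.exception)


skipped = statements_reached_after_raise()
message = captured_message()
```

The second function is an alternative to the try/except used in the patch: the context manager stores the exception, so the message check can sit after the `with` block, where it is guaranteed to run.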
[tvm] branch main updated: [Bugfix][Relay][Keras] Fix the wrong implementation logic about cropping2D (#15053)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 90b5acca59 [Bugfix][Relay][Keras] Fix the wrong implementation logic about cropping2D (#15053) 90b5acca59 is described below commit 90b5acca59421d67494955c4fc943a5ddbbc7373 Author: Qingchao Shen AuthorDate: Thu Jun 15 17:46:37 2023 +0800 [Bugfix][Relay][Keras] Fix the wrong implementation logic about cropping2D (#15053) * fix the wrong calculation logic of cropping2d The implementation of cropping2D is wrong. This PR fixes it. * add a test case to capture the bug * Update test_forward.py * Update test_forward.py * correct the patch * Update keras.py * Update test_forward.py * Update test_forward.py * Update test_forward.py --- python/tvm/relay/frontend/keras.py | 10 -- tests/python/frontend/keras/test_forward.py | 10 +- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index d963a5d160..8d437027e5 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -816,10 +816,16 @@ def _convert_cropping( f"Operator {crop_type} is not supported for frontend Keras."
) int32_max = np.iinfo(np.int32).max +if data_layout == "NHWC": +begin = [0, crop_t, crop_l, 0] +end = [int32_max, in_h - crop_b, in_w - crop_r, int32_max] +else: +begin = [0, 0, crop_t, crop_l] +end = [int32_max, int32_max, in_h - crop_b, in_w - crop_r] return _op.strided_slice( inexpr, -begin=[0, 0, crop_t, crop_l], -end=[int32_max, int32_max, in_h - crop_b, in_w - crop_r], +begin=begin, +end=end, ) diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index 45935f87f4..cc6421614e 100644 --- a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -449,7 +449,15 @@ class TestKeras: x = keras_mod.layers.Cropping2D(cropping=0)(x) x = keras_mod.layers.Add()([x, x]) keras_model = keras_mod.models.Model(data, x) -verify_keras_frontend(keras_model) +verify_keras_frontend(keras_model, layout="NHWC") +verify_keras_frontend(keras_model, layout="NHWC") + +data = keras_mod.layers.Input(shape=(32, 32, 3)) +x = keras_mod.layers.Cropping2D(cropping=(2, 1))(data) +x = keras_mod.layers.Cropping2D(cropping=(1, 2))(x) +keras_model = keras_mod.models.Model(data, x) +verify_keras_frontend(keras_model, layout="NHWC") +verify_keras_frontend(keras_model, layout="NCHW") def test_forward_multi_inputs(self, keras_mod): data1 = keras_mod.layers.Input(shape=(32, 32, 3))
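The layout-dependent slice bounds from this fix can be shown in isolation. The following is a sketch only: `crop_bounds` is a hypothetical helper, and the constant replaces `np.iinfo(np.int32).max` so the example needs no third-party imports.

```python
INT32_MAX = 2**31 - 1  # same value as np.iinfo(np.int32).max


def crop_bounds(in_h, in_w, crop_t, crop_b, crop_l, crop_r, data_layout):
    """Return (begin, end) for a strided slice that crops H and W."""
    if data_layout == "NHWC":
        # Spatial axes are 1 and 2; batch and channel are left untouched.
        begin = [0, crop_t, crop_l, 0]
        end = [INT32_MAX, in_h - crop_b, in_w - crop_r, INT32_MAX]
    else:
        # NCHW: spatial axes are 2 and 3.
        begin = [0, 0, crop_t, crop_l]
        end = [INT32_MAX, INT32_MAX, in_h - crop_b, in_w - crop_r]
    return begin, end


nhwc = crop_bounds(32, 32, 2, 2, 1, 1, "NHWC")
nchw = crop_bounds(32, 32, 2, 2, 1, 1, "NCHW")
```

Before the patch the NCHW bounds were used unconditionally, so for an NHWC tensor the crop would have been applied to the width and channel axes instead of height and width.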
[tvm] branch main updated: [Bugfix][Relay][Keras] Fix UpSampling2D about the wrong assertion about size (#15082)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 081cc2ef64 [Bugfix][Relay][Keras] Fix UpSampling2D about the wrong assertion about size (#15082) 081cc2ef64 is described below commit 081cc2ef64c866f6dc2e33dbf6912395d09ca749 Author: Qingchao Shen AuthorDate: Wed Jun 14 17:13:19 2023 +0800 [Bugfix][Relay][Keras] Fix UpSampling2D about the wrong assertion about size (#15082) * fix wrong assertion about upsample in keras.py * Update test_forward.py * Update test_forward.py --- python/tvm/relay/frontend/keras.py | 4 +--- tests/python/frontend/keras/test_forward.py | 5 + 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py index 16192617fe..d963a5d160 100644 --- a/python/tvm/relay/frontend/keras.py +++ b/python/tvm/relay/frontend/keras.py @@ -767,10 +767,8 @@ def _convert_upsample( params["scale_h"] = h elif upsample_type == "UpSampling2D": h, w = keras_layer.size -if h != w: -raise tvm.error.OpAttributeInvalid("Height must equal width for operator Upsample.") params["scale_h"] = h -params["scale_w"] = h +params["scale_w"] = w if hasattr(keras_layer, "interpolation"): interpolation = keras_layer.interpolation diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index debd50b37a..45935f87f4 100644 --- a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -389,6 +389,11 @@ class TestKeras: x = keras_mod.layers.UpSampling2D(size=(3, 3), interpolation=interpolation)(data) keras_model = keras_mod.models.Model(data, x) verify_keras_frontend(keras_model) +# Height and width are not equal for the attribute size +data = keras_mod.layers.Input(shape=(2, 1, 3)) +x = keras_mod.layers.UpSampling2D(size=(1, 2), 
interpolation=interpolation)(data) +keras_model = keras_mod.models.Model(data, x) +verify_keras_frontend(keras_model) def test_forward_reshape(self, keras_mod): """test_forward_reshape"""
[tvm] branch main updated: Revert "[CI] Make Graviton3 default AArch64 job runner node (#14983)" (#15065)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new f172f6c70d Revert "[CI] Make Graviton3 default AArch64 job runner node (#14983)" (#15065) f172f6c70d is described below commit f172f6c70d8e72556225877c7522ef13b7b250e2 Author: masahi AuthorDate: Fri Jun 9 15:08:14 2023 +0900 Revert "[CI] Make Graviton3 default AArch64 job runner node (#14983)" (#15065) This reverts commit 7dc1d6a99f7984855edf1a5c42b66215addeb13f. --- ci/jenkins/generated/arm_jenkinsfile.groovy| 38 +- ci/jenkins/templates/arm_jenkinsfile.groovy.j2 | 8 +++--- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/ci/jenkins/generated/arm_jenkinsfile.groovy b/ci/jenkins/generated/arm_jenkinsfile.groovy index 2eb6687bee..40211e4eb9 100644 --- a/ci/jenkins/generated/arm_jenkinsfile.groovy +++ b/ci/jenkins/generated/arm_jenkinsfile.groovy @@ -60,7 +60,7 @@ // 'python3 jenkins/generate.py' // Note: This timestamp is here to ensure that updates to the Jenkinsfile are // always rebased on main before merging: -// Generated at 2023-06-02T10:54:19.255872 +// Generated at 2023-05-05T13:39:06.460410 import org.jenkinsci.plugins.pipeline.modeldefinition.Utils // These are set at runtime from data in ci/jenkins/docker-images.yml, update @@ -577,17 +577,17 @@ def build(node_type) { } } try { -build('ARM-GRAVITON3-SPOT') +build('ARM-SMALL-SPOT') } catch (Exception ex) { -build('ARM-GRAVITON3') +build('ARM-SMALL') } -def shard_run_integration_aarch64_1_of_4(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_integration_aarch64_1_of_4(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' } node(node_type) { ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") { 
@@ -633,10 +633,10 @@ def shard_run_integration_aarch64_1_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema } } -def shard_run_integration_aarch64_2_of_4(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_integration_aarch64_2_of_4(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' } node(node_type) { ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") { @@ -682,10 +682,10 @@ def shard_run_integration_aarch64_2_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema } } -def shard_run_integration_aarch64_3_of_4(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_integration_aarch64_3_of_4(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' } node(node_type) { ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") { @@ -731,10 +731,10 @@ def shard_run_integration_aarch64_3_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema } } -def shard_run_integration_aarch64_4_of_4(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_integration_aarch64_4_of_4(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' } node(node_type) { ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") { @@ -782,10 +782,10 @@ def shard_run_integration_aarch64_4_of_4(node_type='ARM-GRAVITON3-SPOT', on_dema -def shard_run_topi_aarch64_1_of_2(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_topi_aarch64_1_of_2(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' } 
node(node_type) { ws("workspace/exec_${env.EXECUTOR_NUMBER}/tvm/ut-python-arm") { @@ -836,10 +836,10 @@ def shard_run_topi_aarch64_1_of_2(node_type='ARM-GRAVITON3-SPOT', on_demand=fals } } -def shard_run_topi_aarch64_2_of_2(node_type='ARM-GRAVITON3-SPOT', on_demand=false) { +def shard_run_topi_aarch64_2_of_2(node_type='ARM-SMALL-SPOT', on_demand=false) { if (!skip_ci && is_docs_only_build != 1) { if (on_demand==true || node_type.contains('ARM')) { -node_type = 'ARM-GRAVITON3' +node_type = 'ARM-SMALL' }
[tvm] branch main updated (80079b6493 -> 1366f2e250)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 80079b6493 Fix type parse error about AdaptiveMaxPool (#15016) add 1366f2e250 [OpenCLML] Reactor and introduce on chip memory and memory planner (#14922) No new revisions were added by this update. Summary of changes: CMakeLists.txt |2 +- apps/cpp_clml/clml_runner.cc | 36 +- apps/cpp_clml/clml_runner.h|2 +- apps/cpp_clml/scripts/clml_codegen.py |2 +- cmake/modules/OpenCL.cmake | 39 +- python/tvm/relay/op/contrib/clml.py| 165 ++-- src/runtime/contrib/clml/clml_memory_planner.cc| 268 + .../runtime/contrib/clml/clml_memory_planner.h | 21 +- src/runtime/contrib/clml/clml_runtime.cc | 1035 ++-- src/runtime/contrib/clml/clml_runtime.h| 185 src/runtime/contrib/clml/clml_utils.cc | 257 + src/runtime/contrib/clml/clml_utils.h | 77 ++ tests/cpp-runtime/opencl/clml_memory_planner.cc| 439 + tests/python/contrib/test_clml/infrastructure.py | 19 + tests/python/contrib/test_clml/test_ops.py | 110 ++- tests/scripts/task_build_adreno_bins.sh|4 +- 16 files changed, 2056 insertions(+), 605 deletions(-) create mode 100644 src/runtime/contrib/clml/clml_memory_planner.cc copy include/tvm/runtime/contrib/libtorch_runtime.h => src/runtime/contrib/clml/clml_memory_planner.h (65%) create mode 100644 src/runtime/contrib/clml/clml_runtime.h create mode 100644 src/runtime/contrib/clml/clml_utils.cc create mode 100644 src/runtime/contrib/clml/clml_utils.h create mode 100644 tests/cpp-runtime/opencl/clml_memory_planner.cc
[tvm] branch main updated: [DOCKER][ADRENO] Docker setup changes for multi user environments (#15003)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ca30b13668 [DOCKER][ADRENO] Docker setup changes for multi user environments (#15003) ca30b13668 is described below commit ca30b136686747e2817fb48b302dfa08eda935de Author: Siva AuthorDate: Thu Jun 1 21:13:25 2023 +0530 [DOCKER][ADRENO] Docker setup changes for multi user environments (#15003) Docker setup changes for multi-user environments. The target setup shouldn't conflict when multiple users share the same target device. libc++_shared.so may not be available from the system on some Android devices. --- tests/scripts/setup-adreno-env.sh | 13 + tests/scripts/task_python_adreno.sh | 13 + 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/tests/scripts/setup-adreno-env.sh b/tests/scripts/setup-adreno-env.sh index 44eb7edce8..55a92c5f61 100755 --- a/tests/scripts/setup-adreno-env.sh +++ b/tests/scripts/setup-adreno-env.sh @@ -90,15 +90,20 @@ case ${ENVIRONMENT} in def_environment export ANDROID_SERIAL=${ADB_SERIAL} -adb shell "mkdir -p /data/local/tmp/tvm_ci" -adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_ci/tvm_rpc_ci -adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_ci +TARGET_FOLDER=/data/local/tmp/tvm_ci-${USER} +CPP_LIB=`find ${ANDROID_NDK_HOME} -name libc++_shared.so | grep aarch64` +adb shell "mkdir -p ${TARGET_FOLDER}" +adb push build-adreno-target/tvm_rpc ${TARGET_FOLDER}/tvm_rpc-${USER} +adb push build-adreno-target/libtvm_runtime.so ${TARGET_FOLDER} +if [ -f ${CPP_LIB} ] ; then +adb push ${CPP_LIB} ${TARGET_FOLDER} +fi adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT} adb forward tcp:5000 tcp:5000 adb forward tcp:5001 tcp:5001 adb forward tcp:5002 tcp:5002 -adb shell "cd /data/local/tmp/tvm_ci; killall -9 tvm_rpc_ci; sleep 2; LD_LIBRARY_PATH=/data/local/tmp/tvm_ci/ ./tvm_rpc_ci 
server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" +adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" ;; "query") diff --git a/tests/scripts/task_python_adreno.sh b/tests/scripts/task_python_adreno.sh index 809df1ed2d..6d9c370aec 100755 --- a/tests/scripts/task_python_adreno.sh +++ b/tests/scripts/task_python_adreno.sh @@ -38,15 +38,20 @@ sleep 5 # Wait for tracker to bind export ANDROID_SERIAL=$1 -adb shell "mkdir -p /data/local/tmp/tvm_ci" -adb push build-adreno-target/tvm_rpc /data/local/tmp/tvm_ci/tvm_rpc_ci -adb push build-adreno-target/libtvm_runtime.so /data/local/tmp/tvm_ci +TARGET_FOLDER=/data/local/tmp/tvm_ci-${USER} +adb shell "mkdir -p ${TARGET_FOLDER}" +adb push build-adreno-target/tvm_rpc ${TARGET_FOLDER}/tvm_rpc-${USER} +adb push build-adreno-target/libtvm_runtime.so ${TARGET_FOLDER} +CPP_LIB=`find ${ANDROID_NDK_HOME} -name libc++_shared.so | grep aarch64` +if [ -f ${CPP_LIB} ] ; then +adb push ${CPP_LIB} ${TARGET_FOLDER} +fi adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT} adb forward tcp:5000 tcp:5000 adb forward tcp:5001 tcp:5001 adb forward tcp:5002 tcp:5002 -env adb shell "cd /data/local/tmp/tvm_ci; killall -9 tvm_rpc_ci; sleep 2; LD_LIBRARY_PATH=/data/local/tmp/tvm_ci/ ./tvm_rpc_ci server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" & +env adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" & DEVICE_PID=$! sleep 5 # Wait for the device connections trap "{ kill ${TRACKER_PID}; kill ${DEVICE_PID}; }" 0
[tvm] branch main updated: [METAL] Fix int8 vectorized cast (#14962)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 6198c7fd8a [METAL] Fix int8 vectorized cast (#14962) 6198c7fd8a is described below commit 6198c7fd8a75534d98efd0ef800b36fc4e3dc021 Author: Siyuan Feng AuthorDate: Fri May 26 22:22:46 2023 +0800 [METAL] Fix int8 vectorized cast (#14962) The current codegen output `(half4)*(device uint*)A` creates an `int32` number and then casts it to `half4`, which is not the expected behavior. As Metal supports the `uchar4` and `char4` types, we can use them directly to solve that problem. --- src/target/source/codegen_metal.cc | 5 tests/python/unittest/test_target_codegen_metal.py | 30 +- 2 files changed, 24 insertions(+), 11 deletions(-) diff --git a/src/target/source/codegen_metal.cc b/src/target/source/codegen_metal.cc index bd2b930166..b7105e4bcd 100644 --- a/src/target/source/codegen_metal.cc +++ b/src/target/source/codegen_metal.cc @@ -220,11 +220,6 @@ void CodeGenMetal::PrintType(DataType t, std::ostream& os) { // NOLINT(*) if (t.is_uint()) { os << 'u'; } -if (t.bits() == 8 && t.lanes() == 4) { - // directly 4 8 bit int in integer. - os << "int"; - return; -} switch (t.bits()) { case 8: os << "char"; diff --git a/tests/python/unittest/test_target_codegen_metal.py b/tests/python/unittest/test_target_codegen_metal.py index 3b1cdb4422..dcbbba8c9c 100644 --- a/tests/python/unittest/test_target_codegen_metal.py +++ b/tests/python/unittest/test_target_codegen_metal.py @@ -14,12 +14,12 @@ # KIND, either express or implied. See the License for the # specific language governing permissions and limitations # under the License. 
-import tvm -from tvm import te import numpy as np -import tvm.testing +import tvm import tvm.script +import tvm.testing +from tvm import te from tvm.script import tir as T @@ -149,7 +149,25 @@ def test_select_vectorize(): np.testing.assert_allclose(b_nd.numpy(), a, atol=1e-5, rtol=1e-5) +@tvm.testing.requires_gpu +@tvm.testing.requires_metal +def test_vectorized_uint8(): +@T.prim_func +def func(A: T.Buffer((16), "uint8"), B: T.Buffer((16), "float32")): +for i in T.thread_binding(4, thread="threadIdx.x"): +for j in T.vectorized(4): +with T.block("block"): +vi = T.axis.spatial(16, i * 4 + j) +B[vi] = T.Cast("float32", A[vi]) + +dev = tvm.metal() +a = np.arange(16).astype("uint8") +a_nd = tvm.nd.array(a, dev) +b_nd = tvm.nd.empty((16,), "float32", dev) +f = tvm.build(func, target="metal") +f(a_nd, b_nd) +np.testing.assert_allclose(b_nd.numpy(), a.astype("float32"), atol=1e-5, rtol=1e-5) + + if __name__ == "__main__": -test_ramp() -test_metal_inf_nan() -test_metal_erf() +tvm.testing.main()
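The effect of removing the special case from `PrintType` can be modelled with a short Python function. This is a rough sketch, not the real code generator: `metal_int_type` is a hypothetical name, and the 16-bit and 32-bit base names as well as the lane-count suffix are assumptions; only the `'u'` prefix, the `char` case, and the removed `int` shortcut appear in the hunk above.

```python
def metal_int_type(bits, lanes, unsigned, legacy=False):
    """Model how an integer type name is printed (sketch).

    legacy=True re-enables the removed shortcut that packed four 8-bit
    lanes into a single 32-bit integer type.
    """
    name = "u" if unsigned else ""
    if legacy and bits == 8 and lanes == 4:
        return name + "int"
    # Assumed mapping for widths not shown in the diff.
    name += {8: "char", 16: "short", 32: "int"}[bits]
    if lanes > 1:
        name += str(lanes)  # assumed vector suffix, e.g. "uchar4"
    return name


before = metal_int_type(8, 4, unsigned=True, legacy=True)
after = metal_int_type(8, 4, unsigned=True)
```

Under the old rule a `uint8x4` load is typed as one `uint`, so the following vector cast sees a single 32-bit number; with the rule removed it is typed as `uchar4`, and the cast applies per lane.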
[tvm] branch main updated (1c39613811 -> bcf7abba29)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 1c39613811 [LLVM] Expose Host CPU Feature Detection (#14946) add bcf7abba29 Fix pytorch axis (#14930) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/pytorch.py | 3 ++- tests/python/frontend/pytorch/test_forward.py | 6 ++ 2 files changed, 8 insertions(+), 1 deletion(-)
[tvm] branch main updated: [Target] Add target to all TVM callbacks (#14939)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new e11913be06 [Target] Add target to all TVM callbacks (#14939) e11913be06 is described below commit e11913be06b30d806e6c09a6610b4e8fbaaa30a0 Author: Junru Shao AuthorDate: Wed May 24 23:10:41 2023 -0700 [Target] Add target to all TVM callbacks (#14939) * [Target] Add target to all TVM callbacks This PR adds an extra parameter `target` to all `tvm_callback_*` so that the callback can decide its own behavior by querying which target to compile against. * fix lint * fix lint --- apps/ios_rpc/tests/ios_rpc_mobilenet.py| 27 +++--- apps/ios_rpc/tests/ios_rpc_test.py | 13 ++- apps/topi_recipe/broadcast/test_broadcast_map.py | 12 -- apps/topi_recipe/conv/depthwise_conv2d_test.py | 13 +-- apps/topi_recipe/conv/test_conv2d_hwcn_map.py | 9 apps/topi_recipe/reduce/test_reduce_map.py | 10 apps/topi_recipe/rnn/lstm.py | 9 apps/topi_recipe/rnn/matexp.py | 13 ++- jvm/core/src/test/scripts/test_add_gpu.py | 4 ++-- python/tvm/contrib/nvcc.py | 6 ++--- python/tvm/contrib/sdaccel.py | 10 src/target/opt/build_cuda_on.cc| 4 ++-- src/target/source/codegen_aocl.cc | 2 +- src/target/source/codegen_metal.cc | 2 +- src/target/source/codegen_opencl.cc| 2 +- src/target/source/codegen_vhls.cc | 5 ++-- src/target/spirv/spirv_utils.cc| 2 +- tests/python/integration/test_ewise.py | 5 ++-- tests/python/integration/test_ewise_fpga.py| 3 +-- .../test_tir_transform_inject_ptx_async_copy.py| 2 +- 20 files changed, 76 insertions(+), 77 deletions(-) diff --git a/apps/ios_rpc/tests/ios_rpc_mobilenet.py b/apps/ios_rpc/tests/ios_rpc_mobilenet.py index b90b459280..0c958a07d8 100644 --- a/apps/ios_rpc/tests/ios_rpc_mobilenet.py +++ b/apps/ios_rpc/tests/ios_rpc_mobilenet.py @@ -15,24 +15,24 @@ # specific language governing permissions and limitations # under the License. 
-import tvm -from tvm import rpc, relay -from tvm.contrib.download import download_testdata -from tvm.relay.expr_functor import ExprMutator -from tvm.relay import transform -from tvm.relay.op.annotation import compiler_begin, compiler_end -from tvm.relay.quantize.quantize import prerequisite_optimize -from tvm.contrib import utils, xcode, graph_executor, coreml_runtime -from tvm.contrib.target import coreml as _coreml - +import argparse import os import re import sys + +import coremltools import numpy as np +import tvm from mxnet import gluon from PIL import Image -import coremltools -import argparse +from tvm import relay, rpc +from tvm.contrib import coreml_runtime, graph_executor, utils, xcode +from tvm.contrib.download import download_testdata +from tvm.contrib.target import coreml as _coreml +from tvm.relay import transform +from tvm.relay.expr_functor import ExprMutator +from tvm.relay.op.annotation import compiler_begin, compiler_end +from tvm.relay.quantize.quantize import prerequisite_optimize # Change target configuration, this is setting for iphone6s # arch = "x86_64" @@ -43,9 +43,10 @@ target_host = "llvm -mtriple=%s-apple-darwin" % arch MODES = {"proxy": rpc.connect, "tracker": rpc.connect_tracker, "standalone": rpc.connect} + # override metal compiler to compile to iphone @tvm.register_func("tvm_callback_metal_compile") -def compile_metal(src): +def compile_metal(src, target): return xcode.compile_metal(src, sdk=sdk) diff --git a/apps/ios_rpc/tests/ios_rpc_test.py b/apps/ios_rpc/tests/ios_rpc_test.py index 94340dcd4e..78f1d3a6c5 100644 --- a/apps/ios_rpc/tests/ios_rpc_test.py +++ b/apps/ios_rpc/tests/ios_rpc_test.py @@ -20,15 +20,15 @@ To use it, start a rpc proxy with "python -m tvm.exec.rpc_proxy". And configure the proxy host field as commented. 
""" -import tvm -from tvm import te +import argparse import os import re import sys -from tvm import rpc -from tvm.contrib import utils, xcode + import numpy as np -import argparse +import tvm +from tvm import rpc, te +from tvm.contrib import utils, xcode # Change target configuration, this is setting for iphone6s arch = "arm64" @@ -37,9 +37,10 @@ target = "llvm -mtriple=%s-apple-darwin" % arch MODES = {"proxy": rpc.connect, "tracker": rpc.connect_tracker, "standalone": rpc.connect} + # override metal compiler to compile to iphone @tvm.register_func("tvm_cal
[tvm] branch main updated: [Relay] add a dimension check to reject invalid input (#14925)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new d776bccc1e [Relay] add a dimension check to reject invalid input (#14925)
d776bccc1e is described below

commit d776bccc1edb90b06d06b4ea4a94157115b1d972
Author: Qingchao Shen
AuthorDate: Wed May 24 13:28:17 2023 +0800

    [Relay] add a dimension check to reject invalid input (#14925)

    * add a dimension check
    * Update test_forward.py
    * Update pytorch.py
    * Update pytorch.py
---
 python/tvm/relay/frontend/pytorch.py          | 6 ++++++
 tests/python/frontend/pytorch/test_forward.py | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/python/tvm/relay/frontend/pytorch.py b/python/tvm/relay/frontend/pytorch.py
index 08bf5d517c..f3bcda7c23 100644
--- a/python/tvm/relay/frontend/pytorch.py
+++ b/python/tvm/relay/frontend/pytorch.py
@@ -4642,6 +4642,12 @@ def _get_relay_input_vars(graph, input_infos, prelude, is_module=True, default_d
             ):
                 msg = "Shapes of input list and information in the graph do not match"
                 raise RuntimeError(msg)
+            if len(ishape) > 1 and any(dim <= 0 for dim in ishape[1:]):
+                msg = (
+                    "Expected input's non-batch dimensions to have positive length, "
+                    f"but input has a shape of {pt_type.sizes()}"
+                )
+                raise RuntimeError(msg)
             pt_dtype = pt_type.scalarType()
             if not pt_dtype and itype:
                 pt_dtype = itype
diff --git a/tests/python/frontend/pytorch/test_forward.py b/tests/python/frontend/pytorch/test_forward.py
index b602c14df3..de1de6421b 100644
--- a/tests/python/frontend/pytorch/test_forward.py
+++ b/tests/python/frontend/pytorch/test_forward.py
@@ -789,6 +789,11 @@ def test_forward_celu():
     input_data = torch.tensor([-1.0, 2.0], dtype=torch.float32)
     verify_model(torch.nn.CELU().eval(), input_data=input_data)
 
+    input_shape = [2, 0, 1]
+    input_data = torch.rand(input_shape).float()
+    with pytest.raises(RuntimeError):
+        verify_model(torch.nn.CELU().eval(), input_data=input_data)
+
 
 @tvm.testing.uses_gpu
 def test_forward_gelu():
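For readers without a TVM checkout, the condition added to `_get_relay_input_vars` can be sketched as a standalone function. This is only an illustration: `check_input_shape` is a hypothetical name and not part of the TVM API; the condition and the message text are taken from the diff above.

```python
def check_input_shape(ishape):
    """Reject shapes whose non-batch dimensions are not strictly positive.

    Hypothetical standalone helper; only the condition mirrors the commit.
    The first (batch) dimension is deliberately not checked.
    """
    if len(ishape) > 1 and any(dim <= 0 for dim in ishape[1:]):
        raise RuntimeError(
            "Expected input's non-batch dimensions to have positive length, "
            f"but input has a shape of {list(ishape)}"
        )
    return tuple(ishape)


if __name__ == "__main__":
    print(check_input_shape([2, 3, 1]))  # accepted
    try:
        check_input_shape([2, 0, 1])  # the shape used in the new test case
    except RuntimeError as err:
        print("rejected:", err)
```

Note that a zero-length batch dimension such as `[0, 3]` still passes, because only `ishape[1:]` is inspected.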
[tvm] branch main updated (dddc339d2f -> 4f99750611)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from dddc339d2f [Test Cases][BugFix] Add some version check to make test cases run in all PyTorch versions (#14903)
     add 4f99750611 [TypoFix] fix some typo problem in keras frontend (#14916)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/keras.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
[tvm] branch main updated: [RUNTIME] Make systemlib unique per prefix (#14887)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 006b11df04 [RUNTIME] Make systemlib unique per prefix (#14887)
006b11df04 is described below

commit 006b11df046f3aecc076384fb96cb6733287b25f
Author: Tianqi Chen
AuthorDate: Fri May 19 08:04:44 2023 -0400

    [RUNTIME] Make systemlib unique per prefix (#14887)

    This PR enhances systemlib to make it unique per prefix.
    Can help reduce flaky problems in multiple loading of same lib.
---
 src/runtime/system_library.cc                     | 53 +--
 tests/python/unittest/test_target_codegen_blob.py |  2 +
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/runtime/system_library.cc b/src/runtime/system_library.cc
index be9257e53f..55335649a7 100644
--- a/src/runtime/system_library.cc
+++ b/src/runtime/system_library.cc
@@ -32,30 +32,30 @@
 namespace tvm {
 namespace runtime {
 
-class SystemLibraryRegistry {
+class SystemLibSymbolRegistry {
  public:
   void RegisterSymbol(const std::string& name, void* ptr) {
     std::lock_guard<std::mutex> lock(mutex_);
-    auto it = tbl_.find(name);
-    if (it != tbl_.end() && ptr != it->second) {
+    auto it = symbol_table_.find(name);
+    if (it != symbol_table_.end() && ptr != it->second) {
       LOG(WARNING) << "SystemLib symbol " << name << " get overriden to a different address "
                    << ptr << "->" << it->second;
     }
-    tbl_[name] = ptr;
+    symbol_table_[name] = ptr;
   }
 
   void* GetSymbol(const char* name) {
     std::lock_guard<std::mutex> lock(mutex_);
-    auto it = tbl_.find(name);
-    if (it != tbl_.end()) {
+    auto it = symbol_table_.find(name);
+    if (it != symbol_table_.end()) {
       return it->second;
     } else {
       return nullptr;
     }
   }
 
-  static SystemLibraryRegistry* Global() {
-    static SystemLibraryRegistry* inst = new SystemLibraryRegistry();
+  static SystemLibSymbolRegistry* Global() {
+    static SystemLibSymbolRegistry* inst = new SystemLibSymbolRegistry();
     return inst;
   }
 
@@ -63,7 +63,7 @@ class SystemLibraryRegistry {
   // Internal mutex
   std::mutex mutex_;
   // Internal symbol table
-  std::unordered_map<std::string, void*> tbl_;
+  std::unordered_map<std::string, void*> symbol_table_;
 };
 
 class SystemLibrary : public Library {
@@ -80,22 +80,49 @@ class SystemLibrary : public Library {
   }
 
  private:
-  SystemLibraryRegistry* reg_ = SystemLibraryRegistry::Global();
+  SystemLibSymbolRegistry* reg_ = SystemLibSymbolRegistry::Global();
   std::string symbol_prefix_;
 };
 
+class SystemLibModuleRegistry {
+ public:
+  runtime::Module GetOrCreateModule(std::string symbol_prefix) {
+    std::lock_guard<std::mutex> lock(mutex_);
+    auto it = lib_map_.find(symbol_prefix);
+    if (it != lib_map_.end()) {
+      return it->second;
+    } else {
+      auto mod = CreateModuleFromLibrary(make_object<SystemLibrary>(symbol_prefix));
+      lib_map_[symbol_prefix] = mod;
+      return mod;
+    }
+  }
+
+  static SystemLibModuleRegistry* Global() {
+    static SystemLibModuleRegistry* inst = new SystemLibModuleRegistry();
+    return inst;
+  }
+
+ private:
+  // Internal mutex
+  std::mutex mutex_;
+  // we need to make sure each lib map have an unique
+  // copy through out the entire lifetime of the process
+  // so the cached PackedFunc in the system do not get out dated.
+  std::unordered_map<std::string, runtime::Module> lib_map_;
+};
+
 TVM_REGISTER_GLOBAL("runtime.SystemLib").set_body([](TVMArgs args, TVMRetValue* rv) {
   std::string symbol_prefix = "";
   if (args.size() != 0) {
     symbol_prefix = args[0].operator std::string();
   }
-  auto mod = CreateModuleFromLibrary(make_object<SystemLibrary>(symbol_prefix));
-  *rv = mod;
+  *rv = SystemLibModuleRegistry::Global()->GetOrCreateModule(symbol_prefix);
 });
 }  // namespace runtime
 }  // namespace tvm
 
 int TVMBackendRegisterSystemLibSymbol(const char* name, void* ptr) {
-  tvm::runtime::SystemLibraryRegistry::Global()->RegisterSymbol(name, ptr);
+  tvm::runtime::SystemLibSymbolRegistry::Global()->RegisterSymbol(name, ptr);
   return 0;
 }
diff --git a/tests/python/unittest/test_target_codegen_blob.py b/tests/python/unittest/test_target_codegen_blob.py
index d7683fd68c..2848c875a4 100644
--- a/tests/python/unittest/test_target_codegen_blob.py
+++ b/tests/python/unittest/test_target_codegen_blob.py
@@ -122,6 +122,8 @@ def test_cuda_multi_lib():
     b_nd = tvm.nd.array(a_np, dev)
     syslibA = tvm.runtime.system_lib("modA_")
     syslibB = tvm.runtime.system_lib("modB_")
+    # reload same lib twice
+    syslibA = tvm.runtime.system_lib("modA_")
     syslibA["my_inplace_update"](a_nd)
     syslibB["my_inplace_update"](b_nd)
     np.testing.assert_equal(a_nd.numpy(), a_np + 1)
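The get-or-create behaviour of `SystemLibModuleRegistry` can be sketched in Python as below. This is a rough analogue under stated assumptions, not the TVM runtime: `ModuleRegistry` and the `factory` argument are hypothetical names, and a plain dict stands in for the C++ `lib_map_`.

```python
import threading


class ModuleRegistry:
    """Get-or-create cache keyed by symbol prefix (hypothetical sketch)."""

    def __init__(self, factory):
        self._factory = factory        # builds a module for a given prefix
        self._lock = threading.Lock()  # plays the role of mutex_
        self._modules = {}             # plays the role of lib_map_

    def get_or_create(self, symbol_prefix=""):
        # Hold the lock for the whole lookup so two threads asking for the
        # same prefix cannot both create a module.
        with self._lock:
            if symbol_prefix not in self._modules:
                self._modules[symbol_prefix] = self._factory(symbol_prefix)
            return self._modules[symbol_prefix]


if __name__ == "__main__":
    registry = ModuleRegistry(lambda prefix: {"prefix": prefix})
    first = registry.get_or_create("modA_")
    again = registry.get_or_create("modA_")  # same object as `first`
    other = registry.get_or_create("modB_")  # a different object
    print(first is again, first is other)
```

The point of the design is that a second request for the same prefix returns the cached module rather than a fresh copy, which is what the added "reload same lib twice" test exercises.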
[tvm] branch main updated (613ad5cc51 -> af06b30a98)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 613ad5cc51 [bugfix] fix the undeclared identifier 'f' (#14879)
     add af06b30a98 [CONTRIB] Enable create_staticlib to take in tar files (#14876)

No new revisions were added by this update.

Summary of changes:
 python/tvm/contrib/cc.py                          | 50 +++
 python/tvm/contrib/tar.py                         | 41 +++
 tests/python/unittest/test_target_codegen_blob.py | 15 ---
 3 files changed, 100 insertions(+), 6 deletions(-)
[tvm] branch main updated: [bugfix][Relay] Fix softplus in paddlepaddle frontend (#14845)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 602133e6b9 [bugfix][Relay] Fix softplus in paddlepaddle frontend (#14845)
602133e6b9 is described below

commit 602133e6b9a3c7925abd5bbdb315f92fa7170c93
Author: Qingchao Shen
AuthorDate: Mon May 15 12:34:16 2023 +0800

    [bugfix][Relay] Fix softplus in paddlepaddle frontend (#14845)

    * fix softplus in paddlepaddle.py
    * add test case
    * Update test_forward.py
---
 python/tvm/relay/frontend/paddlepaddle.py          | 5 ++++-
 tests/python/frontend/paddlepaddle/test_forward.py | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/python/tvm/relay/frontend/paddlepaddle.py b/python/tvm/relay/frontend/paddlepaddle.py
index fdbc96676f..4268a4876a 100755
--- a/python/tvm/relay/frontend/paddlepaddle.py
+++ b/python/tvm/relay/frontend/paddlepaddle.py
@@ -2163,7 +2163,10 @@ def convert_softplus(g, op, block):
     dtype = infer_type(x).checked_type.dtype
     beta = op.attr("beta")
     beta = _expr.const(beta, dtype=dtype)
-    out = _op.log(_op.exp(x * beta) + _expr.const(1.0, dtype=dtype)) / beta
+    threshold = op.attr("threshold")
+    threshold = _expr.const(threshold, dtype=dtype)
+    out_softplus = _op.log(_op.exp(x * beta) + _expr.const(1.0, dtype=dtype)) / beta
+    out = _op.where(_op.greater(x * beta, threshold), x, out_softplus)
     g.add_node(op.output("Out")[0], out)
 
 
diff --git a/tests/python/frontend/paddlepaddle/test_forward.py b/tests/python/frontend/paddlepaddle/test_forward.py
index 289fc0faa3..1555ba1aaa 100755
--- a/tests/python/frontend/paddlepaddle/test_forward.py
+++ b/tests/python/frontend/paddlepaddle/test_forward.py
@@ -1722,7 +1722,9 @@ def test_forward_sin():
 
 @run_math_api
 def test_forward_softplus():
-    pass
+    x = paddle.to_tensor([-0.4, 1], dtype="float32")
+    m = paddle.nn.Softplus(5, 1)
+    verify_model(m, [x])
 
 
 @run_math_api
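As a scalar illustration of the fixed conversion, the thresholded softplus can be written in plain Python. This is a sketch rather than the frontend code: the function name and the default values `beta=1.0` and `threshold=20.0` are assumptions made here for the example; the formula and the `x * beta > threshold` switch follow the diff above.

```python
import math


def softplus(x, beta=1.0, threshold=20.0):
    """Softplus that falls back to the identity once x * beta > threshold."""
    if x * beta > threshold:
        # Linear region: matches the _op.where(...) branch that selects x.
        return x
    return math.log(math.exp(x * beta) + 1.0) / beta


if __name__ == "__main__":
    # Same parameters as the new test case: Softplus(5, 1) on [-0.4, 1].
    for value in (-0.4, 1.0):
        print(value, softplus(value, beta=5.0, threshold=1.0))
```

With `beta=5` and `threshold=1`, the input `1.0` takes the linear branch (5 > 1) while `-0.4` goes through the logarithm, so the test covers both sides of the `where`.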
[tvm] branch unity updated: [Unity][Training] Categorical cross entropy loss (#14757)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch unity in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/unity by this push: new 25a491d52b [Unity][Training] Categorical cross entropy loss (#14757) 25a491d52b is described below commit 25a491d52bac4a043dd75f08fa24b31a2c7c1928 Author: Nikita Demashov AuthorDate: Fri May 12 10:09:53 2023 +0300 [Unity][Training] Categorical cross entropy loss (#14757) Add categorical cross entropy loss fix lint Add targets dtype checker and description of the reason for two implementations Lift the common part fix dtype checker fix lint --- python/tvm/relax/training/loss.py| 94 +++- tests/python/relax/test_training_loss.py | 76 ++ 2 files changed, 169 insertions(+), 1 deletion(-) diff --git a/python/tvm/relax/training/loss.py b/python/tvm/relax/training/loss.py index 466c2996e7..d98d2d727c 100644 --- a/python/tvm/relax/training/loss.py +++ b/python/tvm/relax/training/loss.py @@ -27,7 +27,7 @@ from typing_extensions import Literal from ..block_builder import BlockBuilder from ..expr import Expr, Var, Function, StructInfo -from ..op import abs, sum, mean, subtract, multiply +from ..op import abs, sum, mean, subtract, multiply, reshape, argmax from ..op.nn import log_softmax, nll_loss @@ -290,3 +290,95 @@ class CrossEntropyLoss(Loss): bb.emit_func_output(loss) return bb.get()[self._loss_name] + + +class CategoricalCrossEntropyLoss(Loss): +r"""CategoricalCrossEntropyLoss. +It is a combination of a converting one-hot target vector to a label, +a log_softmax computation and a nll_loss. + +Parameters +-- +reduction : Literal["mean", "sum", "none"] +The reduction method to apply to output. Can be "mean", "sum" or "none". + +none : no reduction will be applied, +mean : the sum of the output will be divided by the batch_size, +sum : the output will be summed. 
+ +ignore_index : int +Specifies a target value that is ignored and does not contribute to the input gradient. +""" + +ignore_index: int + +def __init__( +self, +reduction: Literal["mean", "sum", "none"] = "mean", +ignore_index: int = -100, +) -> None: +super().__init__("categorical_cross_entropy_loss", 1, reduction) +self.ignore_index = ignore_index + +def __call__( +self, +predictions: Union[Var, StructInfo], +targets: Union[Var, StructInfo], +weights: Optional[Union[Var, StructInfo]] = None, +) -> Function: +"""Get the relax function of CategoricalCrossEntropyLoss. If the parameters are +struct info, it will create corresponding variables. + +Parameters +-- +predictions : Union[Var, StructInfo] +The predictions of the model in the calculation of loss. + +targets : Union[Var, StructInfo] +The ground truth in the calculation of loss. + +weights : Optional[Union[Var, StructInfo]] +a manual rescaling weight given to each class. It has to be a Tensor of size C. + +Returns +--- +The relax function of CategoricalCrossEntropyLoss with the loss name as its global symbol. +""" + +if not "int" in targets.dtype: +raise TypeError( +f"Dtype of targets expected to be int/uint. \ + However, the dtype of targets is {targets.dtype}" +) + +bb = BlockBuilder() + +predictions = _create_param_var(predictions, "predictions") +targets = _create_param_var(targets, "targets") + +arg_list = [predictions, targets] +if weights: +weights = _create_param_var(weights, "weights") +arg_list.append(weights) + +# In the case of ignore_index >= 0, +# the nll_loss function is used to handle the ignore index. +# In other cases where ignore_index is not needed, just use the simpe product. 
+with bb.function(self._loss_name, arg_list): +with bb.dataflow(): +logits = bb.emit(log_softmax(predictions)) +if self.ignore_index >= 0: +targets = bb.emit( +reshape(argmax(targets, axis=1), shape=(targets.struct_info.shape[0],)) +) +loss = bb.emit_output( +nll_loss(logits, targets, weights, self._reduction, self.ignore_index) +) +
[tvm] branch main updated: [Relay][BugFix] fix a bug about ReLu in the threshold attribute which causes a different results with keras (#14824)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new fd2a510ccf [Relay][BugFix] fix a bug about ReLu in the threshold attribute which causes a different results with keras (#14824)
fd2a510ccf is described below

commit fd2a510ccf76efd7b678c4746c1fa72ce496b58d
Author: Qingchao Shen
AuthorDate: Thu May 11 16:48:41 2023 +0800

    [Relay][BugFix] fix a bug about ReLu in the threshold attribute which causes a different results with keras (#14824)

    * fix relu threshold attribute
    * Update test_forward.py
---
 python/tvm/relay/frontend/keras.py          | 2 ++
 tests/python/frontend/keras/test_forward.py | 1 +
 2 files changed, 3 insertions(+)

diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py
index ef94c74e03..b820ad586d 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -145,6 +145,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, data_layout, input_s
                 axis = axis + 1 if axis < dims - 1 else 1
         return _op.nn.softmax(inexpr, axis=axis)
     if act_type == "ReLU":
+        if np.isnan(keras_layer.threshold).any():
+            raise tvm.error.OpAttributeInvalid("The threshold value of a ReLU cannot be None.")
         threshold = _expr.const(keras_layer.threshold, dtype="float32")
         if keras_layer.max_value and float(keras_layer.threshold) == 0:
             # f(x) = max_value, for x >= max_value
diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py
index 92b8ce0a64..a5305430c2 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -227,6 +227,7 @@ class TestKeras:
             act_funcs = [
                 keras_mod.layers.LeakyReLU(alpha=None),
                 keras_mod.layers.LEU(2, 3, 4),
+                keras_mod.layers.ReLU(threshold=None),
             ]
             data = keras_mod.layers.Input(shape=(2, 3, 4))
             for act_func in act_funcs:
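The guard added here follows a general pattern: reject a missing numeric attribute before it is turned into a constant. A minimal sketch of that pattern without TVM or Keras is shown below. `AttributeInvalid` is a hypothetical stand-in for `tvm.error.OpAttributeInvalid`, `require_number` is a made-up helper, and treating a plain Python `None` the same as NaN is an assumption of this sketch (the frontend itself only calls `np.isnan(...).any()`).

```python
import math


class AttributeInvalid(ValueError):
    """Hypothetical stand-in for tvm.error.OpAttributeInvalid."""


def require_number(layer_name, attr_name, value):
    """Return value as a float, or raise if it is None or NaN."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        raise AttributeInvalid(
            f"The {attr_name} value of a {layer_name} cannot be None."
        )
    return float(value)


if __name__ == "__main__":
    print(require_number("ReLU", "threshold", 0))
    try:
        require_number("ReLU", "threshold", float("nan"))
    except AttributeInvalid as err:
        print("rejected:", err)
```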
[tvm] branch main updated (3829ebb68c -> cca7d78334)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 3829ebb68c [TVMScript] Handle AllocatedPoolInfo, ConstantPoolInfo, ConstantInfo (#14812)
     add cca7d78334 [OpenCLML] Transposed convolution support and other fixes (#14767)

No new revisions were added by this update.

Summary of changes:
 apps/cpp_rtvm/tvm_runner.h                       |   3 +-
 cmake/modules/LibInfo.cmake                      |   1 +
 python/tvm/relay/op/contrib/clml.py              |  45 +-
 src/relay/backend/contrib/clml/codegen.cc        |  40 +++---
 src/runtime/contrib/clml/clml_runtime.cc         | 176 ---
 src/support/libinfo.cc                           |   1 +
 tests/python/contrib/test_clml/infrastructure.py |   2 +
 tests/python/contrib/test_clml/test_ops.py       | 107 +-
 8 files changed, 271 insertions(+), 104 deletions(-)
[tvm] branch main updated: [OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 182de86b2c [OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792) 182de86b2c is described below commit 182de86b2cddabbd3903efe88fbff52d654117d7 Author: masahi AuthorDate: Tue May 9 15:08:04 2023 +0900 [OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792) * [OpenCL] Support SPIRV module ingestion * compile fixed * clean up * fix build when vulkan is not enabled * Introduce spirv_utils.cc * add dummy impl for LowerToSPIRV in case vulkan is not enabled * more fix * TVM_USE_VULKAN -> TVM_ENABLE_SPIRV * build fix when opencl is not enabled * mv spirv_shader.h under new spirv folder * mv spirv module to its own file * fix after reorg * build fix --- cmake/modules/OpenCL.cmake | 1 + cmake/modules/Vulkan.cmake | 1 + src/runtime/opencl/opencl_common.h | 60 + src/runtime/opencl/opencl_module.cc| 36 +++--- src/runtime/opencl/opencl_module.h | 14 ++- src/runtime/opencl/opencl_module_spirv.cc | 137 + .../vulkan_shader.h => spirv/spirv_shader.h} | 16 +-- src/runtime/vulkan/vulkan_module.cc| 6 +- src/runtime/vulkan/vulkan_module.h | 4 +- src/runtime/vulkan/vulkan_wrapped_func.h | 6 +- src/target/opt/build_opencl_off.cc | 7 ++ src/target/source/codegen_opencl.cc| 9 ++ src/target/spirv/build_vulkan.cc | 134 +--- src/target/spirv/codegen_spirv.cc | 5 +- src/target/spirv/codegen_spirv.h | 6 +- src/target/spirv/spirv_support.cc | 5 +- .../spirv/{build_vulkan.cc => spirv_utils.cc} | 64 ++ .../vulkan_module.h => target/spirv/spirv_utils.h} | 32 +++-- 18 files changed, 316 insertions(+), 227 deletions(-) diff --git a/cmake/modules/OpenCL.cmake b/cmake/modules/OpenCL.cmake index 53199f19cb..f380ad75d1 100644 --- a/cmake/modules/OpenCL.cmake +++ b/cmake/modules/OpenCL.cmake @@ -41,6 +41,7 @@ endif(USE_AOCL) 
if(USE_OPENCL) tvm_file_glob(GLOB RUNTIME_OPENCL_SRCS src/runtime/opencl/*.cc) + list(APPEND COMPILER_SRCS src/target/spirv/spirv_utils.cc) if(${USE_OPENCL} MATCHES ${IS_TRUE_PATTERN}) message(STATUS "Enabled runtime search for OpenCL library location") diff --git a/cmake/modules/Vulkan.cmake b/cmake/modules/Vulkan.cmake index 7470fb6125..1f303f3a03 100644 --- a/cmake/modules/Vulkan.cmake +++ b/cmake/modules/Vulkan.cmake @@ -34,4 +34,5 @@ if(USE_VULKAN) list(APPEND COMPILER_SRCS ${COMPILER_VULKAN_SRCS}) list(APPEND TVM_LINKER_LIBS ${Vulkan_SPIRV_TOOLS_LIBRARY}) list(APPEND TVM_RUNTIME_LINKER_LIBS ${Vulkan_LIBRARY}) + add_definitions(-DTVM_ENABLE_SPIRV=1) endif(USE_VULKAN) diff --git a/src/runtime/opencl/opencl_common.h b/src/runtime/opencl/opencl_common.h index a8a4cf3dc6..d25d2db0eb 100644 --- a/src/runtime/opencl/opencl_common.h +++ b/src/runtime/opencl/opencl_common.h @@ -411,18 +411,16 @@ struct BufferDescriptor { // To make the call thread-safe, we create a thread-local kernel table // and lazily install new kernels into the kernel table when the kernel is called. // The kernels are recycled when the module get destructed. -class OpenCLModuleNode : public ModuleNode { +class OpenCLModuleNodeBase : public ModuleNode { public: // Kernel table reference entry. struct KTRefEntry { size_t kernel_id; size_t version; }; - explicit OpenCLModuleNode(std::string data, std::string fmt, -std::unordered_map fmap, std::string source) - : data_(data), fmt_(fmt), fmap_(fmap), source_(source) {} + explicit OpenCLModuleNodeBase(std::unordered_map fmap) : fmap_(fmap) {} // destructor - ~OpenCLModuleNode(); + ~OpenCLModuleNodeBase(); /*! 
* \brief Get the global workspace @@ -436,38 +434,56 @@ class OpenCLModuleNode : public ModuleNode { return ModulePropertyMask::kBinarySerializable | ModulePropertyMask::kRunnable; } - PackedFunc GetFunction(const std::string& name, const ObjectPtr& sptr_to_self) final; - void SaveToFile(const std::string& file_name, const std::string& format) final; - void SaveToBinary(dmlc::Stream* stream) final; - std::string GetSource(const std::string& format) final; + PackedFunc GetFunction(const std::string& name, const ObjectPtr& sptr_to_self) override; + // Initialize the programs - void Init(); + virtual void Init() = 0; // install a new kernel to thread local entry -
[tvm] branch main updated (d6e0f1d001 -> e01cb479b8)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from d6e0f1d001 [NDArray] Allow arbitrary stride when the corresponding shape is 1 (#14797)
     add e01cb479b8 [bugfix][relay] fix wrong calculate logic about celu (#14796)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relay/frontend/pytorch.py          | 5 +++--
 tests/python/frontend/pytorch/test_forward.py | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)
[tvm] branch main updated (706a5b2dd5 -> eca6edf7fa)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 706a5b2dd5 [Target][Minor] Add A6000 Target Tag (#14762)
     add eca6edf7fa [CI] sccache support while using ci.py under multi user environments (#14666)

No new revisions were added by this update.

Summary of changes:
 tests/scripts/ci.py | 1 +
 1 file changed, 1 insertion(+)
[tvm] branch main updated (4d37a0a078 -> f1033adf04)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 4d37a0a078 [CI] Upgrade CI (#14635)
     add f1033adf04 [TIR] Keep block annotations from tensorization (#14741)

No new revisions were added by this update.

Summary of changes:
 src/tir/schedule/primitive/blockize_tensorize.cc   |  9 +
 .../python/unittest/test_tir_schedule_tensorize.py | 40 +-
 2 files changed, 48 insertions(+), 1 deletion(-)
[tvm] branch main updated: [TOPI] Vectorize depthwise conv2d output operator (#14519)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new cc8cce50fa [TOPI] Vectorize depthwise conv2d output operator (#14519)
cc8cce50fa is described below

commit cc8cce50fabb5a92e5830bb81b5fa96fb613a698
Author: Jack Frankland <30410009+franklandj...@users.noreply.github.com>
AuthorDate: Wed May 3 21:18:41 2023 +0100

    [TOPI] Vectorize depthwise conv2d output operator (#14519)

    Depthwise Conv2D operations may consists of a convolution + an output
    operator e.g. Relu. This commit will:
    * Apply vectorization across the inner channel loop when there is an
      output operator.
    * Remove some unused variables in `schedule_depthwise_conv2d_nhwc`.
    * Limit the loop splitting to 8 elements in the inner loop.
---
 python/tvm/topi/arm_cpu/depthwise_conv2d.py | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/python/tvm/topi/arm_cpu/depthwise_conv2d.py b/python/tvm/topi/arm_cpu/depthwise_conv2d.py
index b6c15a30c0..59660e6bb9 100644
--- a/python/tvm/topi/arm_cpu/depthwise_conv2d.py
+++ b/python/tvm/topi/arm_cpu/depthwise_conv2d.py
@@ -292,13 +292,13 @@ def schedule_depthwise_conv2d_nhwc(cfg, outs):
     out = outs[0]
 
     # space definition begin #
-    n, h, w, c = s[out].op.axis
+    _, h, w, c = s[out].op.axis
     # Split the number of input/output channels
-    cfg.define_split("tile_c", c, num_outputs=2)
+    cfg.define_split("tile_c", c, num_outputs=2, filter=lambda entry: entry.size[1] <= 8)
     # Split the height of the convolution
-    _, hi = cfg.define_split("tile_h", h, num_outputs=2)
+    cfg.define_split("tile_h", h, num_outputs=2)
     # Split the width of the convolution
-    _, wi = cfg.define_split("tile_w", w, num_outputs=2)
+    cfg.define_split("tile_w", w, num_outputs=2)
     # Additional out (e.g., requantization, bias addition, etc..)
     # 0: locate the output on the second last axis of the main compuation
     # 1: locate the output closest to the main computation
@@ -394,7 +394,8 @@ def schedule_depthwise_conv2d_nhwc(cfg, outs):
             ci_outer, ci_inner = s[out].split(ci, 4)
             s[out].vectorize(ci_inner)
             s[out].unroll(ci_outer)
-
+        else:
+            s[out].vectorize(ci)
         fused_n_ho = s[out].fuse(n, ho)
         return hi, wi, fused_n_ho
 
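The effect of the new `filter=lambda entry: entry.size[1] <= 8` argument on the `tile_c` knob can be pictured with a small enumeration. This is a simplified sketch that assumes candidates are exact two-way factorizations of the channel extent; `two_way_splits` is a hypothetical helper and does not reproduce AutoTVM's actual candidate generation.

```python
def two_way_splits(extent, max_inner=8):
    """List (outer, inner) factor pairs of extent with inner <= max_inner.

    Hypothetical helper: `inner` corresponds to entry.size[1] in the filter.
    """
    return [
        (extent // inner, inner)
        for inner in range(1, extent + 1)
        if extent % inner == 0 and inner <= max_inner
    ]


if __name__ == "__main__":
    # For 32 channels the inner factors 16 and 32 are filtered out.
    for outer, inner in two_way_splits(32):
        print(outer, inner)
```

Capping the inner factor keeps the vectorized inner channel loop at 8 elements or fewer, which matches the third bullet of the commit message.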
[tvm] branch main updated: [bugfix][relay] Fix alpha attribute with None in ELU (#14742)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new d1e1b4c8b2 [bugfix][relay] Fix alpha attribute with None in ELU (#14742)
d1e1b4c8b2 is described below

commit d1e1b4c8b2ff1ae7cdfbf8823ecffb3c305f078f
Author: Qingchao Shen
AuthorDate: Tue May 2 12:54:04 2023 +0800

    [bugfix][relay] Fix alpha attribute with None in ELU (#14742)

    * fix alpha in elu
    * add test case
---
 python/tvm/relay/frontend/keras.py          |  2 ++
 tests/python/frontend/keras/test_forward.py | 19 +++++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py
index 4539c221c9..ef94c74e03 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -160,6 +160,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, data_layout, input_s
             raise tvm.error.OpAttributeInvalid("The alpha value of a LeakyReLU cannot be None.")
         return _op.nn.leaky_relu(inexpr, alpha=float(keras_layer.alpha))
     if act_type == "ELU":
+        if np.isnan(keras_layer.alpha).any():
+            raise tvm.error.OpAttributeInvalid("The alpha value of a ELU cannot be None.")
         alpha = keras_layer.alpha if hasattr(keras_layer, "alpha") else 1.0
         alpha = _expr.const(alpha, dtype="float32")
         return _get_elu(inexpr, alpha)
diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py
index 1377c180ae..86e88d0764 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -214,19 +214,26 @@ class TestKeras:
 
     def test_forward_activations_except(self, keras_mod):
         """
-        test invalid attribute alpha=None for LeakyReLU.
+        test invalid attribute alpha=None for LeakyReLU and ELU.
         after version 2.3.1 in keras, checking was added to reject the invalid api call:
-        LeakyReLU(alpha=None), (issue: https://github.com/tensorflow/tensorflow/pull/47017)
+        LeakyReLU(alpha=None) and ELU(alpha=None),
+        (see issue: https://github.com/tensorflow/tensorflow/pull/47017)
         Thus, it's necessary to check the keras version to avoid crash at LeakyReLU(alpha=None)
+        and ELU(alpha=None)
         """
         if package_version.parse(keras_mod.__version__.split("-tf")[0]) <= package_version.parse(
             "2.3.1"
         ):
+            act_funcs = [
+                keras_mod.layers.LeakyReLU(alpha=None),
+                keras_mod.layers.LEU(2, 3, 4),
+            ]
             data = keras_mod.layers.Input(shape=(2, 3, 4))
-            layer = keras_mod.layers.LeakyReLU(alpha=None)(data)
-            keras_model = keras_mod.models.Model(data, layer)
-            with pytest.raises(tvm.error.OpAttributeInvalid):
-                verify_keras_frontend(keras_model)
+            for act_func in act_funcs:
+                layer = act_func(data)
+                keras_model = keras_mod.models.Model(data, layer)
+                with pytest.raises(tvm.error.OpAttributeInvalid):
+                    verify_keras_frontend(keras_model)
 
     def test_forward_dense(self, keras_mod):
         """test_forward_dense"""
[tvm] branch main updated: [Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to passes conflict (#14707)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new f9ae487aee [Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to passes conflict (#14707)
f9ae487aee is described below

commit f9ae487aeeeccf0997344813d7412f7c9bddde7f
Author: Qingchao Shen
AuthorDate: Fri Apr 28 17:21:52 2023 +0800

    [Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to passes conflict (#14707)

    * fix a bug in LeakyReLU
    * add CI test
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
    * Update test_forward.py
---
 python/tvm/relay/frontend/keras.py          |  2 ++
 tests/python/frontend/keras/test_forward.py | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/python/tvm/relay/frontend/keras.py b/python/tvm/relay/frontend/keras.py
index 532c93c399..4539c221c9 100644
--- a/python/tvm/relay/frontend/keras.py
+++ b/python/tvm/relay/frontend/keras.py
@@ -156,6 +156,8 @@ def _convert_advanced_activation(inexpr, keras_layer, etab, data_layout, input_s
             return _op.multiply(negative_slope, _op.subtract(inexpr, threshold))
         return _op.nn.relu(inexpr)
     if act_type == "LeakyReLU":
+        if np.isnan(keras_layer.alpha).any():
+            raise tvm.error.OpAttributeInvalid("The alpha value of a LeakyReLU cannot be None.")
         return _op.nn.leaky_relu(inexpr, alpha=float(keras_layer.alpha))
     if act_type == "ELU":
         alpha = keras_layer.alpha if hasattr(keras_layer, "alpha") else 1.0
diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py
index 50fa97cbca..1377c180ae 100644
--- a/tests/python/frontend/keras/test_forward.py
+++ b/tests/python/frontend/keras/test_forward.py
@@ -15,11 +15,13 @@
 # specific language governing permissions and limitations
 # under the License.
 """Unit tests for various models and operators"""
+from packaging import version as package_version
 import numpy as np
 import tvm
 from tvm import relay
 from tvm.contrib import graph_executor
 import tvm.testing
+import pytest
 
 try:
     import tensorflow.compat.v1 as tf
@@ -210,6 +212,22 @@ class TestKeras:
             verify_keras_frontend(keras_model)
             verify_keras_frontend(keras_model, need_transpose=False, layout="NHWC")
 
+    def test_forward_activations_except(self, keras_mod):
+        """
+        test invalid attribute alpha=None for LeakyReLU.
+        after version 2.3.1 in keras, checking was added to reject the invalid api call:
+        LeakyReLU(alpha=None), (issue: https://github.com/tensorflow/tensorflow/pull/47017)
+        Thus, it's necessary to check the keras version to avoid crash at LeakyReLU(alpha=None)
+        """
+        if package_version.parse(keras_mod.__version__.split("-tf")[0]) <= package_version.parse(
+            "2.3.1"
+        ):
+            data = keras_mod.layers.Input(shape=(2, 3, 4))
+            layer = keras_mod.layers.LeakyReLU(alpha=None)(data)
+            keras_model = keras_mod.models.Model(data, layer)
+            with pytest.raises(tvm.error.OpAttributeInvalid):
+                verify_keras_frontend(keras_model)
+
     def test_forward_dense(self, keras_mod):
         """test_forward_dense"""
         data = keras_mod.layers.Input(shape=(32, 32, 1))
@@ -749,6 +767,7 @@ if __name__ == "__main__":
         sut.test_forward_merge_dot(keras_mod=k)
         sut.test_forward_merge(keras_mod=k)
         sut.test_forward_activations(keras_mod=k)
+        sut.test_forward_activations_except(keras_mod=k)
         sut.test_forward_dense(keras_mod=k)
         sut.test_forward_permute(keras_mod=k)
         sut.test_forward_sequential(keras_mod=k)
[tvm] branch main updated: [Tensorflow] Fix conv2d_transpose for NHWC layout (#14546)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 62f9b1d29a [Tensorflow] Fix conv2d_transpose for NHWC layout (#14546) 62f9b1d29a is described below commit 62f9b1d29ae25fbdeb425bfc600c5dac7c23f694 Author: Qingchao Shen AuthorDate: Fri Apr 14 15:12:29 2023 +0800 [Tensorflow] Fix conv2d_transpose for NHWC layout (#14546) * [Tensorflow] Fix conv2d_transpose for NHWC layout If "data_format" == "NHWC", the kernel_layout should be "HWOI" rather than "HWIO". * remove deed code * add test cases * Update test_forward.py * Update test_forward.py * Update tensorflow_ops.py * Update tensorflow_ops.py --- python/tvm/relay/frontend/tensorflow_ops.py | 4 ++-- tests/python/frontend/tensorflow/test_forward.py | 21 - 2 files changed, 22 insertions(+), 3 deletions(-) diff --git a/python/tvm/relay/frontend/tensorflow_ops.py b/python/tvm/relay/frontend/tensorflow_ops.py index 6b3f144619..27374fad1a 100644 --- a/python/tvm/relay/frontend/tensorflow_ops.py +++ b/python/tvm/relay/frontend/tensorflow_ops.py @@ -464,8 +464,8 @@ def _conv(opname): if opname == "conv": attr["kernel_layout"] = "HWIO" if attr["data_format"] == "NHWC" else "OIHW" elif opname == "conv_transpose": -# conv_transpose in TVM has weights be IOHW for NCHW -attr["kernel_layout"] = "HWIO" if attr["data_format"] == "NHWC" else "IOHW" +# conv_transpose has weights be IOHW, because the attr["data_format"] always be NCHW +attr["kernel_layout"] = "IOHW" else: attr["kernel_layout"] = "HWOI" if attr["data_format"] == "NHWC" else "OIHW" diff --git a/tests/python/frontend/tensorflow/test_forward.py b/tests/python/frontend/tensorflow/test_forward.py index 703df79942..bd966fa71c 100644 --- a/tests/python/frontend/tensorflow/test_forward.py +++ b/tests/python/frontend/tensorflow/test_forward.py @@ -742,7 +742,16 @@ def test_forward_convolution(): 
"NCHW", [1, 1, 8, 8], ) - +_test_convolution( +"conv_transpose", +[4, 19, 8, 8], +[2, 2, 66, 19], +[1, 1], +[2, 2], +"VALID", +"NCHW", +[4, 66, 16, 16], +) _test_convolution("conv", [4, 8, 8, 176], [1, 1, 176, 32], [1, 1], [1, 1], "SAME", "NHWC") _test_convolution("conv", [4, 17, 17, 19], [3, 3, 19, 19], [1, 1], [2, 2], "VALID", "NHWC") _test_convolution("conv", [4, 17, 17, 124], [1, 1, 124, 19], [1, 1], [1, 1], "SAME", "NHWC") @@ -917,6 +926,16 @@ def test_forward_convolution(): [4, 8, 8, 176], add_shapes_to_graph_def=False, ) +_test_convolution( +"conv_transpose", +[4, 8, 8, 19], +[2, 2, 66, 19], +[1, 1], +[2, 2], +"VALID", +"NHWC", +[4, 16, 16, 66], +) # Explicit padding if package_version.parse(tf.VERSION) >= package_version.parse("2.4.1"): _test_convolution(
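The shapes in the two new conv_transpose test cases can be checked by hand. The sketch below is a hypothetical helper, not part of TVM or TensorFlow. It assumes the fourth and fifth positional arguments of `_test_convolution` are dilations and strides, that the filter is laid out as [H, W, O, I], and that the expected output is the smallest size consistent with VALID padding.

```python
def conv2d_transpose_valid_shape(input_shape, kernel_shape, strides, data_format):
    """Smallest output shape of a 2-D transposed convolution with VALID padding.

    kernel_shape is assumed to be [H, W, O, I]
    (O = output channels, I = input channels).
    """
    k_h, k_w, out_c, in_c = kernel_shape
    if data_format == "NHWC":
        n, in_h, in_w, c = input_shape
    elif data_format == "NCHW":
        n, c, in_h, in_w = input_shape
    else:
        raise ValueError(f"unsupported data_format: {data_format}")
    if c != in_c:
        raise ValueError(f"input has {c} channels, kernel expects {in_c}")

    # Each input position is spread out by the stride, then the kernel
    # extent is added once at the end.
    out_h = (in_h - 1) * strides[0] + k_h
    out_w = (in_w - 1) * strides[1] + k_w
    if data_format == "NHWC":
        return [n, out_h, out_w, out_c]
    return [n, out_c, out_h, out_w]


if __name__ == "__main__":
    print(conv2d_transpose_valid_shape([4, 8, 8, 19], [2, 2, 66, 19], [2, 2], "NHWC"))
    print(conv2d_transpose_valid_shape([4, 19, 8, 8], [2, 2, 66, 19], [2, 2], "NCHW"))
```

With an 8x8 input, a 2x2 kernel and stride 2, each spatial dimension becomes (8 - 1) * 2 + 2 = 16, and the channel count switches from I = 19 to O = 66 in both layouts.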
[tvm] branch main updated (a7a1980480 -> f79e4ebf30)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from a7a1980480 [CI] Update ci_cpu image and build with llvm-15 (#14466)
     add f79e4ebf30 [Bugfix][Topi] Output strides in pack_buffer() utility (#14566)

No new revisions were added by this update.

Summary of changes:
 include/tvm/topi/detail/extern.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[tvm] branch main updated: [OpenCL][Textures] Always use SSA for texture loading (#14397)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 4011280b16 [OpenCL][Textures] Always use SSA for texture loading (#14397) 4011280b16 is described below commit 4011280b16d89678cddd716d214b1840e4de73ec Author: Egor Churaev AuthorDate: Thu Mar 30 08:35:14 2023 +0300 [OpenCL][Textures] Always use SSA for texture loading (#14397) * [OpenCL][Textures] Always use SSA for texture loading In some cases we must use SSA for textures loading but we didn't do that. Example of such cases: 1. Storing texture (NCHW4c) directly (w/o temporary buffer) to the output buffer (NCHW). In this case we have to use SSA because we need to get only one channel from the pixel. In case of storing to the local buffer the SSA was used because the buffer was allocated in kernel and the logic was written that if the buffer was allocated then we should use SSA. But if we store the same texture directly to the output buffer then SSA wasn't used and this OpenCL code wasn't compiled. 2. Casting texture (NCHW4c) to another data type and then storing it to the buffer (NCHW). The SSA for textures was disabled in case of cast operation. As a result it was necessary to take an channel from the pixel but we got the vector data type (e.g. float4) and then we tried to cast it to scalar data type. This code also wasn't compiled. In this PR SSA form was enabled for all cases when `texture2d_load` is used. The relevant tests cases were added. 
* Add regression test on injective * Fix lit * Add skip for FP16 test * Add additional test cases * Fix lint * Apply comment * Fix lint - Co-authored-by: Andrey Malyshev --- src/target/source/codegen_opencl.cc| 48 +-- src/target/source/codegen_opencl.h | 5 - .../relay/opencl_texture/test_injection_texture.py | 85 + .../python/unittest/test_target_codegen_opencl.py | 6 +- .../unittest/test_target_texture_codegen_opencl.py | 375 + 5 files changed, 469 insertions(+), 50 deletions(-) diff --git a/src/target/source/codegen_opencl.cc b/src/target/source/codegen_opencl.cc index 613b1d0847..525ee95f41 100644 --- a/src/target/source/codegen_opencl.cc +++ b/src/target/source/codegen_opencl.cc @@ -382,33 +382,6 @@ std::string CodeGenOpenCL::CastTo(std::string value, DataType target) { return os.str(); } -void CodeGenOpenCL::VisitStmt_(const BufferStoreNode* op) { - if (auto call = op->value.as()) { -if (call->op.same_as(builtin::texture2d_load())) { - need_texture_ssa_ = false; - // If storing a texture load into a buffer, don't use an - // intermediate local unless the buffer allocation is a - // single element selected from the texture read. 
- auto it = allocation_size_.find(op->buffer->data.get()); - if (it != allocation_size_.end() && it->second == 1) { -need_texture_ssa_ = true; - } -} - } - CodeGenC::VisitStmt_(op); - need_texture_ssa_ = true; -} - -void CodeGenOpenCL::VisitExpr_(const CastNode* op, std::ostream& os) { - if (auto call = op->value.as()) { -if (call->op.same_as(builtin::texture2d_load())) { - need_texture_ssa_ = false; -} - } - CodeGenC::VisitExpr_(op, os); - need_texture_ssa_ = true; -} - void CodeGenOpenCL::VisitStmt_(const AllocateNode* op) { allocation_size_.insert({op->buffer_var.get(), op->ConstantAllocationSize() * op->dtype.lanes()}); CodeGenC::VisitStmt_(op); @@ -472,20 +445,15 @@ void CodeGenOpenCL::VisitExpr_(const CallNode* op, std::ostream& os) { this->PrintExpr(op->args[2], ss); ss << ")))"; -// Only use local SSA if texture is not already being stored -if (need_texture_ssa_) { - std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4)); - if (op->args.back().as()) { -os << rhs; - } else { -os << "(("; -this->PrintType(op->dtype.with_lanes(1), os); -os << "*)&" << rhs << ")["; -this->PrintExpr(op->args.back(), os); -os << "]"; - } +std::string rhs = SSAGetID(ss.str(), op->dtype.with_lanes(4)); +if (op->args.back().as()) { + os << rhs; } else { - os << ss.str(); + os << "(("; + this->PrintType(op->dtype.with_lanes(1), os); + os << "*)&" << rhs << ")["; + this->PrintExpr(op->args.back(), os); + os << "]"; }
[tvm] branch main updated (970cd1def8 -> 6eb4b873e1)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 970cd1def8 [TIR][Hexagon] Enhancement of NarrowDataType pass for binary ops (#14298)
     add 6eb4b873e1 [DOCS][ADRENO] Improved Adreno documentation (#13867)

No new revisions were added by this update.

Summary of changes:
 docs/how_to/deploy/adreno.rst | 718 +++--
 .../how_to/deploy_models/deploy_model_on_adreno.py | 309 +
 .../deploy_models/deploy_model_on_adreno_tvmc.py | 198 ++
 tests/python/relay/opencl_texture/test_network.py | 40 +-
 tests/scripts/setup-adreno-env.sh | 113
 tests/scripts/task_build_adreno_bins.sh | 1 -
 tests/scripts/task_config_build_adreno.sh | 1 -
 7 files changed, 1063 insertions(+), 317 deletions(-)
 create mode 100644 gallery/how_to/deploy_models/deploy_model_on_adreno_tvmc.py
 create mode 100755 tests/scripts/setup-adreno-env.sh
[tvm] branch main updated: [COMMUNITY] Cheng Wen -> Reviewer (#14153)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new 428400c6e0 [COMMUNITY] Cheng Wen -> Reviewer (#14153)
428400c6e0 is described below

commit 428400c6e0574f5bf43619ca2684397a2e14ba21
Author: Siyuan Feng
AuthorDate: Wed Mar 1 12:41:12 2023 +0800

    [COMMUNITY] Cheng Wen -> Reviewer (#14153)

    Please join me in welcoming @chengven027-intellif as a new Reviewer in TVM.
    Cheng has contributed to the ONNX/PyTorch frontends and Relay passes,
    enabling TVM to support more input models.

    - [Commits History](https://github.com/apache/tvm/pulls?q=author%3Achengven027-intellif+)
    - [Code Review](https://github.com/apache/tvm/pulls?q=reviewed-by%3Achengven027-intellif+)
---
 CONTRIBUTORS.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md
index a0ff1bc57e..4304d6c647 100644
--- a/CONTRIBUTORS.md
+++ b/CONTRIBUTORS.md
@@ -196,6 +196,7 @@ We do encourage everyone to work anything they are interested in.
 - [Alex Weaver](https://github.com/alex-weaver): @alex-weaver
 - [Logan Weber](https://github.com/weberlo): @weberlo
 - [Matt Welsh](https://github.com/mdw-octoml): @mdw-octoml
+- [Cheng Wen](https://github.com/chengven027-intellif): @chengven027-intellif
 - [Jian Weng](https://github.com/were): @were
 - [wrongtest](https://github.com/wrongtest-intellif): @wrongtest-intellif
 - [Yong Wu](https://github.com/yongwww): @yongwww
[tvm] branch main updated: [DOCKER] Configurable NDK version support (#14000)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d9b0a80e1b [DOCKER] Configurable NDK version support (#14000) d9b0a80e1b is described below commit d9b0a80e1b97fae695d54eec6627c6b7fe840d3b Author: Siva AuthorDate: Mon Feb 27 09:36:27 2023 +0530 [DOCKER] Configurable NDK version support (#14000) Let the Android NDK version configurable as a command line argument --- docker/Dockerfile.ci_adreno | 5 +--- docker/Dockerfile.ci_cpu| 3 -- docker/Dockerfile.ci_hexagon| 2 -- docker/Dockerfile.demo_android | 2 -- docker/install/ubuntu_install_androidsdk.sh | 43 - 5 files changed, 37 insertions(+), 18 deletions(-) diff --git a/docker/Dockerfile.ci_adreno b/docker/Dockerfile.ci_adreno index 8f4ede3a5e..11be0a8baa 100644 --- a/docker/Dockerfile.ci_adreno +++ b/docker/Dockerfile.ci_adreno @@ -22,10 +22,7 @@ COPY utils/apt-install-and-clear.sh /usr/local/bin/apt-install-and-clear # Android SDK COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh -RUN bash /install/ubuntu_install_androidsdk.sh -ENV ANDROID_HOME=/opt/android-sdk-linux -ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147 -ENV ANDROID_NDK_MAJOR=21 +RUN bash /install/ubuntu_install_androidsdk.sh 25.2.9519653 3.22.1 33.0.2 33 ENV PATH /opt/android-sdk-linux/platform-tools:$PATH # Clang tool for CLML source codegen diff --git a/docker/Dockerfile.ci_cpu b/docker/Dockerfile.ci_cpu index 74e90670ab..c75bf31361 100644 --- a/docker/Dockerfile.ci_cpu +++ b/docker/Dockerfile.ci_cpu @@ -128,9 +128,6 @@ RUN bash /install/ubuntu_install_vitis_ai_packages_ci.sh # Android SDK COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh RUN bash /install/ubuntu_install_androidsdk.sh -ENV ANDROID_HOME=/opt/android-sdk-linux/ -ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147/ -ENV 
ANDROID_NDK_MAJOR=21 # PaddlePaddle deps COPY install/ubuntu_install_paddle.sh /install/ubuntu_install_paddle.sh diff --git a/docker/Dockerfile.ci_hexagon b/docker/Dockerfile.ci_hexagon index bf4e33fc7c..3b4c58ef43 100644 --- a/docker/Dockerfile.ci_hexagon +++ b/docker/Dockerfile.ci_hexagon @@ -62,8 +62,6 @@ RUN bash /install/ubuntu_install_java.sh COPY install/ubuntu_install_androidsdk.sh /install/ubuntu_install_androidsdk.sh RUN bash /install/ubuntu_install_androidsdk.sh ENV ANDROID_HOME=/opt/android-sdk-linux -ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147 -ENV ANDROID_NDK_MAJOR=21 ENV PATH /opt/android-sdk-linux/platform-tools:$PATH # Hexagon diff --git a/docker/Dockerfile.demo_android b/docker/Dockerfile.demo_android index ebfacabb4e..42bf705106 100644 --- a/docker/Dockerfile.demo_android +++ b/docker/Dockerfile.demo_android @@ -81,5 +81,3 @@ RUN cd /usr && \ # Environment variables ENV PYTHONPATH=/usr/tvm/python:/usr/tvm/vta/python:${PYTHONPATH} ENV ANDROID_HOME=/opt/android-sdk-linux/ -ENV ANDROID_NDK_HOME=/opt/android-sdk-linux/ndk/21.3.6528147/ -ENV ANDROID_NDK_MAJOR=21 diff --git a/docker/install/ubuntu_install_androidsdk.sh b/docker/install/ubuntu_install_androidsdk.sh index bf2d80cd28..5e7278c5d6 100755 --- a/docker/install/ubuntu_install_androidsdk.sh +++ b/docker/install/ubuntu_install_androidsdk.sh @@ -26,6 +26,32 @@ ASDKTOOLS_HOME=/opt/android-sdk-tools ASDKTOOLS_VERSION=3859397 ASDKTOOLS_SHA256=444e22ce8ca0f67353bda4b85175ed3731cae3ffa695ca18119cbacef1c1bea0 +ANDROID_NDK_VERSION=21.3.6528147 +CMAKE_VERSION=3.6.4111459 +BUILD_TOOLS_VERSION=27.0.3 +ANDROID_PLATFORM=27 +ANDROID_NDK_MAJOR=21 + +if [ $# -gt 0 ] ; then + ANDROID_NDK_VERSION=$1 + ANDROID_NDK_MAJOR=`echo $1 | cut -d "." 
-f1` +fi +if [ $# -gt 1 ] ; then + CMAKE_VERSION=$2 +fi +if [ $# -gt 2 ] ; then + BUILD_TOOLS_VERSION=$3 +fi +if [ $# -gt 3 ] ; then + ANDROID_PLATFORM=$4 +fi + +echo "NDK Version: ${ANDROID_NDK_VERSION}" +echo "NDK Major : ${ANDROID_NDK_MAJOR}" +echo "Cmake Version: ${CMAKE_VERSION}" +echo "Build Tools: ${BUILD_TOOLS_VERSION}" +echo "Android Platform: ${ANDROID_PLATFORM}" + wget -q http://dl.google.com/android/repository/sdk-tools-linux-${ASDKTOOLS_VERSION}.zip -O sdk-tools-linux.zip echo "${ASDKTOOLS_SHA256} *sdk-tools-linux.zip" | sha256sum --check - unzip sdk-tools-linux.zip @@ -52,8 +78,8 @@ done cat >/install/package-list-minimal.txt </dev/null || true @@ -77,9 +103,12 @@ touch /root/.android/repositories.cfg # NOTE: sdkmanager returns exit code 141 (yes || true) | sdkmanager --licenses --sdk_root="$ANDROID_HOME" || true sdkmanager --verbose --package_file=/install/package-list-minimal.txt --sdk_root="$ANDROID_HOME" -test -d "${ANDROID_HOME}/build-tools/2
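The argument handling added to ubuntu_install_androidsdk.sh follows a common pattern: start from defaults, let positional arguments override them in order, and derive the NDK major version from the first component of the NDK version string. A rough Python rendering of that logic is shown below. The function name and dictionary keys are invented for illustration; only the default values and the argument order come from the diff.

```python
# Defaults as listed in the diff: NDK, CMake, build tools, platform.
DEFAULTS = ("21.3.6528147", "3.6.4111459", "27.0.3", "27")


def resolve_sdk_versions(args):
    """Overlay positional arguments on top of the default versions."""
    values = list(DEFAULTS)
    supplied = list(args)[: len(DEFAULTS)]
    values[: len(supplied)] = supplied

    ndk_version, cmake_version, build_tools_version, platform = values
    return {
        "ndk_version": ndk_version,
        # Same idea as `echo $1 | cut -d "." -f1` in the shell script.
        "ndk_major": ndk_version.split(".")[0],
        "cmake_version": cmake_version,
        "build_tools_version": build_tools_version,
        "android_platform": platform,
    }


if __name__ == "__main__":
    print(resolve_sdk_versions([]))
    print(resolve_sdk_versions(["25.2.9519653", "3.22.1", "33.0.2", "33"]))
```

Deriving the major version from the full version string keeps the two values from drifting apart, which was possible when Dockerfiles set ANDROID_NDK_HOME and ANDROID_NDK_MAJOR separately.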
[tvm] branch main updated (aa927df6b3 -> e7ad4bc071)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from aa927df6b3 [Metaschedule] EvolutionarySearchNode::State constructor typo fix (#14002)
     add e7ad4bc071 [CLML] Changes corresponding to OpenCL workspace refactorization (#13972)

No new revisions were added by this update.

Summary of changes:
 src/runtime/contrib/clml/clml_runtime.cc | 127 +--
 src/runtime/opencl/opencl_common.h | 2 +
 2 files changed, 70 insertions(+), 59 deletions(-)
[tvm] branch main updated (bea4919a75 -> 266ff51d2a)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from bea4919a75 [TVMScript] Encourage using T.Buffer directly (#13971)
     add 266ff51d2a [ONNX] Extend reduction types supported by ScatterND (#13946)

No new revisions were added by this update.

Summary of changes:
 include/tvm/relay/attrs/transform.h | 5 ++-
 python/tvm/relay/frontend/onnx.py | 53 ++-
 python/tvm/relay/op/transform.py | 8 +++-
 python/tvm/topi/cuda/scatter.py | 16 +++-
 python/tvm/topi/scatter.py | 28 +-
 tests/python/frontend/onnx/test_forward.py | 2 -
 tests/python/relay/test_op_level3.py | 19 --
 tests/python/topi/python/test_topi_scatter.py | 18 +++--
 8 files changed, 123 insertions(+), 26 deletions(-)
[tvm] branch main updated: [ONNX] Support Bitwise operations (#13888)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 583f5ae37c [ONNX] Support Bitwise operations (#13888) 583f5ae37c is described below commit 583f5ae37c3b34ac8e1ce36fccf22c1e7eb89ab1 Author: Valery Chernov AuthorDate: Fri Feb 3 14:30:55 2023 +0400 [ONNX] Support Bitwise operations (#13888) * add base class for bitwise operations. BitwiseAnd, BitwiseNot, BitwiseOr and BitwiseXor were implemented * add test for BitwiseAnd, BitwiseNot, BitwiseOr, BitwiseXor operations to ONNX front-end * add test of BitShift for ONNX front-end * fix dtype for test * skip test due to old version of ORT - Co-authored-by: Valery Chernov --- python/tvm/relay/frontend/onnx.py | 83 ++-- tests/python/frontend/onnx/test_forward.py | 150 + 2 files changed, 225 insertions(+), 8 deletions(-) diff --git a/python/tvm/relay/frontend/onnx.py b/python/tvm/relay/frontend/onnx.py index 8b4a0cc5e8..8de5e0e08b 100644 --- a/python/tvm/relay/frontend/onnx.py +++ b/python/tvm/relay/frontend/onnx.py @@ -5578,13 +5578,31 @@ class ConvInteger(OnnxOpConverter): ) -class BitShift(OnnxOpConverter): -"""Operator converter for NonZero""" +class BitwiseBase(OnnxOpConverter): +"""Base class of operator converter for Bitwise operations""" + +name = "" + +@classmethod +def check_inputs(cls, inputs, num=2, use_int=True): +assert len(inputs) == num, "{} takes {} inputs, {} given".format(cls.name, num, len(inputs)) + +valid_types = ["uint8", "uint16", "uint32", "uint64"] +if use_int: +valid_types += ["int8", "int16", "int32", "int64"] +for i in range(num): +in_dtype = infer_type(inputs[i]).checked_type.dtype +assert in_dtype in valid_types, "Wrong dtype of the {}-th input: {}".format(i, in_dtype) + + +class BitShift(BitwiseBase): +"""Operator converter for BitShift""" + +name = "BitShift" @classmethod def _impl_v11(cls, inputs, attr, 
params): -if len(inputs) != 2: -raise ValueError("Bitshift expects 2 inputs") +cls.check_inputs(inputs, use_int=False) direction = attr.get("direction", "LEFT").decode("ascii") if direction == "LEFT": @@ -5596,6 +5614,54 @@ class BitShift(OnnxOpConverter): return out +class BitwiseAnd(BitwiseBase): +"""Operator converter for BitwiseAnd""" + +name = "BitwiseAnd" + +@classmethod +def _impl_v18(cls, inputs, attr, params): +cls.check_inputs(inputs) + +return _op.bitwise_and(*inputs) + + +class BitwiseNot(BitwiseBase): +"""Operator converter for BitwiseNot""" + +name = "BitwiseNot" + +@classmethod +def _impl_v18(cls, inputs, attr, params): +cls.check_inputs(inputs, num=1) + +return _op.bitwise_not(*inputs) + + +class BitwiseOr(BitwiseBase): +"""Operator converter for BitwiseOr""" + +name = "BitwiseOr" + +@classmethod +def _impl_v18(cls, inputs, attr, params): +cls.check_inputs(inputs) + +return _op.bitwise_or(*inputs) + + +class BitwiseXor(BitwiseBase): +"""Operator converter for BitwiseXor""" + +name = "BitwiseXor" + +@classmethod +def _impl_v18(cls, inputs, attr, params): +cls.check_inputs(inputs) + +return _op.bitwise_xor(*inputs) + + class Unique(OnnxOpConverter): """Operator converter for unique""" @@ -6319,7 +6385,12 @@ def _get_convert_map(opset): "OptionalHasElement": OptionalHasElement.get_converter(opset), "OptionalGetElement": OptionalGetElement.get_converter(opset), "Affine": Affine.get_converter(opset), +# Bitwise operators "BitShift": BitShift.get_converter(opset), +"BitwiseAnd": BitwiseAnd.get_converter(opset), +"BitwiseNot": BitwiseNot.get_converter(opset), +"BitwiseOr": BitwiseOr.get_converter(opset), +"BitwiseXor": BitwiseXor.get_converter(opset), "ThresholdedRelu": ThresholdedRelu.get_converter(opset), "ScaledTanh": ScaledTanh.get_converter(opset), "ParametricSoftplus": ParametricSoftPlus.get_converter(opset), @@ -6337,10 +6408
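The input validation that `BitwiseBase.check_inputs` performs in this commit can be illustrated without TVM. The sketch below takes dtype strings directly instead of calling `infer_type` on Relay expressions, and raises exceptions where the original uses `assert`. The function name and module-level constants are invented for this example.

```python
UNSIGNED_TYPES = ["uint8", "uint16", "uint32", "uint64"]
SIGNED_TYPES = ["int8", "int16", "int32", "int64"]


def check_input_dtypes(op_name, dtypes, num=2, use_int=True):
    """Validate the input count and integer dtypes for a bitwise operator.

    use_int=False restricts inputs to unsigned types, which is how the
    commit treats BitShift.
    """
    if len(dtypes) != num:
        raise ValueError(f"{op_name} takes {num} inputs, {len(dtypes)} given")

    valid_types = list(UNSIGNED_TYPES)
    if use_int:
        valid_types += SIGNED_TYPES
    for i, dtype in enumerate(dtypes):
        if dtype not in valid_types:
            raise TypeError(f"Wrong dtype of the {i}-th input: {dtype}")


if __name__ == "__main__":
    check_input_dtypes("BitwiseAnd", ["int32", "int32"])
    check_input_dtypes("BitwiseNot", ["uint8"], num=1)
    try:
        check_input_dtypes("BitShift", ["int8", "uint8"], use_int=False)
    except TypeError as err:
        print("rejected:", err)
```

Putting the check in one shared base class lets each converter state only what differs: the number of inputs and whether signed types are allowed.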
[tvm] branch main updated: [CLML][CODEGEN] CLML native codegen utility (#13837)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d35a8ab135 [CLML][CODEGEN] CLML native codegen utility (#13837) d35a8ab135 is described below commit d35a8ab1353afc40317396b2ddfda8f35a99ba8a Author: Siva AuthorDate: Fri Feb 3 11:35:55 2023 +0530 [CLML][CODEGEN] CLML native codegen utility (#13837) * [CLML][CODEGEN] CLML native codegen utility This util generates native CLML code given a DNN model. It does import via tvmc, extracts clml_modules, get the json source and finally generates clml_models.cc that holds source for various sub graphs. cpp_clml tool has additional infrastructure to compile it as a standalong binary that runs these models. This PR adds symbol name to the generates json grpah. Also, extends const_loader interface to get constant params. * * review comments * * review * * review --- apps/cpp_clml/CMakeLists.txt | 61 ++ apps/cpp_clml/README.md| 145 apps/cpp_clml/clml_runner.cc | 818 + apps/cpp_clml/clml_runner.h| 262 +++ apps/cpp_clml/main.cc | 243 ++ apps/cpp_clml/scripts/clml_codegen.py | 64 ++ cmake/modules/contrib/CLML.cmake | 2 +- docker/Dockerfile.ci_adreno| 3 + python/tvm/relay/op/contrib/clml.py| 772 +++ .../backend/contrib/codegen_json/codegen_json.h| 1 + src/runtime/const_loader_module.cc | 10 + src/runtime/contrib/json/json_runtime.h| 3 + 12 files changed, 2383 insertions(+), 1 deletion(-) diff --git a/apps/cpp_clml/CMakeLists.txt b/apps/cpp_clml/CMakeLists.txt new file mode 100644 index 00..8c0fd53bf9 --- /dev/null +++ b/apps/cpp_clml/CMakeLists.txt @@ -0,0 +1,61 @@ +cmake_minimum_required(VERSION 3.13) + +project(clml_run VERSION 2.0) + +if(NOT DEFINED CMAKE_TOOLCHAIN_FILE) + message( FATAL_ERROR "CMAKE_TOOLCHAIN_FILE Not set, forcing exit. Suggested value: {ANDROID_NDK_PATH}/build/cmake/android.toolchain.cmake." 
) +endif(NOT DEFINED CMAKE_TOOLCHAIN_FILE) + +if(NOT DEFINED ANDROID_ABI) + message( FATAL_ERROR "ANDROID_ABI Not set, forcing exit. Suggested value(s): arm64-v8a (64), armeabi-v7a (32)" ) +endif(NOT DEFINED ANDROID_ABI) + +if(NOT DEFINED CLML_SDK) + message( FATAL_ERROR "CLML_SDK Not set, forcing exit." ) +endif(NOT DEFINED CLML_SDK) + +if (CMAKE_FIND_ROOT_PATH_MODE_LIBRARY STREQUAL "ONLY") + set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY BOTH) +endif() + +find_library(CLML_LIBRARIES NAMES libOpenCL.so NO_DEFAULT_PATH PATHS ${CLML_SDK}/lib ${CLML_SDK}/lib64) + +# CMake/Android variables +set( ANDROID_STL c++_static CACHE STRING "Target Android STL") # default + +# Source variables +set( OPENCL_INCLUDE_DIRS ${CLML_SDK} CACHE PATH "filepath to OpenCL headers") + +set(CMAKE_CXX_STANDARD 17) +set(CMAKE_CXX_STANDARD_REQUIRED True) + +#we do not want to pass -fno-exceptions +if(${CMAKE_CXX_FLAGS} MATCHES "-fno-exceptions") + message ( WARNING "Disabling -fno-exceptions") + string(REGEX REPLACE "-fno-exceptions" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS}) +endif() + +#we do not want to pass -fno-rtti +if(${CMAKE_CXX_FLAGS} MATCHES "-fno-rtti") + message ( WARNING "Disabling -fno-rtti") + string(REGEX REPLACE "-fno-rtti" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS}) +endif() + +set(COMMON_SOURCE_FILES +clml_models.cc +clml_runner.cc +clml_runner.h +main.cc +../../3rdparty/cnpy/cnpy.cpp +) + +include_directories( +src +${OPENCL_INCLUDE_DIRS} +"../../3rdparty/dmlc-core/include" +"../../3rdparty/cnpy/" +) + +add_executable(clml_run ${COMMON_SOURCE_FILES}) +target_link_options(clml_run PRIVATE -Wl,--unresolved-symbols=ignore-in-shared-libs) +target_link_libraries(clml_run ${CLML_LIBRARIES} z) diff --git a/apps/cpp_clml/README.md b/apps/cpp_clml/README.md new file mode 100644 index 00..3200492122 --- /dev/null +++ b/apps/cpp_clml/README.md @@ -0,0 +1,145 @@ + + + + + + + + + + + + + + + + + +# OpenCLML Debug Tool + +Tool to generate OpenCLML source file given a model from any framework and compile 
it as a native application that runs on Android target. +This tool helps to debug or triage OpenCLML offloaded sub graphs as a standalone application. + +### Codegen + +Models can be downloaded from well known frameworks like Tensorflow, PyTorch, TFLite, Onnx ..etc. +Assuming ```resnet50.h5``` is a Keras ResNet50 model file, use the below
[tvm] branch main updated: [TOPHUB] use keys as a keyword for searching of existing statistics (#13874)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new ea34e6eb0b [TOPHUB] use keys as a keyword for searching of existing statistics (#13874)
ea34e6eb0b is described below

commit ea34e6eb0bd47b397a6c29b18b5ff23ef88f4998
Author: Andrey Malyshev
AuthorDate: Thu Feb 2 16:43:05 2023 +0200

    [TOPHUB] use keys as a keyword for searching of existing statistics (#13874)

    * [TOPHUB] use keys as a keyword for searching of existing statistics

    In the case of ARM we might not specify -device, and then llvm will be
    used. Even so, we can determine the proper filename with stat since keys
    have the architecture defined. The same situation should apply to x86.

    * Add test on target not having arm_cpu device

    * minor fix, add comment

    * Fix pylint

    * Fix comment
---
 python/tvm/autotvm/tophub.py | 10 ++
 tests/python/unittest/test_autotvm_dispatch_context.py | 16
 2 files changed, 26 insertions(+)

diff --git a/python/tvm/autotvm/tophub.py b/python/tvm/autotvm/tophub.py
index f705d591e6..99dd312d87 100644
--- a/python/tvm/autotvm/tophub.py
+++ b/python/tvm/autotvm/tophub.py
@@ -106,10 +106,20 @@ def context(target, extra_files=None):
         if isinstance(tgt, str):
             tgt = Target(tgt)

+        # The TOPHUB file names rely on Target's device or kind. Both these types of
+        # information exist in Target.keys, but rules of filling this filed is not explicitly
+        # defined, we are afraid to rely only on Target.keys. At the same time Target.device
+        # is filled only if device was pointed explicitly in target string, that is not mandatory
+        # and in some cases we need to get information about device from Target.keys
+        # In priority order we verify:
+        # 1) Target.device
+        # 2) Target.keys
+        # 3) Target.kind
         possible_names = []
         device = tgt.attrs.get("device", "")
         if device != "":
             possible_names.append(_alias(device))
+        possible_names.extend(tgt.keys)
         possible_names.append(tgt.kind.name)

         all_packages = list(PACKAGE_VERSION.keys())
diff --git a/tests/python/unittest/test_autotvm_dispatch_context.py b/tests/python/unittest/test_autotvm_dispatch_context.py
index 6ca062047f..ba75992128 100644
--- a/tests/python/unittest/test_autotvm_dispatch_context.py
+++ b/tests/python/unittest/test_autotvm_dispatch_context.py
@@ -19,6 +19,7 @@ The dispatcher can choose which template to use according
 to the parameters of workload"""

 from tvm import autotvm
+import tvm


 @autotvm.template("testing/dispatch_fallback")
@@ -31,5 +32,20 @@ def test_fallback():
     simple_template(2, 3)


+def test_tophub_kinds_match():
+    def verify_arm_cpu(target):
+        best_by_targetkey = autotvm.tophub.context(target).best_by_targetkey
+        assert len(best_by_targetkey)
+        found_arm_cpu = False
+        for a, _ in best_by_targetkey:
+            if "arm_cpu" in a:
+                found_arm_cpu = True
+                break
+        assert found_arm_cpu
+
+    verify_arm_cpu("llvm -device=arm_cpu -mtriple=aarch64-linux-gnu -mattr=+neon,+v8.2a,+dotprod")
+    verify_arm_cpu("llvm -model=snapdragon835 -mtriple=arm64-linux-android -mattr=+neon")
+
+
 if __name__ == "__main__":
     test_fallback()
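The priority order described in the new comment (device first, then keys, then kind) can be sketched as a plain function. Everything here is illustrative: `candidate_package_names` and its `aliases` parameter are hypothetical stand-ins for the lookup inside `tophub.context` and its `_alias` helper, and the key lists in the usage lines are assumed values, not output from a real `tvm.target.Target`.

```python
def candidate_package_names(device, keys, kind, aliases=None):
    """List candidate TopHub package names, highest priority first.

    Order follows the commit's comment:
    1) explicit device, 2) target keys, 3) target kind.
    """
    aliases = aliases or {}
    names = []
    if device:
        # Only present when -device was given in the target string.
        names.append(aliases.get(device, device))
    names.extend(keys)
    names.append(kind)
    return names


if __name__ == "__main__":
    # Assumed key lists, for illustration only.
    print(candidate_package_names("arm_cpu", ["arm_cpu", "cpu"], "llvm"))
    print(candidate_package_names("", ["arm_cpu", "cpu"], "llvm"))
```

In the second call no device is given, yet "arm_cpu" still precedes "llvm" in the candidate list. That is the behaviour the new `test_tophub_kinds_match` test checks for a target string without `-device`.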
[tvm] branch main updated: [QNN][Relay][Topi] Add qnn.dense with weight layout (#13854)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 37e1a6862c [QNN][Relay][Topi] Add qnn.dense with weight layout (#13854) 37e1a6862c is described below commit 37e1a6862ca1bb77e33ca9c03e1365d50f468bd9 Author: ibsidorenko <98739392+ibsidore...@users.noreply.github.com> AuthorDate: Thu Feb 2 17:21:00 2023 +0300 [QNN][Relay][Topi] Add qnn.dense with weight layout (#13854) * [Hexagon][QNN] Improve performance of qnn.mul This commit imroves performance of qnn.mul operation without QNN canonicalization. * [QNN][Relay][Topi] Add qnn.dense with weight layout This commit adds new Relay operation "qnn.contrib_dense_pack" that supports different weights layout (nn.dense and qnn.dense do not support this attribute). This new operation is full analog of "nn.contrib_dense_pack" operation but in QNN space. --- python/tvm/relay/qnn/op/_qnn.py| 11 +- python/tvm/relay/qnn/op/legalizations.py | 134 - python/tvm/relay/qnn/op/qnn.py | 64 ++ python/tvm/relay/qnn/strategy/generic.py | 6 + python/tvm/relay/qnn/strategy/hexagon.py | 18 ++ python/tvm/topi/hexagon/qnn/__init__.py| 1 + .../hexagon/qnn/{__init__.py => dense_alter_op.py} | 26 +-- python/tvm/topi/hexagon/qnn/nn.py | 216 + python/tvm/topi/nn/qnn.py | 19 ++ src/relay/backend/te_compiler_cache.cc | 20 +- src/relay/op/nn/nn.h | 5 + src/relay/qnn/op/dense.cc | 105 +- .../contrib/test_arm_compute_lib/test_dense.py | 6 +- .../test_hexagon/test_wo_qnn_canonicalization.py | 172 +++- tests/python/relay/test_pass_qnn_legalize.py | 92 + 15 files changed, 779 insertions(+), 116 deletions(-) diff --git a/python/tvm/relay/qnn/op/_qnn.py b/python/tvm/relay/qnn/op/_qnn.py index c9c4c86e8b..e2157a051a 100644 --- a/python/tvm/relay/qnn/op/_qnn.py +++ b/python/tvm/relay/qnn/op/_qnn.py @@ -93,7 +93,16 @@ def alter_op_layout_qnn_conv2d(attrs, inputs, tinfos, 
out_type): # qnn.dense register_strategy("qnn.dense", strategy.qnn_dense_strategy) -register_pattern("qnn.dense", OpPattern.OUT_ELEMWISE_FUSABLE) + + +@register_alter_op_layout("qnn.dense") +def alter_op_layout_qnn_dense(attrs, inputs, tinfos, out_type): +"""Alternate the layout of qnn.dense""" +return topi.nn.qnn_dense_alter_layout(attrs, inputs, tinfos, out_type) + + +# qnn.contrib_dense_pack +register_strategy("qnn.contrib_dense_pack", strategy.qnn_dense_pack_strategy) # qnn.batch_matmul register_strategy("qnn.batch_matmul", strategy.qnn_batch_matmul_strategy) diff --git a/python/tvm/relay/qnn/op/legalizations.py b/python/tvm/relay/qnn/op/legalizations.py index ef368a016e..53cb41c2fb 100644 --- a/python/tvm/relay/qnn/op/legalizations.py +++ b/python/tvm/relay/qnn/op/legalizations.py @@ -340,6 +340,62 @@ def helper_change_dtypes_to_int8(attrs, inputs, types, relay_op): ) +def helper_change_dtypes_to_uint8(attrs, inputs, types, relay_op): +"""Helper function to change dtypes to uint8 x uint8. +Legalizes QNN dense op for Hexagon DSP. It supports fast u8 x u8 vrmpy instruction. + +Converting from int8 to uint8 can be done in following manner: + +Original equation + scale * (QA - zp_a) + scale * (QA + 128 - 128 - zp_a) + scale * ( (QA + 128) - (zp_a + 128)) + +Replacing QA + 128 with QA' and (zp_a + 128) with zp_a' +We get our new quantized uint8 tensor - scale * (QA' - zp_a') + +Parameters +-- +attrs : tvm.ir.Attrs +Attributes of current convolution +inputs : list of tvm.relay.Expr +The args of the Relay expr to be legalized +types : list of types +List of input and output types + +Returns +--- +result : tvm.relay.Expr +The legalized expr +""" +# Collect the dtypes. +data_dtype = types[0].dtype +kernel_dtype = types[1].dtype + +# Do nothing since it is already uint8. +if data_dtype == "uint8" and kernel_dtype == "uint8": +return None + +# Collect the input exprs. 
+data, kernel, input_zero_point, kernel_zero_point, input_scale, kernel_scale = inputs + +# Shift input if necessary. +if data_dtype == "int8": +# Compute (QA + 128) and (zp_a + 128) +data, input_zero_point = _shift(data, input_zero_point, "uint8") + +# Shift kernel if necessary. +if kernel_dtype == "int8": +# Compute (QA + 128) an
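The int8 to uint8 conversion described in the docstring above can be checked numerically. This is a minimal sketch in plain Python, not the `_shift` helper from the commit; the names `shift_to_uint8` and `dequantize` and the sample scale and zero point are assumptions chosen for illustration.

```python
def shift_to_uint8(q_int8, zero_point):
    """Shift int8 values and their zero point by 128 into the uint8 range.

    Hypothetical stand-in for the shift step, written for plain Python ints.
    """
    shifted = [q + 128 for q in q_int8]   # QA' = QA + 128
    return shifted, zero_point + 128      # zp_a' = zp_a + 128


def dequantize(q, zero_point, scale):
    """Real value represented by each quantized element: scale * (Q - zp)."""
    return [scale * (v - zero_point) for v in q]


scale, zp_a = 0.05, -3                    # illustrative values only
qa = [-128, -3, 0, 127]                   # full int8 range plus the zero point
qa_u8, zp_u8 = shift_to_uint8(qa, zp_a)

real_before = dequantize(qa, zp_a, scale)
real_after = dequantize(qa_u8, zp_u8, scale)
```

Because the same offset is added to both the tensor and its zero point, the difference `Q - zp` is unchanged, so the represented real values are identical before and after the shift.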
[tvm] branch main updated: [RUNTIME] Fix the manual determination of cores in FillDataForMeasure (#13849)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new f0ea9e461a [RUNTIME] Fix the manual determination of cores in FillDataForMeasure (#13849) f0ea9e461a is described below commit f0ea9e461a4bc056d320dafe651135c3a8357a34 Author: dsbarinov1 <71228944+dsbarin...@users.noreply.github.com> AuthorDate: Thu Feb 2 08:14:49 2023 +0300 [RUNTIME] Fix the manual determination of cores in FillDataForMeasure (#13849) * Assertion failed during tuning * Cleanup * Do not commit * Do not commit * Undo fix + provide test for multithread random filling * Random fill test with fix enabled * Isolate the effect of this test on the other tests * Correct the typo in the function name * Import threading + lint --- src/runtime/contrib/random/mt_random_engine.cc | 10 - tests/python/contrib/test_random.py| 28 ++ 2 files changed, 32 insertions(+), 6 deletions(-) diff --git a/src/runtime/contrib/random/mt_random_engine.cc b/src/runtime/contrib/random/mt_random_engine.cc index ac52594360..dc01114af0 100644 --- a/src/runtime/contrib/random/mt_random_engine.cc +++ b/src/runtime/contrib/random/mt_random_engine.cc @@ -192,12 +192,12 @@ class RandomEngine { struct ParallelTask { static int RunTask(int task_id, TVMParallelGroupEnv* penv, void* cdata) { ParallelTask* task = static_cast<ParallelTask*>(cdata); -task->Run(task_id); +task->Run(task_id, penv->num_task); return 0; } - void Run(int i) { -int64_t chunk_size = size / num_threads; + void Run(int i, int num_tasks) { +int64_t chunk_size = size / num_tasks; int64_t st = i * chunk_size; int64_t ed = std::min(st + chunk_size, size); self->FillDataImpl(data, st, ed, dtype); @@ -205,7 +205,6 @@ class RandomEngine { RandomEngine* self; void* data; - int num_threads; int64_t size; DLDataType dtype; }; @@ -220,8 +219,7 @@ class RandomEngine { } if (dtype.bits == 1 || dtype.bits == 4 ||
dtype.bits == 8 || dtype.bits == 16 || dtype.bits == 32 || dtype.bits == 64) { - int num_threads = task.num_threads = runtime::threading::MaxConcurrency(); - int res = TVMBackendParallelLaunch(ParallelTask::RunTask, &task, num_threads); + int res = TVMBackendParallelLaunch(ParallelTask::RunTask, &task, 0); ICHECK_EQ(res, 0) << "RandomFillForMeasure: TVMBackendParallelLaunch failed"; } else { LOG(FATAL) << "Doesn't support dtype code " << dtype.code << " dtype bits " << dtype.bits; diff --git a/tests/python/contrib/test_random.py b/tests/python/contrib/test_random.py index 7a52c0dbf1..ddc06b0711 100644 --- a/tests/python/contrib/test_random.py +++ b/tests/python/contrib/test_random.py @@ -20,6 +20,7 @@ import numpy as np from tvm.contrib import random from tvm import rpc import tvm.testing +import threading def test_randint(): @@ -155,8 +156,35 @@ def test_random_fill(): test_rpc(dtype) +def test_random_fill_mt(): +"""Check random filler applicability in case of nontrivial thread pool configuration. +Particularly when MaxConcurrency != num_workers_used_ which is actual for big-little systems. +""" +no_exception_happened = True + +def test_body(): +try: +num_thread_used = 1 +configure_threads = tvm.get_global_func("runtime.config_threadpool") +configure_threads(1, num_thread_used) + +test_input = tvm.runtime.ndarray.empty((10, 10)) +random_fill = tvm.get_global_func("tvm.contrib.random.random_fill_for_measure") +random_fill(test_input) +except: +nonlocal no_exception_happened +no_exception_happened = False + +# ThreadPool object is thread local. To eliminate effect on other test cases put it into thread +x = threading.Thread(target=test_body) +x.start() +x.join() +assert no_exception_happened + + if __name__ == "__main__": test_randint() test_uniform() test_normal() test_random_fill() +test_random_fill_mt()
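The chunk arithmetic in `ParallelTask::Run` above can be mirrored in a few lines of Python. This is only a sketch of the index math, under the assumption that each launched task fills one chunk; `chunk_bounds` and `covered` are hypothetical names, and the sketch does not model the TVM thread pool or claim to reproduce the assertion mentioned in the commit message.

```python
def chunk_bounds(task_id, num_tasks, size):
    """Half-open [start, end) range for one task, mirroring the diff's arithmetic."""
    chunk_size = size // num_tasks
    start = task_id * chunk_size
    end = min(start + chunk_size, size)
    return start, end


def covered(num_launched, num_assumed, size):
    """Count elements filled when `num_launched` tasks run but each one
    computes its chunk as if there were `num_assumed` tasks."""
    filled = set()
    for task_id in range(num_launched):
        start, end = chunk_bounds(task_id, num_assumed, size)
        filled.update(range(start, end))
    return len(filled)


# Task count taken from the launch itself: the chunks tile the buffer.
bounds = [chunk_bounds(i, 4, 100) for i in range(4)]

# Task count assumed from elsewhere (8) while only 2 tasks actually run.
partial = covered(num_launched=2, num_assumed=8, size=100)
```

Taking the divisor from the number of tasks actually launched, as the patched `Run(int i, int num_tasks)` does via `penv->num_task`, keeps the chunk count and the task count in agreement by construction.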
[tvm] branch main updated (c3fe08fb20 -> 7db77ada3d)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from c3fe08fb20 [ARITH] support floordiv in deduce bound (#13880) add 7db77ada3d [Build] Fix misleading error messages (#13887) No new revisions were added by this update. Summary of changes: cmake/utils/FindLLVM.cmake | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-)
[tvm] branch main updated (0d5baacc02 -> d8833bdbbe)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 0d5baacc02 [ONNX] Support SequenceErase op (#13865) add d8833bdbbe [ONNX] Support SequenceEmpty op (#13866) No new revisions were added by this update. Summary of changes: python/tvm/relay/frontend/onnx.py | 10 ++ tests/python/frontend/onnx/test_forward.py | 32 ++ 2 files changed, 42 insertions(+)
[tvm] branch main updated: [ONNX] Support SequenceErase op (#13865)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 0d5baacc02 [ONNX] Support SequenceErase op (#13865) 0d5baacc02 is described below commit 0d5baacc0241253547fe5235f95416a6467e712c Author: Valery Chernov AuthorDate: Tue Jan 31 11:06:02 2023 +0400 [ONNX] Support SequenceErase op (#13865) * SequenceErase was implemented in ONNX front-end * add SequenceErase node to Sequence test * remark from reviewer. fix negative position recalculation * add assert - Co-authored-by: Valery Chernov --- python/tvm/relay/frontend/onnx.py | 42 ++ tests/python/frontend/onnx/test_forward.py | 10 ++- 2 files changed, 46 insertions(+), 6 deletions(-) diff --git a/python/tvm/relay/frontend/onnx.py b/python/tvm/relay/frontend/onnx.py index 6e0c7cc2dd..93429a8638 100644 --- a/python/tvm/relay/frontend/onnx.py +++ b/python/tvm/relay/frontend/onnx.py @@ -6148,13 +6148,35 @@ class SequenceConstruct(OnnxOpConverter): return _expr.Tuple(inputs) -class SequenceLength(OnnxOpConverter): -"""Operator converter for sequence length op.""" +class SequenceErase(OnnxOpConverter): +"""Operator converter for sequence erase op.""" @classmethod def _impl_v11(cls, inputs, attr, params): -# Get length of input sequence -return _expr.const(len(inputs[0]), dtype="int64") +# Erase tensor from sequence on specified position +input_sequence = inputs[0] + +if len(inputs) == 2: +position = inputs[1] +# Non constant position is not supported. 
+if isinstance(position, _expr.Constant): +position = position.data.numpy() +elif position.name_hint in params: +position = params[position.name_hint].numpy() +else: +raise NotImplementedError("Position must be a constant.") +else: +position = -1 + +seq_len = len(input_sequence) +assert -seq_len <= position < seq_len, "Position is out of bounds" + +if position < 0: +position = seq_len + position +# Convert sequence to a list, insert tensors before erased, and repackage as Tuple. +tensor_list = [input_sequence[i] for i in range(seq_len) if i != position] +# Create new tuple and return. +return _expr.Tuple(tensor_list) class SequenceInsert(OnnxOpConverter): @@ -6188,6 +6210,15 @@ class SequenceInsert(OnnxOpConverter): return _expr.Tuple(tensor_list) +class SequenceLength(OnnxOpConverter): +"""Operator converter for sequence length op.""" + +@classmethod +def _impl_v11(cls, inputs, attr, params): +# Get length of input sequence +return _expr.const(len(inputs[0]), dtype="int64") + + class ConcatFromSequence(OnnxOpConverter): """Operator converter for sequence concatenation op.""" @@ -6492,8 +6523,9 @@ def _get_convert_map(opset): "LinearRegressor": LinearRegressor.get_converter(opset), # Sequence operators "SequenceConstruct": SequenceConstruct.get_converter(opset), -"SequenceLength": SequenceLength.get_converter(opset), +"SequenceErase": SequenceErase.get_converter(opset), "SequenceInsert": SequenceInsert.get_converter(opset), +"SequenceLength": SequenceLength.get_converter(opset), "ConcatFromSequence": ConcatFromSequence.get_converter(opset), "SplitToSequence": SplitToSequence.get_converter(opset), "SequenceAt": SequenceAt.get_converter(opset), diff --git a/tests/python/frontend/onnx/test_forward.py b/tests/python/frontend/onnx/test_forward.py index 6a780a632f..3e1af40867 100644 --- a/tests/python/frontend/onnx/test_forward.py +++ b/tests/python/frontend/onnx/test_forward.py @@ -7747,10 +7747,17 @@ def test_sequence(target, dev): outputs=["inserted_sequence"], ) +# 
Test sequence erase. +erase_node = helper.make_node( +"SequenceErase", +inputs=["inserted_sequence", "position"], +outputs=["erased_sequence"], +) + # Test sequence concatenation. concat_node = helper.make_node( "ConcatFromSequence", -inputs=["inserted_sequence"], +inputs=["erased_sequence"], outputs=["concat_sequence"], axis=axis, ) @@ -7796,6 +7803,7 @@ def test_sequence(target, dev): position_node, construct_node, insert_node, +erase_node, concat_node, split_node, at_node,
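The position handling in the `SequenceErase` converter above can be sketched on an ordinary Python list. This is an illustration of the index logic only; `erase_from_sequence` is a hypothetical name, and Relay tuples and constant extraction are deliberately left out.

```python
def erase_from_sequence(sequence, position=-1):
    """Return a copy of `sequence` without the element at `position`.

    A missing position defaults to -1 (the last element), as in the converter.
    """
    seq_len = len(sequence)
    assert -seq_len <= position < seq_len, "Position is out of bounds"

    # Negative positions count from the end of the sequence.
    if position < 0:
        position = seq_len + position

    # Keep every element except the one at the normalized position.
    return [item for i, item in enumerate(sequence) if i != position]


tensors = ["t0", "t1", "t2"]
without_last = erase_from_sequence(tensors)
without_first = erase_from_sequence(tensors, 0)
without_first_negative = erase_from_sequence(tensors, -3)
```

Normalizing the negative index before filtering means a single comparison `i != position` covers both positive and negative positions.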
[tvm] branch main updated: [ONNX] Support SequenceLength op (#13863)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new e6262bf985 [ONNX] Support SequenceLength op (#13863) e6262bf985 is described below commit e6262bf9855a0c6f546f097910b48f955f2749cf Author: Valery Chernov AuthorDate: Mon Jan 30 13:29:04 2023 +0400 [ONNX] Support SequenceLength op (#13863) * add SequenceLength op * add SequenceLength test * graph fix - Co-authored-by: Valery Chernov --- python/tvm/relay/frontend/onnx.py | 10 ++ tests/python/frontend/onnx/test_forward.py | 21 +++-- 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/python/tvm/relay/frontend/onnx.py b/python/tvm/relay/frontend/onnx.py index 7b35d4a481..6e0c7cc2dd 100644 --- a/python/tvm/relay/frontend/onnx.py +++ b/python/tvm/relay/frontend/onnx.py @@ -6148,6 +6148,15 @@ class SequenceConstruct(OnnxOpConverter): return _expr.Tuple(inputs) +class SequenceLength(OnnxOpConverter): +"""Operator converter for sequence length op.""" + +@classmethod +def _impl_v11(cls, inputs, attr, params): +# Get length of input sequence +return _expr.const(len(inputs[0]), dtype="int64") + + class SequenceInsert(OnnxOpConverter): """Operator converter for sequence insert op.""" @@ -6483,6 +6492,7 @@ def _get_convert_map(opset): "LinearRegressor": LinearRegressor.get_converter(opset), # Sequence operators "SequenceConstruct": SequenceConstruct.get_converter(opset), +"SequenceLength": SequenceLength.get_converter(opset), "SequenceInsert": SequenceInsert.get_converter(opset), "ConcatFromSequence": ConcatFromSequence.get_converter(opset), "SplitToSequence": SplitToSequence.get_converter(opset), diff --git a/tests/python/frontend/onnx/test_forward.py b/tests/python/frontend/onnx/test_forward.py index 4b17cfbbb3..6a780a632f 100644 --- a/tests/python/frontend/onnx/test_forward.py +++ b/tests/python/frontend/onnx/test_forward.py @@ 
-7760,10 +7760,16 @@ def test_sequence(target, dev): "SplitToSequence", inputs=["concat_sequence"], outputs=["split_sequence"], axis=axis ) +# Test tensor extraction from sequence at_node = helper.make_node( "SequenceAt", inputs=["split_sequence", "position"], outputs=["output"] ) +# Test sequence length +length_node = helper.make_node( +"SequenceLength", inputs=["split_sequence"], outputs=["output_2"] +) + if new_axis is not None: new_axis_attr = helper.make_attribute("new_axis", new_axis) concat_node.attribute.append(new_axis_attr) @@ -7781,9 +7787,20 @@ def test_sequence(target, dev): output_shape[axis] = num_tensors + 1 else: output_shape[axis] = (num_tensors + 1) * output_shape[axis] -graph_outputs = [helper.make_tensor_value_info("output", TensorProto.FLOAT, output_shape)] +graph_outputs = [ +helper.make_tensor_value_info("output", TensorProto.FLOAT, output_shape), +helper.make_tensor_value_info("output_2", TensorProto.INT64, []), +] -graph_nodes = [position_node, construct_node, insert_node, concat_node, split_node, at_node] +graph_nodes = [ +position_node, +construct_node, +insert_node, +concat_node, +split_node, +at_node, +length_node, +] graph = helper.make_graph( graph_nodes,
[tvm] branch main updated (10d6c17238 -> 3c81d9bbac)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 10d6c17238 [WASM][FIX] test tests/node/websock_rpc_test.py (#13862) add 3c81d9bbac [RUNTIME][CLML] OpenCLML tuning and profiling enhanced (#13843) No new revisions were added by this update. Summary of changes: src/runtime/contrib/clml/clml_runtime.cc | 143 --- src/runtime/opencl/opencl_common.h | 40 - 2 files changed, 111 insertions(+), 72 deletions(-)
[tvm] branch main updated (1d89071863 -> 1f40b925a5)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 1d89071863 [TVMScript] More concise `T.allocate` syntax printing (#13830) add 1f40b925a5 [TOOL][NATIVE] Android native application for deploy and run (#13791) No new revisions were added by this update. Summary of changes: .gitmodules| 3 + 3rdparty/cnpy | 1 + CMakeLists.txt | 4 + LICENSE| 1 + apps/cpp_rtvm/CMakeLists.txt | 98 ++ apps/cpp_rtvm/README.md| 354 + apps/cpp_rtvm/main.cc | 264 +++ .../cpp_rtvm/scripts/download_models.py| 28 +- apps/cpp_rtvm/tvm_runner.cc| 320 +++ apps/cpp_rtvm/tvm_runner.h | 93 ++ cmake/config.cmake | 3 + cmake/modules/LibInfo.cmake| 1 + python/tvm/driver/tvmc/composite_target.py | 5 + python/tvm/relay/op/contrib/clml.py| 2 +- src/auto_scheduler/search_task.cc | 8 + src/runtime/graph_executor/graph_executor.cc | 44 ++- src/runtime/graph_executor/graph_executor.h| 6 + src/support/libinfo.cc | 5 + tests/scripts/task_build_adreno_bins.sh| 3 +- 19 files changed, 1227 insertions(+), 16 deletions(-) create mode 16 3rdparty/cnpy create mode 100644 apps/cpp_rtvm/CMakeLists.txt create mode 100644 apps/cpp_rtvm/README.md create mode 100644 apps/cpp_rtvm/main.cc copy ci/scripts/jenkins/http_utils.py => apps/cpp_rtvm/scripts/download_models.py (60%) create mode 100644 apps/cpp_rtvm/tvm_runner.cc create mode 100644 apps/cpp_rtvm/tvm_runner.h
[tvm] branch main updated: [Adreno] Update interface of AnnotateMemoryScope pass (#13779)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 6e01f3d855 [Adreno] Update interface of AnnotateMemoryScope pass (#13779) 6e01f3d855 is described below commit 6e01f3d85581043593dd9b65cb5718c3e7386a81 Author: Egor Churaev AuthorDate: Thu Jan 19 17:38:33 2023 +0300 [Adreno] Update interface of AnnotateMemoryScope pass (#13779) Removed redundant and unused parameter from AnnotateMemoryScope pass. --- include/tvm/relay/transform.h| 2 +- src/relay/backend/build_module.cc| 2 +- src/relay/transforms/annotate_texture_storage.cc | 8 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/tvm/relay/transform.h b/include/tvm/relay/transform.h index 3227f7979d..43a0f89d95 100644 --- a/include/tvm/relay/transform.h +++ b/include/tvm/relay/transform.h @@ -584,7 +584,7 @@ TVM_DLL Pass CapturePostDfsIndexInSpans(); * \brief Calls device dependent memory scope analysis pass, collects mapping of desirable * expr->memory_scope and annotates expressions by VirtualDevice with required memory_scope */ -TVM_DLL Pass AnnotateMemoryScope(CompilationConfig config); +TVM_DLL Pass AnnotateMemoryScope(); /*! * \brief Removes non-fused reshapes after lowering the graph. 
diff --git a/src/relay/backend/build_module.cc b/src/relay/backend/build_module.cc index bca524794a..0642c0c672 100644 --- a/src/relay/backend/build_module.cc +++ b/src/relay/backend/build_module.cc @@ -396,7 +396,7 @@ class RelayBuildModule : public runtime::ModuleNode { relay_module = transform::Inline()(relay_module); relay_module = transform::InferType()(relay_module); relay_module = transform::LabelOps()(relay_module); -relay_module = transform::AnnotateMemoryScope(config_)(relay_module); +relay_module = transform::AnnotateMemoryScope()(relay_module); ICHECK(relay_module.defined()); diff --git a/src/relay/transforms/annotate_texture_storage.cc b/src/relay/transforms/annotate_texture_storage.cc index 9dbd631ad3..39f065ea8c 100644 --- a/src/relay/transforms/annotate_texture_storage.cc +++ b/src/relay/transforms/annotate_texture_storage.cc @@ -645,7 +645,7 @@ Map>> CollectStorageInfo(const Expr& expr) { return storage_info; } -Expr AnnotateMemoryScopeExpr(const Expr& expr, const IRModule& mod, CompilationConfig config) { +Expr AnnotateMemoryScopeExpr(const Expr& expr, const IRModule& mod) { auto storage_scope = CollectStorageInfo(expr); if (storage_scope.size()) { return RewriteVDStorageScopes(storage_scope).Rewrite(expr); @@ -655,10 +655,10 @@ Expr AnnotateMemoryScopeExpr(const Expr& expr, const IRModule& mod, CompilationC } namespace transform { -tvm::transform::Pass AnnotateMemoryScope(CompilationConfig config) { +tvm::transform::Pass AnnotateMemoryScope() { runtime::TypedPackedFunc pass_func = - [config = std::move(config)](Function f, IRModule m, PassContext pc) { -return Downcast(AnnotateMemoryScopeExpr(f, m, config)); + [](Function f, IRModule m, PassContext pc) { +return Downcast(AnnotateMemoryScopeExpr(f, m)); }; return CreateFunctionPass(pass_func, 2, "AnnotateMemoryScope", {}); }
[tvm] branch main updated: [FIX] Minor Compilation Warning Fixes (#13794)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 3b49f7973d [FIX] Minor Compilation Warning Fixes (#13794) 3b49f7973d is described below commit 3b49f7973d497a3087004c3487c3e89b8d9e6c2f Author: Siyuan Feng AuthorDate: Tue Jan 17 20:24:30 2023 +0800 [FIX] Minor Compilation Warning Fixes (#13794) This PR fixes some warnings from the latest clang compiler. --- include/tvm/tir/function.h| 8 src/tir/analysis/control_flow_graph.h | 7 --- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/tvm/tir/function.h b/include/tvm/tir/function.h index cf92f97360..9f7c0fa16b 100644 --- a/include/tvm/tir/function.h +++ b/include/tvm/tir/function.h @@ -224,14 +224,14 @@ class TensorIntrin : public ObjectRef { TVM_DEFINE_OBJECT_REF_METHODS(TensorIntrin, ObjectRef, TensorIntrinNode) }; -/* +/*! * \brief Specialize parameters of PrimFunc. * \param func The PrimFunc to be specialized. * \param param_map The mapping from function params to the instance. * \return The new function with parameter specialized. * \note We can define a Meta TIR function with symbolic shape: * - * \code + * \code{.py} * @T.prim_func * def mem_copy(a: T.handle, b: T.handle, m: T.int32, n: T.int32) -> None: * A = T.match_buffer(a, (m, n), "float32") @@ -244,14 +244,14 @@ class TensorIntrin : public ObjectRef { * * Then we can make it specialized with given shapes or buffers. 
* - * \code + * \code{.py} * a, _, m, n = mem_copy.params * func = mem_copy.specialize({a: tir.decl_buffer((16, 16))}) * # or * func = mem_copy.specialize({n: 16, m: 16}) * \endcode * - * \code {.language-id} + * \code{.py} * @T.prim_func * def mem_copy_16_16(a: T.handle, b: T.handle) -> None: * A = T.match_buffer(a, (16, 16), "float32") diff --git a/src/tir/analysis/control_flow_graph.h b/src/tir/analysis/control_flow_graph.h index 590392cf65..00a6b68ff9 100644 --- a/src/tir/analysis/control_flow_graph.h +++ b/src/tir/analysis/control_flow_graph.h @@ -292,7 +292,8 @@ class BufferState { std::vector constraints_; }; -/*! \brief Represents the flow of control through a `tir::Stmt` +/*! + * \brief Represents the flow of control through a `tir::Stmt` * * This class contains an internal representation of the possible * control flow that may occur during execution of a `tir::Stmt`. It @@ -312,7 +313,7 @@ class BufferState { * * For example, consider the following PrimFunc * - * ```python + * \code{.py} * @T.prim_func * def func(T.Buffer[16, "float32"]): * for i in T.serial(16): @@ -320,7 +321,7 @@ class BufferState { * B[i] = i * else: * B[i] = i-8 - * ``` + * \endcode * * The control flow graph would have eight control blocks. *
[tvm] branch main updated (287597b45d -> f9759920e0)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 287597b45d [CI] Update ci_minimal docker image to cross-compile TVM to aarch64 (#13776) add f9759920e0 [UnitTest] Parametrized test_arith_iter_affine_map::test_padding (#13774) No new revisions were added by this update. Summary of changes: .../python/unittest/test_arith_iter_affine_map.py | 160 + 1 file changed, 70 insertions(+), 90 deletions(-)
[tvm] branch main updated (d6507b256f -> 8551a5c71f)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from d6507b256f [TIR] Create Layout with specified axis dtype (#13663) add 8551a5c71f [MetaSchedule] Add "disabled_pass" option in tuning API (#13659) No new revisions were added by this update. Summary of changes: python/tvm/meta_schedule/relay_integration.py | 17 +++- .../test_meta_schedule_relay_integration.py| 45 ++ 2 files changed, 60 insertions(+), 2 deletions(-)
[tvm] branch main updated: [RPC] Add fail-guard for termination time exception (#13651)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 209845fb91 [RPC] Add fail-guard for termination time exception (#13651) 209845fb91 is described below commit 209845fb910aac76f1388d868bb4fe4a46d9170f Author: Tianqi Chen AuthorDate: Thu Dec 22 14:59:45 2022 -0500 [RPC] Add fail-guard for termination time exception (#13651) This PR adds fail-guard to reduce error messages thrown during process termination time. Such error won't trigger test error but will bring extra message during exit time. --- python/tvm/rpc/proxy.py | 5 - python/tvm/rpc/server.py | 5 - python/tvm/rpc/server_ios_launcher.py | 5 - 3 files changed, 12 insertions(+), 3 deletions(-) diff --git a/python/tvm/rpc/proxy.py b/python/tvm/rpc/proxy.py index 4c3144e7b5..d7027c88a4 100644 --- a/python/tvm/rpc/proxy.py +++ b/python/tvm/rpc/proxy.py @@ -643,7 +643,10 @@ class Proxy(object): self.proc = None def __del__(self): -self.terminate() +try: +self.terminate() +except ImportError: +pass def websocket_proxy_server(url, key=""): diff --git a/python/tvm/rpc/server.py b/python/tvm/rpc/server.py index 7932e98aa2..a144356f2e 100644 --- a/python/tvm/rpc/server.py +++ b/python/tvm/rpc/server.py @@ -499,4 +499,7 @@ class Server(object): self.proc = None def __del__(self): -self.terminate() +try: +self.terminate() +except ImportError: +pass diff --git a/python/tvm/rpc/server_ios_launcher.py b/python/tvm/rpc/server_ios_launcher.py index 2e31586f64..06ead1c3d1 100644 --- a/python/tvm/rpc/server_ios_launcher.py +++ b/python/tvm/rpc/server_ios_launcher.py @@ -335,7 +335,10 @@ class ServerIOSLauncher: print(e) def __del__(self): -self.terminate() +try: +self.terminate() +except ImportError: +pass @staticmethod def is_compatible_environment():
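The fail-guard pattern applied to the three `__del__` methods above can be shown in isolation. This is a minimal sketch with a hypothetical `GuardedServer` class; the injected `terminate_impl` callable and the error text are assumptions used to simulate a failure at interpreter shutdown, not code from `tvm.rpc`.

```python
class GuardedServer:
    """Toy object whose destructor tolerates ImportError during cleanup."""

    def __init__(self, terminate_impl):
        self._terminate_impl = terminate_impl  # hypothetical cleanup hook
        self.terminated = False

    def terminate(self):
        self._terminate_impl()
        self.terminated = True

    def __del__(self):
        # At process exit, modules may already be torn down, so cleanup code
        # that imports lazily can raise ImportError. Swallow only that case.
        try:
            self.terminate()
        except ImportError:
            pass


def failing_terminate():
    raise ImportError("simulated: import machinery unavailable at shutdown")


broken = GuardedServer(failing_terminate)
broken.__del__()          # does not propagate the ImportError

healthy = GuardedServer(lambda: None)
healthy.__del__()         # normal cleanup still runs
```

Catching only `ImportError` keeps other failures in `terminate()` visible, which matches the narrow guard used in the diff.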
[tvm] branch main updated: [LLVM] Fix get tm allow_missing check pos (#13591)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new ae07437a32 [LLVM] Fix get tm allow_missing check pos (#13591) ae07437a32 is described below commit ae07437a32c8addadfec4002426e03a6b2bc8781 Author: wrongtest AuthorDate: Mon Dec 12 18:53:15 2022 +0800 [LLVM] Fix get tm allow_missing check pos (#13591) Fix get tm allow_missing check pos --- src/target/llvm/llvm_instance.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/target/llvm/llvm_instance.cc b/src/target/llvm/llvm_instance.cc index 44454fc6b9..2aa190ad70 100644 --- a/src/target/llvm/llvm_instance.cc +++ b/src/target/llvm/llvm_instance.cc @@ -297,9 +297,9 @@ llvm::TargetMachine* LLVMTargetInfo::GetOrCreateTargetMachine(bool allow_missing llvm_instance->createTargetMachine(triple_, cpu_, GetTargetFeatureString(), target_options_, reloc_model_, code_model_, opt_level_); target_machine_ = std::unique_ptr<llvm::TargetMachine>(tm); -if (!allow_missing) { - ICHECK(target_machine_ != nullptr) << error; -} + } + if (!allow_missing) { +ICHECK(target_machine_ != nullptr) << error; } return target_machine_.get(); }
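The effect of moving the `allow_missing` check out of the enclosing block can be modeled in Python. This is a simplified sketch, not the LLVM code path: `TargetInfo`, its `lookup` callable, and the early return for a cached machine are all assumptions made for illustration.

```python
class TargetInfo:
    """Toy get-or-create cache with an optional 'must exist' check."""

    def __init__(self, lookup):
        self._lookup = lookup      # hypothetical: returns a factory or None
        self._machine = None

    def get_or_create(self, allow_missing=False):
        if self._machine is not None:
            return self._machine

        factory = self._lookup()
        if factory is not None:
            self._machine = factory()

        # The check sits outside the branch above, so a failed lookup is
        # reported too, not only a failed creation.
        if not allow_missing and self._machine is None:
            raise RuntimeError("target machine is not available")
        return self._machine


missing = TargetInfo(lookup=lambda: None)
tolerated = missing.get_or_create(allow_missing=True)

try:
    missing.get_or_create(allow_missing=False)
    raised = False
except RuntimeError:
    raised = True

present = TargetInfo(lookup=lambda: (lambda: "machine"))
machine = present.get_or_create()
```

With the check nested inside the `if factory is not None` branch, the failed-lookup case would return `None` silently even when the caller asked for a hard failure.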
[tvm] branch main updated: [Adreno] Add global pooling schedule (#13573)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new f4cfcafba5 [Adreno] Add global pooling schedule (#13573) f4cfcafba5 is described below commit f4cfcafba5024d9bbef1b8bf422c6a25368837f3 Author: Andrey Malyshev AuthorDate: Thu Dec 8 12:29:00 2022 +0200 [Adreno] Add global pooling schedule (#13573) * [Adreno] Add global pooling schedule The parallelizm opportuninties in case of global pooling are limited by number of channels, need to change schedule to have parallelizm by reduction axis/use rfactor * address pylint hits * address PR comments * switch spatial axis to blk binding --- python/tvm/relay/op/strategy/adreno.py | 7 ++ python/tvm/topi/adreno/pooling.py | 107 .../relay/opencl_texture/test_pool_texture.py | 135 + 3 files changed, 249 insertions(+) diff --git a/python/tvm/relay/op/strategy/adreno.py b/python/tvm/relay/op/strategy/adreno.py index 21252215fc..b606ab05d7 100644 --- a/python/tvm/relay/op/strategy/adreno.py +++ b/python/tvm/relay/op/strategy/adreno.py @@ -215,6 +215,13 @@ def schedule_reduce_adreno(attrs, outs, target): return topi.adreno.schedule_reduce(outs) +@schedule_adaptive_pool.register(["adreno"]) +def schedule_adaptive_pool_adreno(attrs, outs, target): +"""schedule adaptive pooling ops for adreno""" +with target: +return topi.adreno.schedule_adaptive_pool(outs, attrs.layout) + + @concatenate_strategy.register(["adreno"]) def concatenate_strategy_adreno(attrs, inputs, out_type, target): strategy = _op.OpStrategy() diff --git a/python/tvm/topi/adreno/pooling.py b/python/tvm/topi/adreno/pooling.py index 49f103c04a..f02af0c01f 100644 --- a/python/tvm/topi/adreno/pooling.py +++ b/python/tvm/topi/adreno/pooling.py @@ -19,6 +19,113 @@ import tvm from tvm import te from .. 
import tag +from .utils import get_div + + +def schedule_adaptive_pool(outs, layout="NCHW"): +"""Schedule for adaptive_pool. + +Parameters +-- +outs: Array of Tensor +The computation graph description of adaptive_pool +in the format of an array of tensors. + +Returns +--- +s: Schedule +The computation schedule for adaptive_pool. +""" +outs = [outs] if isinstance(outs, te.tensor.Tensor) else outs +s = te.create_schedule([x.op for x in outs]) + +def _schedule_global(Pool, layout): +# examples of latest pool op is global max pool and non latest is global avg pooling +# OL - an Expr will be used for rfactor +# Out - programming of the parallelizm on the global level +# shared is not required, local could be enough but shared scope gives quite significant +# perf boost +if Pool.op in s.outputs: +Out = Pool +OL = s.cache_write(Pool, "shared") +else: +Out = outs[0].op.output(0) +s[Pool].set_scope("shared") +OL = Pool + +PaddedInput = Pool.op.input_tensors[0] + +# detect axis for later reorder and binding of batch/channel to blocks and +# spatial to threads +if layout in ("NCHW", "NCHW4c"): +channel_index = 1 +height_index = 2 +width_index = 3 +else: +channel_index = 3 +height_index = 1 +width_index = 2 + +if isinstance(PaddedInput.op, tvm.te.ComputeOp): +s[PaddedInput].compute_inline() + +fused_reduce = s[OL].fuse(*s[OL].op.reduce_axis) + +spatial = PaddedInput.shape[height_index].value * PaddedInput.shape[width_index].value +# below values were selected empirically assuming that we should have some work in each +# thread (currently from 25-49) and number of threads not exceeding some threshold that +# was selected as 256 from performance point of view after experiments on Adreno 660 +max_threads = spatial // 25 if spatial > 25 else 1 +max_threads = 256 if max_threads > 256 else max_threads +num_thread = get_div(spatial, max_threads) + +thread_y = te.thread_axis((0, num_thread), "threadIdx.y") + +_, ki = s[OL].split(fused_reduce, factor=num_thread) +data_out_rf = 
s.rfactor(OL, ki) +s[data_out_rf].compute_at(s[OL], s[OL].op.reduce_axis[0]) +s[OL].bind(s[OL].op.reduce_axis[0], thread_y) + +naxis = s[Out].op.axis[0] +caxis = s[Out].op.axis[channel_index] +haxis = s[Out].op.axis[height_index] +waxis = s[Out].op.axis[width_index] + +if l
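The thread-count heuristic in `_schedule_global` above can be evaluated for a few input sizes. This is a sketch under stated assumptions: `largest_divisor_at_most` is a hypothetical stand-in for what `get_div` is assumed to do (return the largest divisor of the value that does not exceed the starting bound), and `pick_num_thread` is an illustrative wrapper, not a TVM function.

```python
def largest_divisor_at_most(value, start):
    """Largest d <= start that divides value (assumed behaviour of get_div)."""
    for d in range(start, 0, -1):
        if value % d == 0:
            return d
    return 1


def pick_num_thread(height, width):
    """Thread count for the reduction axis, following the constants in the diff."""
    spatial = height * width
    # Aim for roughly 25 or more elements of work per thread...
    max_threads = spatial // 25 if spatial > 25 else 1
    # ...and never exceed 256 threads.
    max_threads = 256 if max_threads > 256 else max_threads
    # The split factor must divide the spatial extent evenly.
    return largest_divisor_at_most(spatial, max_threads)


small = pick_num_thread(7, 7)        # 49 elements
medium = pick_num_thread(14, 14)     # 196 elements
large = pick_num_thread(112, 112)    # 12544 elements
```

Choosing a divisor of the spatial extent keeps the `split` on the fused reduction axis exact, so no tail handling is needed in the schedule.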
[tvm] branch main updated (22ff38dff8 -> 6069292ff9)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 22ff38dff8 [docs] Make building the cpu-only docs build explicit (#13315) add 6069292ff9 [OpenCL] Introduction of weights on buffers (#13563) No new revisions were added by this update. Summary of changes: python/tvm/topi/adreno/conv2d_nchw.py | 5 +- python/tvm/topi/adreno/conv2d_nhwc.py | 5 +- python/tvm/topi/adreno/conv2d_winograd_common.py | 2 + python/tvm/topi/adreno/depthwise_conv2d_nchw.py| 5 +- python/tvm/topi/adreno/depthwise_conv2d_nhwc.py| 5 +- src/relay/transforms/annotate_texture_storage.cc | 78 +- .../opencl_texture/test_conv2d_nchw_texture.py | 20 +++--- 7 files changed, 86 insertions(+), 34 deletions(-)
[tvm] branch main updated (95d2e9fa35 -> 57de9e7f3d)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git from 95d2e9fa35 [ARM] Add dynamic matvec support (#13502) add 57de9e7f3d Revert "[microTVM] enable building microTVM components by default" (#13503) No new revisions were added by this update. Summary of changes: CMakeLists.txt | 7 +-- cmake/config.cmake | 3 +++ conda/recipe/build.sh| 3 --- conda/recipe/meta.yaml | 3 --- tests/scripts/task_config_build_arm.sh | 1 + tests/scripts/task_config_build_cortexm.sh | 1 + tests/scripts/task_config_build_cpu.sh | 1 + tests/scripts/task_config_build_gpu.sh | 1 + tests/scripts/task_config_build_gpu_other.sh | 1 + tests/scripts/task_config_build_hexagon.sh | 2 ++ tests/scripts/task_config_build_i386.sh | 1 + tests/scripts/task_config_build_minimal.sh | 1 + tests/scripts/task_config_build_riscv.sh | 1 + tests/scripts/task_config_build_wasm.sh | 1 + 14 files changed, 15 insertions(+), 12 deletions(-)
[tvm] 01/01: [docs][Adreno] Remove unnecessary compilation flag
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch echuraev/remove_use_micro_from_doc in repository https://gitbox.apache.org/repos/asf/tvm.git commit 6f0ed41828c47f15b79b4a9461f45c14d76b4189 Author: Egor Churaev AuthorDate: Tue Nov 29 08:17:32 2022 +0300 [docs][Adreno] Remove unnecessary compilation flag Flag `-DUSE_MICRO=OFF` was added to avoid this issue #13482. In #13503 these changes were reverted. Removed unnecessary compilation flag to avoid user confusing. --- docs/how_to/deploy/adreno.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst index af613aa5cb..7f4616fbf7 100644 --- a/docs/how_to/deploy/adreno.rst +++ b/docs/how_to/deploy/adreno.rst @@ -94,7 +94,7 @@ folder of TVM: mkdir build_android cd build_android - cmake .. -DUSE_OPENCL=ON -DUSE_MICRO=OFF -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_HOME}/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_NATIVE_API_LEVEL=android-28 -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ON -DANDROID_STL=c++_static -DUSE_CPP_RPC=ON + cmake .. -DUSE_OPENCL=ON -DCMAKE_TOOLCHAIN_FILE=${ANDROID_NDK_HOME}/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_NATIVE_API_LEVEL=android-28 -DCMAKE_FIND_ROOT_PATH_MODE_PACKAGE=ON -DANDROID_STL=c++_static -DUSE_CPP_RPC=ON make -jN tvm_runtime tvm_rpc where **N** is the number of cores available on your *CPU*.
[tvm] branch echuraev/remove_use_micro_from_doc created (now 6f0ed41828)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch echuraev/remove_use_micro_from_doc
in repository https://gitbox.apache.org/repos/asf/tvm.git

      at 6f0ed41828 [docs][Adreno] Remove unnecessary compilation flag

This branch includes the following new commits:

     new 6f0ed41828 [docs][Adreno] Remove unnecessary compilation flag

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
[tvm] branch main updated: [Adreno] Add documentation for Adreno deployment (#13393)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new c38a0c50a7 [Adreno] Add documentation for Adreno deployment (#13393)
c38a0c50a7 is described below

commit c38a0c50a7cde09d548f570f7aafa8e293ef1485
Author: dsbarinov1 <71228944+dsbarin...@users.noreply.github.com>
AuthorDate: Mon Nov 28 11:25:55 2022 +0300

    [Adreno] Add documentation for Adreno deployment (#13393)

    * [Adreno] Add documentation for Adreno deployment

    Purpose: help TVM users compile and deploy on Adreno by expanding our
    documentation and providing sample scripts in TVM.

    Information about PR: This PR consists of 3 parts.
    The first part is an introductory article on compilation and deployment
    of neural networks on Adreno, covering such topics as: «Build TVM for
    Android/Adreno», «Advantages of textures» and «Differences in compilation
    and deployment of models for Adreno devices».
    The second part is a straightforward example script for compiling and
    running inference on models at different precisions for Adreno devices.
    The third part is auxiliary files, images, etc.

    * Add correct links to images + small fixes
    * Remove images (.png)
    * Add request_hook in deploy_model_on_adreno.py
    * Fix trailing newline + add license
    * No newline at the EOF + blanks
    * Fix request hook placing
    * Fix style
    * Fix trailing
    * Fix whitespaces
    * Fix whitespaces v2
    * Add newline at adreno.rst EOF
    * Add license to adreno.rst
    * Remove sphinx 'autosectionlabel' extension + modify cross-references in docs to work without this extension
    * Set default values to tracker_host and tracker_port
    * Add local_demo to be able to autogenerate docs
    * Fix quotes
    * Fix benchmark
    * .
---
 docs/how_to/deploy/adreno.rst                      | 336
 docs/how_to/deploy/index.rst                       | 1 +
 .../how_to/deploy_models/deploy_model_on_adreno.py | 351 +
 3 files changed, 688 insertions(+)

diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst
new file mode 100644
index 00..af613aa5cb
--- /dev/null
+++ b/docs/how_to/deploy/adreno.rst
@@ -0,0 +1,336 @@
+..  Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+..    http://www.apache.org/licenses/LICENSE-2.0
+
+..  Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Deploy to Adreno GPU
+====================
+
+**Authors**: Daniil Barinov, Egor Churaev, Andrey Malyshev
+
+Introduction
+------------
+
+Adreno is a series of graphics processing unit (GPU) semiconductor
+intellectual property cores developed by Qualcomm and used in many of
+their SoCs.
+
+The Adreno GPU accelerates the rendering of complex geometries to
+deliver high-performance graphics and a rich user experience with low
+power consumption.
+
+This guide will demonstrate :ref:`the benefits of using textures with Adreno`,
+how to :ref:`build TVM with OpenCL` (needed by Adreno devices) and TVM RPC
+enabled. It will also provide :ref:`example code` to better understand the differences in compiling and deploying models
+for Adreno devices.
+
+.. _advantages_of_the_textures:
+
+Advantages of the Textures
+--------------------------
+
+One of the Adreno's advantages is the clever handling of textures. At
+the moment, TVM is able to benefit from this by having texture support
+for Adreno. The graph below shows the Adreno A5x architecture.
+
+|High-level overview of the Adreno A5x architecture for OpenCL|
+
+*Fig. 1 High-level overview of the Adreno A5x architecture for OpenCL*
+
+*source:* `OpenCL Optimization and Best Practices for Qualcomm Adreno GPUs <https://dl.acm.org/doi/10.1145/3204919.3204935>`_
+
+Reasons of using textures:
+
+- Texture processor (TP) has a dedicated L1 cache, which is read-only cache and stores data
+  fetched from level-2 (L2) cache for texture operations (primary
+  reason)
+
+- The handling of image boundaries is built-in.
+
+- Supports num
[tvm] branch main updated (7cfa62e255 -> 3ad425fbdf)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 7cfa62e255 [Meta Schedule] Patch ICHECK for `target_has_vnni` to avoid seg fault (#13441)
     add 3ad425fbdf [Docker]Add privileged option for hardware with USB access testing (#13460)

No new revisions were added by this update.

Summary of changes:
 docker/bash.sh | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)
[tvm] branch main updated (77f9c49b4e -> 1b3d77a897)
This is an automated email from the ASF dual-hosted git repository.

echuraev pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 77f9c49b4e [Fix][MetaSchedule] Param for rule AutoBind on Python side (#13454)
     add 1b3d77a897 [RPC] Fix tracker connection termination (#13420)

No new revisions were added by this update.

Summary of changes:
 python/tvm/rpc/base.py                             | 36 ++-
 python/tvm/rpc/proxy.py                            | 8 ++---
 python/tvm/rpc/server.py                           | 4 +--
 python/tvm/rpc/tracker.py                          | 2 +-
 ...mscript_ir_builder_base.py => test_rpc_base.py} | 42 +++---
 tests/python/unittest/test_runtime_rpc.py          | 13 ---
 6 files changed, 61 insertions(+), 44 deletions(-)
 copy tests/python/unittest/{test_tvmscript_ir_builder_base.py => test_rpc_base.py} (56%)