(tvm) branch main updated: [TVMScript] Optionally use `ruff format` instead of `black` (#16876)
This is an automated email from the ASF dual-hosted git repository. sanirudh pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d030ce27a1 [TVMScript] Optionally use `ruff format` instead of `black` (#16876) d030ce27a1 is described below commit d030ce27a197e0a3e819b311dca5c5421d1cf5ba Author: Eric Lunderberg AuthorDate: Wed Apr 17 00:04:10 2024 -0500 [TVMScript] Optionally use `ruff format` instead of `black` (#16876) * [TVMScript] Optionally use `ruff format` instead of `black` The `ruff format` tool is significantly faster than the `black` formatter. For some particularly long TVMScript modules, using it can reduce the time required to show a formatted module from ~5 minutes to ~1 minute. This commit updates the `.show()` function to optionally apply formatting using `ruff format` if available, falling back to `black` otherwise. * Fix lint error --- python/tvm/script/highlight.py | 95 ++ 1 file changed, 77 insertions(+), 18 deletions(-) diff --git a/python/tvm/script/highlight.py b/python/tvm/script/highlight.py index be0de5a6bf..e017c1e6ca 100644 --- a/python/tvm/script/highlight.py +++ b/python/tvm/script/highlight.py @@ -17,7 +17,10 @@ """Highlight printed TVM script. 
""" +import functools import os +import shutil +import subprocess import sys import warnings from typing import Any, Optional, Union @@ -92,7 +95,73 @@ def cprint( print(highlight(printable, Python3Lexer(), Terminal256Formatter(style=style))) -def _format(code_str: str) -> str: +@functools.lru_cache +def _get_formatter(formatter: Optional[str] = None): +def get_ruff_formatter(): +if shutil.which("ruff") is None: +return None + +def formatter(code_str): +proc = subprocess.Popen( +["ruff", "format", "--stdin-filename=TVMScript"], +stdin=subprocess.PIPE, +stdout=subprocess.PIPE, +encoding="utf-8", +) +stdout, _stderr = proc.communicate(code_str) +return stdout + +return formatter + +def get_black_formatter(): +try: +# pylint: disable=import-outside-toplevel +import black +except ImportError: +return None + +def formatter(code_str): +return black.format_str(code_str, mode=black.FileMode()) + +return formatter + +def get_fallback_formatter(): +def formatter(code_str): +with warnings.catch_warnings(): +warnings.simplefilter("once", UserWarning) +ruff_install_cmd = sys.executable + " -m pip install ruff" +black_install_cmd = ( +sys.executable + ' -m pip install "black==22.3.0" --upgrade --user' +) +warnings.warn( +f"Neither the 'ruff' formatter nor the 'black' formatter is available. " +f"To print formatted TVM script, please install a formatter. 
\n" +f"To install ruff: {ruff_install_cmd}\n" +f"To install black: {black_install_cmd}", +category=UserWarning, +) +return code_str + +return formatter + +# formatter = "black" +if formatter is None: +options = [get_ruff_formatter, get_black_formatter] +elif formatter == "ruff": +options = [get_ruff_formatter] +elif formatter == "black": +options = [get_black_formatter] +else: +raise ValueError(f"Unknown formatter: {formatter}") + +for option in options: +func = option() +if func is not None: +return func +return get_fallback_formatter() + + +def _format(code_str: str, formatter: Optional[str] = None) -> str: """Format a code string using Black. Parameters @@ -101,29 +170,19 @@ def _format(code_str: str) -> str: The string containing Python/TVMScript code to format +formatter: Optional[str] + +The formatter to use. Can specify `ruff`, `black`, or +auto-select by passing `None`. + Returns --- formatted: str The formatted Python/TVMScript code + """ -try: -# pylint: disable=import-outside-toplevel -import black -except ImportError as err: -with warnings.catch_warnings(): -warnings.simplefilter("once", UserWarning) -install_cmd = sys.executable + ' -m pip install "black==22.3.0" --upgrade --user' -warnings.warn( -str(err) -+ "\n" -+ "To print formatted TVM script, please install the formatter 'Black':\n" -+ install_cmd, -category=UserWarning, -) -
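The diff above arrives flattened in the email, so here is a condensed, runnable sketch of the selection logic the commit describes (prefer `ruff format`, fall back to `black`, and degrade to a warning-only no-op otherwise). This is a reconstruction for readability, not a verbatim copy of `python/tvm/script/highlight.py`:

```python
import shutil
import subprocess
import warnings


def get_ruff_formatter():
    """Return a ruff-based formatter, or None if ruff is not on PATH."""
    if shutil.which("ruff") is None:
        return None

    def formatter(code_str):
        # ruff reads the code from stdin and writes the formatted code to stdout.
        proc = subprocess.Popen(
            ["ruff", "format", "--stdin-filename=TVMScript"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            encoding="utf-8",
        )
        stdout, _stderr = proc.communicate(code_str)
        return stdout

    return formatter


def get_black_formatter():
    """Return a black-based formatter, or None if black is not importable."""
    try:
        import black  # pylint: disable=import-outside-toplevel
    except ImportError:
        return None

    def formatter(code_str):
        return black.format_str(code_str, mode=black.FileMode())

    return formatter


def get_formatter(preference=None):
    """Pick the first available formatter; fall back to a no-op that warns."""
    options = {
        None: [get_ruff_formatter, get_black_formatter],
        "ruff": [get_ruff_formatter],
        "black": [get_black_formatter],
    }[preference]
    for option in options:
        func = option()
        if func is not None:
            return func

    def fallback(code_str):
        # Neither tool is installed: return the code unformatted, once-warned.
        warnings.warn("Neither 'ruff' nor 'black' is available.", UserWarning)
        return code_str

    return fallback
```

The real implementation additionally caches the lookup with `@functools.lru_cache`, so the availability probes run only once per process.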
Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]
quic-sanirudh merged PR #16876: URL: https://github.com/apache/tvm/pull/16876 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
(tvm) branch nightly updated (d4056ca795 -> 460f6f1d3e)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch nightly in repository https://gitbox.apache.org/repos/asf/tvm.git from d4056ca795 [SVE] Support splitting by vscale in `tir::split` and `te::split` (#16862) add cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887) add e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module (#16886) add 95d6778908 [dlight] Add check for matmul dtype and fix reduction rule (#16884) add d1ac73ca2d [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888) add 3680a0d5a2 [RUNTIME][VULKAN] Support total_global_memory (#16890) add 08965f08ff [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892) add 4cb4605ba3 [TVMScript][Bug] Add test case for missing symbolic bounds (#16877) add 94a44d7d62 [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861) add 460f6f1d3e [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860) No new revisions were added by this update. 
Summary of changes: include/tvm/relax/expr.h | 2 + include/tvm/runtime/data_type.h| 3 ++ python/tvm/contrib/tvmjs.py| 19 + python/tvm/dlight/gpu/matmul.py| 3 +- python/tvm/dlight/gpu/reduction.py | 16 +++ python/tvm/relax/backend/contrib/cublas.py | 16 ++- python/tvm/relax/frontend/nn/op.py | 6 +-- python/tvm/relax/transform/legalize_ops/qdq.py | 27 +++- src/relax/backend/contrib/utils.h | 4 ++ src/relax/ir/expr.cc | 40 ++--- src/relax/op/tensor/qdq.cc | 18 +--- src/runtime/contrib/cublas/cublas.cc | 9 ++-- src/runtime/contrib/thrust/thrust.cu | 14 +- src/runtime/vulkan/vulkan_device.cc| 7 ++- src/runtime/vulkan/vulkan_device.h | 2 + src/runtime/vulkan/vulkan_device_api.cc| 1 + src/tir/op/op.cc | 2 + tests/python/relax/test_codegen_cublas.py | 59 ++ tests/python/relax/test_expr.py| 19 + tests/python/relax/test_frontend_nn_op.py | 29 + tests/python/relax/test_op_qdq.py | 37 tests/python/relax/test_tvmscript_parser.py| 24 +++ 22 files changed, 312 insertions(+), 45 deletions(-)
Re: [PR] [relay][feature] save relay IR as onnx for visualize [tvm]
ShawnZhuang closed pull request #16847: [relay][feature] save relay IR as onnx for visualize URL: https://github.com/apache/tvm/pull/16847
Re: [PR] [Relax] Handle binary operations between Tensor and PrimValue [tvm]
tqchen commented on PR #16827: URL: https://github.com/apache/tvm/pull/16827#issuecomment-2060169629 thanks @Lunderberg should be good to go after ci
(tvm) branch main updated: [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860)
This is an automated email from the ASF dual-hosted git repository. wuwei pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 460f6f1d3e [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860) 460f6f1d3e is described below commit 460f6f1d3e1625882df701252234350f83aa6da1 Author: Eric Lunderberg AuthorDate: Tue Apr 16 16:28:00 2024 -0500 [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860) Prior to this commit, the `relax::Tuple` constructor left the `struct_info_` field undefined. This is inconsistent with other Relax leaf nodes, such as `relax::PrimValue`, `relax::Constant`, and `relax::ExternFunc`, which initialize their struct info on construction. This commit updates the `relax::Tuple` constructor to define `struct_info_` as `TupleStructInfo`, if all fields have a known struct info. If any field does not have a known struct info, the current behavior is kept, where `struct_info_` is constructed as `NullOpt`, and is later populated by the `relax::BlockBuilder`. 
--- src/relax/ir/expr.cc| 16 tests/python/relax/test_expr.py | 19 +++ 2 files changed, 35 insertions(+) diff --git a/src/relax/ir/expr.cc b/src/relax/ir/expr.cc index 0530bb770b..dd0f68dca4 100644 --- a/src/relax/ir/expr.cc +++ b/src/relax/ir/expr.cc @@ -137,9 +137,25 @@ TVM_REGISTER_GLOBAL("relax.If") }); Tuple::Tuple(tvm::Array fields, Span span) { + Optional tuple_sinfo = [&]() -> Optional { +Array field_sinfo; +for (const auto& field : fields) { + if (field->struct_info_.defined()) { +field_sinfo.push_back(GetStructInfo(field)); + } else { +return NullOpt; + } +} +return TupleStructInfo(field_sinfo); + }(); + ObjectPtr n = make_object(); n->fields = std::move(fields); n->span = std::move(span); + if (tuple_sinfo) { +n->checked_type_ = GetStaticType(tuple_sinfo.value()); + } + n->struct_info_ = tuple_sinfo; data_ = std::move(n); } diff --git a/tests/python/relax/test_expr.py b/tests/python/relax/test_expr.py index af1bc851be..b20c9ef2d9 100644 --- a/tests/python/relax/test_expr.py +++ b/tests/python/relax/test_expr.py @@ -86,6 +86,25 @@ def test_tuple() -> None: t[-3] +def test_tuple_sinfo_inferred_on_construction(): +v0 = rx.Var("v0", rx.ObjectStructInfo()) +v1 = rx.Var("v1", rx.ObjectStructInfo()) +tup = rx.Tuple((v0, v1)) + +assert tup.struct_info_ is not None +tvm.ir.assert_structural_equal( +tup.struct_info, rx.TupleStructInfo([rx.ObjectStructInfo(), rx.ObjectStructInfo()]) +) + + +def test_tuple_sinfo_requires_fields_with_known_sinfo(): +v0 = rx.Var("v0", rx.ObjectStructInfo()) +v1 = rx.Var("v1") +tup = rx.Tuple((v0, v1)) + +assert tup.struct_info_ is None + + def test_match_cast() -> None: # match_cast([16, 8], [m, n]) m = tir.Var("m", dtype="int64")
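The constructor change above can be modelled in a few lines of plain Python (a toy model, not the real `relax` API): the tuple's struct info is inferred only when every field already carries one, mirroring the lambda in the C++ diff.

```python
from typing import List, Optional


def infer_tuple_sinfo(field_sinfos: List[Optional[str]]) -> Optional[List[str]]:
    """Toy model of the new relax::Tuple constructor: the tuple's struct
    info is built from the fields' struct infos, but only when every
    field's struct info is already known; otherwise it stays undefined
    (NullOpt) and the BlockBuilder populates it later."""
    collected = []
    for sinfo in field_sinfos:
        if sinfo is None:
            return None  # one unknown field leaves the whole tuple unknown
        collected.append(sinfo)
    return collected
```

This matches the two test cases in the diff: a tuple of two `ObjectStructInfo` vars gets a `TupleStructInfo`, while a tuple containing a var with no struct info gets none.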
Re: [PR] [QoL][Relax] Infer StructInfo for relax::Tuple on construction [tvm]
vinx13 merged PR #16860: URL: https://github.com/apache/tvm/pull/16860
(tvm) branch main updated: [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861)
This is an automated email from the ASF dual-hosted git repository. wuwei pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 94a44d7d62 [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861) 94a44d7d62 is described below commit 94a44d7d62206849b891c1c262843d88bfb54c3b Author: Eric Lunderberg AuthorDate: Tue Apr 16 16:26:54 2024 -0500 [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861) Prior to this commit, the static method `relax::Function::CreateEmpty` returned a function with a nullptr as the body. While only intended for use in bookkeeping for TVMScript, allowing nullptr in this location can cause unexpected segfaults while debugging. For example, something as simple as adding a print statement could trigger one. This commit updates the `relax::Function::CreateEmpty` function to contain a placeholder body, consistent with the `ret_struct_info` argument provided. --- include/tvm/relax/expr.h | 2 ++ src/relax/ir/expr.cc | 24 +++- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/tvm/relax/expr.h b/include/tvm/relax/expr.h index 40707675fe..e2176cf720 100644 --- a/include/tvm/relax/expr.h +++ b/include/tvm/relax/expr.h @@ -1045,6 +1045,8 @@ class ExternFuncNode : public BaseFuncNode { class ExternFunc : public BaseFunc { public: TVM_DLL ExternFunc(String global_symbol, Span span = Span()); + TVM_DLL ExternFunc(String global_symbol, StructInfo struct_info, Span span = Span()); + TVM_DEFINE_OBJECT_REF_METHODS(ExternFunc, BaseFunc, ExternFuncNode); TVM_DEFINE_OBJECT_REF_COW_METHOD(ExternFuncNode); }; diff --git a/src/relax/ir/expr.cc b/src/relax/ir/expr.cc index 1b5551e509..0530bb770b 100644 --- a/src/relax/ir/expr.cc +++ b/src/relax/ir/expr.cc @@ -559,10 +559,18 @@ Function Function::CreateEmpty(Array params, StructInfo ret_struct_info, bo FuncStructInfo finfo(param_sinfo, ret_struct_info, is_pure); + // A dummy body, to ensure 
that the empty function is still well-formed. + Expr body = [&]() -> Expr { +Var output("output", ret_struct_info); +Call expr(ExternFunc("_dummy_function", FuncStructInfo({}, ret_struct_info)), {}); + +return SeqExpr({BindingBlock({VarBinding(output, expr)})}, output); + }(); + // set the fields ObjectPtr n = make_object(); n->params = std::move(params); - n->body = Expr(); + n->body = std::move(body); n->is_pure = is_pure; n->checked_type_ = GetStaticType(finfo); n->struct_info_ = std::move(finfo); @@ -602,13 +610,19 @@ FuncStructInfo GetExternFuncStructInfo() { TVM_REGISTER_NODE_TYPE(ExternFuncNode); -ExternFunc::ExternFunc(String global_symbol, Span span) { +ExternFunc::ExternFunc(String global_symbol, Span span) +: ExternFunc(global_symbol, GetExternFuncStructInfo(), span) {} + +ExternFunc::ExternFunc(String global_symbol, StructInfo struct_info, Span span) { + CHECK(struct_info.as()) + << "ExternFunc must have FuncStructInfo, " + << "but declaration of '" << global_symbol << "' received " << struct_info; + ObjectPtr n = make_object(); n->global_symbol = std::move(global_symbol); n->span = span; - static auto sinfo = GetExternFuncStructInfo(); - n->struct_info_ = sinfo; - n->checked_type_ = GetStaticType(sinfo); + n->struct_info_ = struct_info; + n->checked_type_ = GetStaticType(struct_info); data_ = std::move(n); }
Re: [PR] [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty [tvm]
vinx13 merged PR #16861: URL: https://github.com/apache/tvm/pull/16861
(tvm) branch main updated: [TVMScript][Bug] Add test case for missing symbolic bounds (#16877)
This is an automated email from the ASF dual-hosted git repository. wuwei pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 4cb4605ba3 [TVMScript][Bug] Add test case for missing symbolic bounds (#16877) 4cb4605ba3 is described below commit 4cb4605ba3cb8e083aa0678515bac76ea66471f9 Author: Eric Lunderberg AuthorDate: Tue Apr 16 16:25:59 2024 -0500 [TVMScript][Bug] Add test case for missing symbolic bounds (#16877) Because Relax struct inference is performed while the function is being built, all constraints on symbolic variables that are used for simplifications must be provided to the analyzer. This is not currently the case, nor is there a clear way to fix this issue. --- tests/python/relax/test_tvmscript_parser.py | 24 1 file changed, 24 insertions(+) diff --git a/tests/python/relax/test_tvmscript_parser.py b/tests/python/relax/test_tvmscript_parser.py index e692768a12..64014d1c49 100644 --- a/tests/python/relax/test_tvmscript_parser.py +++ b/tests/python/relax/test_tvmscript_parser.py @@ -2293,5 +2293,29 @@ def test_function_attributes_are_defined(): assert func.attrs is not None +@pytest.mark.xfail(reason="Bug: Implicit bounds not provided when parsing") +def test_function_symbolic_variables_are_annotated(): +"""Symbolic variables must be exposed for struct inference + +Because Relax struct inference is performed while the function is +being built, all constraints on symbolic variables that are used +for simplifications must be provided to the analyzer. 
+""" + +@R.function(private=True) +def inferred_sinfo(A: R.Tensor(["extent"])): +extent = T.int64() +output = R.strided_slice(A, [0], [0], [extent - 1]) +return output + +@R.function(private=True) +def expected(A: R.Tensor(["extent"])) -> R.Tensor(["extent-1"]): +extent = T.int64() +output: R.Tensor([extent - 1]) = R.strided_slice(A, [0], [0], [extent - 1]) +return output + +tvm.ir.assert_structural_equal(inferred_sinfo, expected) + + if __name__ == "__main__": tvm.testing.main()
Re: [PR] [TVMScript][Bug] Add test case for missing symbolic bounds [tvm]
vinx13 merged PR #16877: URL: https://github.com/apache/tvm/pull/16877
[PR] [BYOC] Add layout check and update shape check for cublas FP8 BYOC [tvm]
vinx13 opened a new pull request, #16895: URL: https://github.com/apache/tvm/pull/16895 This adds an additional layout check for the cuBLAS BYOC, as it only supports the TN layout. It also changes the shape requirement to a multiple of 16 bytes. cc @tqchen @ibsidorenko
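The two constraints named in the PR description can be sketched as a standalone predicate. Everything below is an assumption for illustration — the function name, the parameter names, and the choice of which dimension the 16-byte rule is applied to are not taken from the PR:

```python
def is_cublas_fp8_supported(k: int, elem_bytes: int = 1,
                            transposed_a: bool = True,
                            transposed_b: bool = False) -> bool:
    """Hypothetical sketch of the checks the PR describes: cuBLAS FP8 gemm
    only supports the TN layout (A transposed, B not), and the checked
    dimension must span a multiple of 16 bytes. Which dimension(s) the
    byte rule applies to is an assumption here (K shown, with 1-byte FP8
    elements by default)."""
    is_tn_layout = transposed_a and not transposed_b
    return is_tn_layout and (k * elem_bytes) % 16 == 0
```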
(tvm) branch main updated: [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892)
This is an automated email from the ASF dual-hosted git repository. wuwei pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 08965f08ff [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892) 08965f08ff is described below commit 08965f08ff0f6a8d34d45f8275c4aa78b04c90ee Author: Ivan Sidorenko <98739392+ibsidore...@users.noreply.github.com> AuthorDate: Tue Apr 16 23:11:01 2024 +0300 [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892) This commit replaces fp16 compute dtype and scale dtype by fp32 in cublas matmul. --- src/runtime/contrib/cublas/cublas.cc | 6 -- 1 file changed, 6 deletions(-) diff --git a/src/runtime/contrib/cublas/cublas.cc b/src/runtime/contrib/cublas/cublas.cc index 49aa35a7e0..553d4014c0 100644 --- a/src/runtime/contrib/cublas/cublas.cc +++ b/src/runtime/contrib/cublas/cublas.cc @@ -150,8 +150,6 @@ void CallCublasLt(cublasLtHandle_t hdl, cudaStream_t stream, cudaDataType_t c_type = CUDA_R_32F; float one_fp32 = 1.0; float zero_fp32 = 0.0; - auto one_fp16 = __truncXfYf2__(1.0); - auto zero_fp16 = __truncXfYf2__(0.0); int32_t one_i32 = 1; int32_t zero_i32 = 0; void* alpha = _fp32; @@ -168,10 +166,6 @@ void CallCublasLt(cublasLtHandle_t hdl, cudaStream_t stream, if (TypeMatch(C->dtype, kDLFloat, 16)) { c_type = CUDA_R_16F; -compute_type = CUBLAS_COMPUTE_16F; -scale_type = CUDA_R_16F; -alpha = _fp16; -beta = _fp16; } else if (TypeMatch(C->dtype, kDLInt, 32)) { c_type = CUDA_R_32I; compute_type = CUBLAS_COMPUTE_32I;
Re: [PR] [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul [tvm]
vinx13 merged PR #16892: URL: https://github.com/apache/tvm/pull/16892
[PR] [Dlight] Enhance vectorization for gpu matmul [tvm]
vinx13 opened a new pull request, #16894: URL: https://github.com/apache/tvm/pull/16894 This fixes the dlight matmul rule to make sure vectorization is always applied to the innermost loop. cc @Hzfengsy @tqchen
Re: [PR] [RUNTIME][VULKAN] Support total_global_memory [tvm]
MasterJH5574 merged PR #16890: URL: https://github.com/apache/tvm/pull/16890
(tvm) branch main updated: [RUNTIME][VULKAN] Support total_global_memory (#16890)
This is an automated email from the ASF dual-hosted git repository. ruihangl pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 3680a0d5a2 [RUNTIME][VULKAN] Support total_global_memory (#16890) 3680a0d5a2 is described below commit 3680a0d5a23da22124c17a845a39f3ae36b70ca3 Author: Tianqi Chen AuthorDate: Tue Apr 16 15:48:41 2024 -0400 [RUNTIME][VULKAN] Support total_global_memory (#16890) This PR supports total_global_memory query for vulkan devices. --- src/runtime/vulkan/vulkan_device.cc | 7 +-- src/runtime/vulkan/vulkan_device.h | 2 ++ src/runtime/vulkan/vulkan_device_api.cc | 1 + 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/src/runtime/vulkan/vulkan_device.cc b/src/runtime/vulkan/vulkan_device.cc index 7c5ac55f0b..cc39972432 100644 --- a/src/runtime/vulkan/vulkan_device.cc +++ b/src/runtime/vulkan/vulkan_device.cc @@ -293,7 +293,7 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_ for (uint32_t k = 0; k < prop.memoryTypeCount; ++k) { VkMemoryType ty = prop.memoryTypes[k]; -size_t heap_size = prop.memoryHeaps[ty.heapIndex].size; +int64_t heap_size = static_cast(prop.memoryHeaps[ty.heapIndex].size); // host visible if (!(ty.propertyFlags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)) continue; // match copy requirment @@ -312,7 +312,7 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_ win_rank = -1; for (uint32_t k = 0; k < prop.memoryTypeCount; ++k) { VkMemoryType ty = prop.memoryTypes[k]; -size_t heap_size = prop.memoryHeaps[ty.heapIndex].size; +int64_t heap_size = static_cast(prop.memoryHeaps[ty.heapIndex].size); // host visible if (!(ty.propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)) continue; // match copy requirment @@ -324,8 +324,10 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_ if (rank > win_rank) { win_rank = rank; 
compute_mtype_index = k; + compute_memory_size = heap_size; } } + ICHECK_GE(win_rank, 0) << "Cannot find suitable local memory on device."; if (device_properties.supports_push_descriptor) { @@ -383,6 +385,7 @@ void VulkanDevice::do_swap(VulkanDevice&& other) { std::swap(queue_insert_debug_utils_label_functions, other.queue_insert_debug_utils_label_functions); std::swap(compute_mtype_index, other.compute_mtype_index); + std::swap(compute_memory_size, other.compute_memory_size); std::swap(queue, other.queue); std::swap(queue_family_index, other.queue_family_index); std::swap(physical_device_, other.physical_device_); diff --git a/src/runtime/vulkan/vulkan_device.h b/src/runtime/vulkan/vulkan_device.h index 296483a6b1..0573a00e5c 100644 --- a/src/runtime/vulkan/vulkan_device.h +++ b/src/runtime/vulkan/vulkan_device.h @@ -223,6 +223,8 @@ class VulkanDevice { queue_insert_debug_utils_label_functions{nullptr}; // Memory type index for compute uint32_t compute_mtype_index{0}; + // maximum memory size for compute + int64_t compute_memory_size{0}; // queue family_index; uint32_t queue_family_index{uint32_t(-1)}; diff --git a/src/runtime/vulkan/vulkan_device_api.cc b/src/runtime/vulkan/vulkan_device_api.cc index 18a40bf54f..4b337dd524 100644 --- a/src/runtime/vulkan/vulkan_device_api.cc +++ b/src/runtime/vulkan/vulkan_device_api.cc @@ -165,6 +165,7 @@ void VulkanDeviceAPI::GetAttr(Device dev, DeviceAttrKind kind, TVMRetValue* rv) break; case kTotalGlobalMemory: { + *rv = device(index).compute_memory_size; return; } }
Re: [PR] [Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features [tvm]
tqchen commented on PR #16425: URL: https://github.com/apache/tvm/pull/16425#issuecomment-2059781680 I am also getting additional errors like ``` python test.py [15:23:28] /home/tqchen/github/tvm/src/target/parsers/aprofile.cc:97: Warning: Cannot parse target features. LLVM was not compiled with support for Arm(R)-based targets. [15:23:28] /home/tqchen/github/tvm/src/target/parsers/aprofile.cc:97: Warning: Cannot parse target features. LLVM was not compiled with support for Arm(R)-based targets. ``` this is on an LLVM that was built for rocm. We should not send out an error message during static loading time if the ARM target is not used, and only have such an error message when we attempt to use tags in aprofile
Re: [PR] [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC [tvm]
vinx13 merged PR #16888: URL: https://github.com/apache/tvm/pull/16888
(tvm) branch main updated: [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888)
This is an automated email from the ASF dual-hosted git repository. wuwei pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new d1ac73ca2d [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888) d1ac73ca2d is described below commit d1ac73ca2d3c14dc69e47818871478e8b0f295aa Author: Ivan Sidorenko <98739392+ibsidore...@users.noreply.github.com> AuthorDate: Tue Apr 16 21:55:11 2024 +0300 [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888) [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#63) Co-authored-by: Andrey Malyshev --- include/tvm/runtime/data_type.h| 3 ++ python/tvm/contrib/tvmjs.py| 19 + python/tvm/relax/backend/contrib/cublas.py | 16 ++- python/tvm/relax/transform/legalize_ops/qdq.py | 27 +++- src/relax/backend/contrib/utils.h | 4 ++ src/relax/op/tensor/qdq.cc | 18 +--- src/runtime/contrib/cublas/cublas.cc | 3 ++ src/tir/op/op.cc | 2 + tests/python/relax/test_codegen_cublas.py | 59 ++ tests/python/relax/test_op_qdq.py | 37 10 files changed, 169 insertions(+), 19 deletions(-) diff --git a/include/tvm/runtime/data_type.h b/include/tvm/runtime/data_type.h index f7284ec690..a330ccbbdf 100644 --- a/include/tvm/runtime/data_type.h +++ b/include/tvm/runtime/data_type.h @@ -126,6 +126,9 @@ class DataType { code() == DataType::kE5M2Float) && bits() == 8; } + bool is_e4m3_float8() const { return (code() == DataType::kE4M3Float && bits() == 8); } + + bool is_e5m2_float8() const { return (code() == DataType::kE5M2Float && bits() == 8); } /*! \return whether type is a float16 type. */ bool is_float16() const { return is_float() && bits() == 16; } /*! \return whether type is a bfloat16 type. 
*/ diff --git a/python/tvm/contrib/tvmjs.py b/python/tvm/contrib/tvmjs.py index 8d8bd1b051..923301a1f5 100644 --- a/python/tvm/contrib/tvmjs.py +++ b/python/tvm/contrib/tvmjs.py @@ -28,6 +28,11 @@ from typing import Iterator, Mapping, Tuple, Union import numpy as np +try: +import ml_dtypes +except ImportError: +ml_dtypes = None + import tvm from tvm._ffi.libinfo import find_lib_path @@ -295,6 +300,20 @@ def load_ndarray_cache(cachepath: str, device: tvm.runtime.Device): arr = tvm.nd.empty(shape, dtype, device=device) assert offset + nbytes <= len(raw_data) buffer_source = raw_data[offset : offset + nbytes] +if dtype == "e4m3_float8": +if ml_dtypes is not None: +dtype = ml_dtypes.float8_e4m3fn +else: +raise RuntimeError( +"ml_dtypes is not installed, cannot convert e4m3_float8 array to numpy." +) +if dtype == "e5m2_float8": +if ml_dtypes is not None: +dtype = ml_dtypes.float8_e5m2 +else: +raise RuntimeError( +"ml_dtypes is not installed, cannot convert e5m2_float8 array to numpy." +) if encode_format == "f32-to-bf16" and dtype == "float32": data = np.frombuffer(buffer_source, dtype="uint16").reshape(shape) arr.copyfrom(_convert_bf16_to_f32(data)) diff --git a/python/tvm/relax/backend/contrib/cublas.py b/python/tvm/relax/backend/contrib/cublas.py index eecd531e74..f66001d0e8 100644 --- a/python/tvm/relax/backend/contrib/cublas.py +++ b/python/tvm/relax/backend/contrib/cublas.py @@ -28,8 +28,11 @@ from ..patterns import make_matmul_pattern from ..utils import has_leaking_intermediate_variables -def _is_supported_dtype(lhs_dtype, rhs_dtype): +def _is_supported_dtype(lhs_dtype, rhs_dtype, out_dtype): """Check if dtypes in the given workload are supported by cuBLAS BYOC.""" +if lhs_dtype == "e4m3_float8" and rhs_dtype == "e4m3_float8": +# The output cannot be 'e5m2_float8' if inputs are 'e4m3_float8' +return out_dtype != "e5m2_float8" return ( (lhs_dtype == "float16" and rhs_dtype == "float16") or (lhs_dtype == "float32" and rhs_dtype == "float32") @@ -42,10 +45,12 @@ 
def _check_matmul(context: PatternCheckContext) -> bool: return False lhs = context.annotated_expr["lhs"] rhs = context.annotated_expr["rhs"] +matmul_call = context.annotated_expr["root"] lhs_dtype = lhs.struct_info.dtype rhs_dtype = rhs.struct_info.dtype -if not _is_supported_dtype(lhs_dtype, rhs_dtype): +out_dtype = matmul_call.struct_info.dtype +if not _is_supported_dtype(lhs_dtype, rhs_dtype, out_dtype): return False lhs_shape = lhs.struct_info.shape.values @@ -62,6 +67,13 @@ def _check_matmul(context:
Re: [I] [Bug] Inconsistent Results between Direct Optimization and Sequential Optimization in TVM [tvm]
vinx13 commented on issue #16870: URL: https://github.com/apache/tvm/issues/16870#issuecomment-2059600023 If you call a pass directly (instead of using `Sequential`), it will bypass the check for `opt_level`, `required_pass`, etc.
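The behaviour described in that comment can be illustrated with a toy model (plain Python, not the real `tvm.transform` API): `Sequential` gates each pass on `opt_level` and `required_pass`, while invoking a pass object directly applies it unconditionally — which is why the two orderings in the linked issue can produce different results.

```python
class ToyPass:
    """Toy model of a pass with an opt_level requirement."""

    def __init__(self, name, opt_level, transform):
        self.name = name
        self.opt_level = opt_level
        self.transform = transform

    def __call__(self, mod):
        # Calling a pass directly applies it unconditionally: no gating.
        return self.transform(mod)


class ToySequential:
    """Toy model of Sequential: runs only the passes whose opt_level fits
    the current context, unless the pass is explicitly required."""

    def __init__(self, passes, opt_level=2, required=()):
        self.passes = passes
        self.opt_level = opt_level
        self.required = set(required)

    def __call__(self, mod):
        for p in self.passes:
            if p.opt_level <= self.opt_level or p.name in self.required:
                mod = p(mod)
        return mod
```

Here a "module" is just a list of applied-pass names, which keeps the gating behaviour visible without any TVM machinery.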
Re: [PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]
ekalda commented on PR #16893: URL: https://github.com/apache/tvm/pull/16893#issuecomment-2059426390 @Lunderberg @lhutton1 This is a follow-up from https://github.com/apache/tvm/pull/16782; it makes sure that scalable vectors are created only for SVE targets. In this implementation we check for the target in the function attributes, so we'd rely on `BindTarget` having run before. Alternatively we could access the "global" target via `Target::Current()`, which would be simpler, but would make it harder to mix SVE and non-SVE targets. I'm happy to go with either option :)
Re: [PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]
ekalda commented on code in PR #16893: URL: https://github.com/apache/tvm/pull/16893#discussion_r1567592059

## src/tir/transforms/vectorize_loop.cc: ##
```diff
@@ -725,17 +728,33 @@ class Vectorizer : public StmtMutator, public ExprFunctor
+    ... attrs.GetAttr(tvm::attr::kTarget);
+    if (target.defined()) {
+      target_ = Downcast(target);
+      has_sve_ = target_->GetFeature("has_sve").value_or(Bool(false));
+    }
+  }
+
   Stmt VisitStmt_(const ForNode* op) final {
     if (op->kind == ForKind::kVectorized) {
+      auto* extent_as_int = op->extent.as();
+      if (!extent_as_int || extent_as_int->value < 1) {
+        bool is_scalable_expr = CheckContains::ExprContains(op->extent, arith::IsVScaleCall);
+        ICHECK(is_scalable_expr && has_sve_)
+            << "Failed to vectorize loop with extent " << op->extent << " for target " << target_;
+      }
       ICHECK(is_zero(op->min));
       return Vectorizer(op->loop_var, op->extent)(op->body);
     } else {
       return StmtMutator::VisitStmt_(op);
     }
   }
-};
-Stmt VectorizeLoop(Stmt stmt) { return LoopVectorizer()(std::move(stmt)); }
```

Review Comment: Is this function necessary?
[PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]
ekalda opened a new pull request, #16893: URL: https://github.com/apache/tvm/pull/16893

Check that we are compiling for an SVE-enabled target when the extent of a loop marked for vectorizing is a vscale-dependent expression.

1. Add a call to BindTarget in the Graph Executor pipeline (to enable running the tvmc flow)
2. Check in LoopVectorize that the extent is either a positive integer or a vscale-dependent expression, in which case we'd expect an SVE-enabled target to be present in func_attr
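The rule this PR enforces can be sketched outside of C++. This is a conceptual Python model only — `extent_is_vectorizable`, the string-based vscale test, and the `has_sve` flag are stand-ins for the real TIR analysis (`CheckContains::ExprContains(..., arith::IsVScaleCall)` and the target's `has_sve` feature):

```python
def extent_is_vectorizable(extent, has_sve):
    """Model of the VectorizeLoop check: a loop marked kVectorized must have
    either a known positive integer extent, or (for scalable vectors) a
    vscale-dependent extent *and* an SVE-enabled target."""
    if isinstance(extent, int):
        return extent >= 1
    # In TIR this would be a check for a vscale call inside the extent expr.
    is_vscale_dependent = isinstance(extent, str) and "vscale" in extent
    return is_vscale_dependent and has_sve

# Fixed extents vectorize regardless of the target:
assert extent_is_vectorizable(4, has_sve=False)
# Non-positive extents are rejected:
assert not extent_is_vectorizable(0, has_sve=True)
# Scalable extents require an SVE-enabled target:
assert extent_is_vectorizable("4 * vscale", has_sve=True)
assert not extent_is_vectorizable("4 * vscale", has_sve=False)
```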
[PR] [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul [tvm]
ibsidorenko opened a new pull request, #16892: URL: https://github.com/apache/tvm/pull/16892

This is an attempt to bring over [commit](https://github.com/octoml/tvm/commit/f21b9c9c561e7bcb7a81ae12d71568c6e7c1fc49) and align `octoml/tvm` with `apache/tvm`. This commit replaces the fp16 compute and scale dtypes with fp32 in the cuBLAS matmul. According to the cuBLAS [docs](https://docs.nvidia.com/cuda/cublas/index.html#cublasltmatmul) there are two possible options for the compute/scale dtype when the input/output dtype is fp16:
1. compute dtype is `fp16` and scale dtype is `fp16`
2. compute dtype is `fp32` and scale dtype is `fp32`

By default, we use 1) in apache/tvm and 2) in octoml/tvm. This commit aligns the differing behaviour and sets `fp32` as the default.

cc @vinx13 @masahi
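Why the compute dtype matters can be shown with a small numpy experiment (a generic illustration of accumulation precision, not cuBLAS code): an fp16 accumulator stalls once the running sum outgrows the ~11-bit mantissa, while an fp32 accumulator stays exact.

```python
import numpy as np

# Summing 3000 ones: in fp16 the accumulator stalls at 2048, because the
# spacing between adjacent fp16 values at 2048 is 2.0, so adding 1.0 rounds
# back to 2048. An fp32 accumulator (the "compute dtype" of option 2)
# gets the exact answer.
ones = np.ones(3000, dtype=np.float16)

acc16 = np.float16(0.0)
for v in ones:                          # emulate fp16 compute dtype
    acc16 = np.float16(acc16 + v)

acc32 = ones.astype(np.float32).sum()   # emulate fp32 compute dtype

print(acc16)  # 2048.0 -- saturated, wrong
print(acc32)  # 3000.0 -- exact
```

The same effect applies to the long dot-product reductions inside a matmul, which is why an fp32 compute/scale dtype is generally the safer default for fp16 inputs.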
[I] [Bug] `MatMul` operator in TVM seems fragile [tvm]
shaoyuyoung opened a new issue, #16891: URL: https://github.com/apache/tvm/issues/16891

TVM seems to have strict restrictions on the `MatMul` operator, which means that **it cannot use tensors with different shapes**. Look at this simple graph. In PyTorch, the model is **correctly defined** and the input and output shapes are exactly as shown below. The evidence is here: [https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul)

![image](https://github.com/apache/tvm/assets/100203773/b4c812e8-5fb5-4b33-90cf-380a03e26b6c)

When I try to convert `ONNX` to `TVM`, I get an error indicating that the tensor shape is inconsistent. However, when converting `PyTorch` to `TVM`, everything is OK! I guess one possible reason is that `TorchScript` plays a role in this but `ONNX` does not. Moreover, look at the last line of the error message. I wonder why **T.int64(1)** is used here. It seems that TVM has a pretty fragile handling of `int64`.

![image](https://github.com/apache/tvm/assets/100203773/e62e8462-212b-4d0b-ac5f-826be20e4557)

### Expected behavior
Pass compilation, as it can produce results in ONNX and PyTorch.
### Actual behavior
**Compilation failure**
```
Traceback (most recent call last):
  18: tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::transform::__mk_TVM9::{lambda(tvm::transform::Pass, tvm::IRModule)#1}, std::__cxx11::basic_string, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string, std::allocator >, tvm::runtime::TVMRetValue)
  17: tvm::transform::Pass::operator()(tvm::IRModule) const
  16: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  15: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  14: _ZN3tvm7runtime13PackedFun
  13: tvm::runtime::TypedPackedFunc::AssignTypedLambda(tvm::relay::transform::DynamicToStatic()::{lambda(tvm::relay::Function, tvm::IRModule, tvm::transform::PassContext)#1})::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const
  12: tvm::relay::DynamicToStatic(tvm::relay::Function, tvm::IRModule)
  11: tvm::relay::DynamicToStaticMutator::PrepareInput(tvm::RelayExpr const&)
  10: tvm::transform::Pass::operator()(tvm::IRModule) const
  9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  8: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  7: tvm::transform::Pass::operator()(tvm::IRModule) const
  6: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  5: tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  4: tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::relay::transform::InferType()::{lambda(tvm::IRModule, tvm::transform::PassContext const&)#1})::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  3: tvm::relay::TypeInferencer::Infer(tvm::GlobalVar, tvm::relay::Function)
  2: tvm::relay::TypeSolver::Solve()
  1: tvm::runtime::PackedFuncObj::Extractor const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::AssignTypedLambda const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>(bool (*)(tvm::runtime::Array const&, int, tvm::Attrs const&, tvm::TypeReporter const&))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  0: bool tvm::relay::BatchMatmulRel(tvm::runtime::Array const&, int, tvm::Attrs const&, tvm::TypeReporter const&)
  File "/root/anaconda3/conda-bld/tvm-package_1701590675822/work/src/relay/op/nn/nn.h", line 212
InternalError: Check failed: (reporter->AssertEQ(xk, yk)) is false: BatchDot: shapes of x and y is inconsistent, x shape=[T.int64(1), 5, 5], y shape=[5, 5, 4]
```

### Environment
Operating System: Ubuntu 18
TVM: 0.15
Torch: 2.1.1
ONNX: 1.15.0

### Steps to reproduce
Here is the script:
```python
import torch
import torch.nn as nn
import tvm
from tvm import relay
import onnx


class DirectMatMulModel(nn.Module):
    def __init__(self):
        super(DirectMatMulModel, self).__init__()

    def forward(self, x1, x2, y1, y2):
        result1 = torch.matmul(x1, x2)
        result2 = torch.matmul(y1, y2)
        final_result = torch.matmul(result1, result2)
```
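The shapes rejected by `BatchMatmulRel` are well-defined under standard batch broadcasting, which numpy implements the same way PyTorch and ONNX do — the size-1 batch dimension broadcasts against the other operand's batch dimension (a generic numpy illustration, independent of TVM):

```python
import numpy as np

# The shapes from the error message: x is (1, 5, 5), y is (5, 5, 4).
# np.matmul broadcasts the batch dims (1,) and (5,) to (5,), then does a
# 5x5 @ 5x4 matrix product per batch, giving a (5, 5, 4) result.
x = np.ones((1, 5, 5), dtype=np.float32)
y = np.ones((5, 5, 4), dtype=np.float32)
out = np.matmul(x, y)
print(out.shape)  # (5, 5, 4)
```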
Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]
quic-sanirudh commented on code in PR #16876: URL: https://github.com/apache/tvm/pull/16876#discussion_r1567479484

## python/tvm/script/highlight.py: ##
```diff
@@ -92,7 +95,73 @@ def cprint(
     print(highlight(printable, Python3Lexer(), Terminal256Formatter(style=style)))
 
-def _format(code_str: str) -> str:
+@functools.lru_cache
+def _get_formatter(formatter: Optional[str] = None):
+    def get_ruff_formatter():
+        if shutil.which("ruff") is None:
+            return None
+
+        def formatter(code_str):
+            proc = subprocess.Popen(
+                ["ruff", "format", "--stdin-filename=TVMScript"],
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                encoding="utf-8",
+            )
+            stdout, _stderr = proc.communicate(code_str)
+            return stdout
+
+        return formatter
+
+    def get_black_formatter():
+        try:
+            # pylint: disable=import-outside-toplevel
+            import black
+        except ImportError:
+            return None
+
+        def formatter(code_str):
+            return black.format_str(code_str, mode=black.FileMode())
+
+        return formatter
+
+    def get_fallback_formatter():
+        def formatter(code_str):
+            with warnings.catch_warnings():
+                warnings.simplefilter("once", UserWarning)
+                ruff_install_cmd = sys.executable + " -m pip install ruff"
```

Review Comment: Right, yeah, that makes more sense. It is very hard to keep them in sync. Perhaps after we remove the version info from the warning message, we can just add a line saying something like **"For version compatibility with CI, please check docker/Dockerfile.ci_lint"** if the version info has to be specified. Thanks for the response.
Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]
Lunderberg commented on code in PR #16876: URL: https://github.com/apache/tvm/pull/16876#discussion_r1567426092

## python/tvm/script/highlight.py: ##
(quoted hunk omitted — the same `_get_formatter` diff quoted above, flagging the `ruff_install_cmd` line)

Review Comment: I'm uncertain, but would lean instead toward removing the specific version of `black`, if we want consistency. Specifying a specific version of `black` is useful, since then it can be matched to the CI's version, in case a developer also uses `black` either for format-on-save or as a pre-commit hook. However, the version number in the warning has become out of sync with the CI's version (warning message was added in November 2022, but the CI was bumped from `22.3.0` to `22.12.0` in March 2023). Since there isn't really a good way to keep them in sync (nor any `ruff` version in the CI with which to be in sync), I'd lean toward removing the versions from the warning message altogether.
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
tqchen commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058952581

Thanks for the note. We are in the process of revamping the docs. The latest set of emerging model optimizations, like those for LLMs, will be based on Relax. https://github.com/apache/tvm/tree/main/python/tvm/relax/frontend/onnx is likely a good reference there.
Re: [PR] [dlight] Add check for matmul dtype and fix reduction rule [tvm]
tqchen merged PR #16884: URL: https://github.com/apache/tvm/pull/16884
(tvm) branch main updated: [Contrib] Enable fp16 for thrust sort (#16887)
This is an automated email from the ASF dual-hosted git repository. tqchen pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git

The following commit(s) were added to refs/heads/main by this push:
     new cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887)

cdfdd0e4ec is described below

commit cdfdd0e4ec7452bedf4e79ba0ff474d2de70bbbf
Author: Siyuan Feng
AuthorDate: Tue Apr 16 20:13:21 2024 +0800

    [Contrib] Enable fp16 for thrust sort (#16887)

    [Contrib] Enable fp16 for thrust

    Enable fp16 for thrust to support LLM cases
---
 src/runtime/contrib/thrust/thrust.cu | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

```diff
diff --git a/src/runtime/contrib/thrust/thrust.cu b/src/runtime/contrib/thrust/thrust.cu
index 28edba64aa..048df518e3 100644
--- a/src/runtime/contrib/thrust/thrust.cu
+++ b/src/runtime/contrib/thrust/thrust.cu
@@ -167,7 +167,19 @@ void thrust_sort(DLTensor* input, DLTensor* out_values, DLTensor* out_indices, b
 void thrust_sort_common(DLTensor* input, DLTensor* values_out, DLTensor* indices_out,
                         bool is_ascend, int sort_len, std::string data_dtype,
                         std::string out_dtype, DLTensor* workspace) {
-  if (data_dtype == "float32") {
+  if (data_dtype == "float16") {
+    if (out_dtype == "int32") {
+      thrust_sort(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "int64") {
+      thrust_sort(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "float32") {
+      thrust_sort(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "float64") {
+      thrust_sort(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else {
+      LOG(FATAL) << "Unsupported output dtype: " << out_dtype;
+    }
+  } else if (data_dtype == "float32") {
     if (out_dtype == "int32") {
       thrust_sort(input, values_out, indices_out, is_ascend, sort_len, workspace);
     } else if (out_dtype == "int64") {
```
(tvm) branch main updated (e738f1d4f1 -> 95d6778908)
tqchen pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git

from e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module (#16886)
add  95d6778908 [dlight] Add check for matmul dtype and fix reduction rule (#16884)

No new revisions were added by this update.

Summary of changes:
 python/tvm/dlight/gpu/matmul.py    |  3 ++-
 python/tvm/dlight/gpu/reduction.py | 16 
 2 files changed, 10 insertions(+), 9 deletions(-)
(tvm) branch main updated (cdfdd0e4ec -> e738f1d4f1)
tqchen pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git

from cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887)
add  e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module (#16886)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relax/frontend/nn/op.py        |  6 +++---
 tests/python/relax/test_frontend_nn_op.py | 29 +
 2 files changed, 32 insertions(+), 3 deletions(-)
Re: [PR] [Relax][Frontend] Fix sort, argsort and topk in nn module [tvm]
tqchen merged PR #16886: URL: https://github.com/apache/tvm/pull/16886
Re: [PR] [Contrib] Enable fp16 for thrust sort [tvm]
tqchen merged PR #16887: URL: https://github.com/apache/tvm/pull/16887
[I] [Bug] Init block not discoverable after sch.blockize [tvm]
nautasolva opened a new issue, #16889: URL: https://github.com/apache/tvm/issues/16889

When used on a block with an init statement, `blockize` creates a separate init block that is not discoverable by any means. This hinders further scheduling, like tensorizing the init block.

### Expected behavior
When using `blockize` on a loop that contains an init statement, the init is moved to a new `_init` block that should be discoverable with `get_block` or `get_children_blocks` on the newly created outer block.

### Actual behavior
The init block exists in the TIR module but does not seem to be registered by the schedule. `get_block("A_init")` fails with `InternalError: Check failed: (it != self_->stmt2ref.end()) is false`

Stacktrace
```
Traceback (most recent call last):
  File "/home/dev/tvm_upstream/../tvm/playground/blockize_init_bug.py", line 31, in
    a_init = sch.get_block("A_init")
  File "/home/dev/tvm_upstream/python/tvm/tir/schedule/_type_checker.py", line 340, in wrap
    return func(*args, **kwargs)
  File "/home/dev/tvm_upstream/python/tvm/tir/schedule/schedule.py", line 499, in get_block
    return _ffi_api.ScheduleGetBlock(  # type: ignore # pylint: disable=no-member
  File "/home/dev/tvm_upstream/python/tvm/_ffi/_ctypes/packed_func.py", line 239, in __call__
    raise_last_ffi_error()
  File "/home/dev/tvm_upstream/python/tvm/_ffi/base.py", line 481, in raise_last_ffi_error
    raise py_err
  File "/home/dev/tvm_upstream/src/tir/schedule/traced_schedule.cc", line 128, in tvm::tir::TracedScheduleNode::GetBlock(tvm::runtime::String const&, tvm::runtime::Optional const&)
    BlockRV result = ConcreteScheduleNode::GetBlock(name, func_name);
  File "/home/dev/tvm_upstream/src/tir/schedule/concrete_schedule.cc", line 321, in tvm::tir::ConcreteScheduleNode::GetBlock(tvm::runtime::String const&, tvm::runtime::Optional const&)
    Array blocks = tir::GetBlocks(this->state_, name, gv);
  File "/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 46, in tvm::tir::GetBlocks(tvm::tir::ScheduleState const&, tvm::runtime::String const&, tvm::GlobalVar const&)
    finder(prim_func->body);
  File "/home/dev/tvm_upstream/src/tir/ir/stmt_functor.cc", line 142, in tvm::tir::StmtVisitor::VisitStmt_(tvm::tir::BlockNode const*)
    this->VisitStmt(op->init.value());
  File "/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 29, in VisitStmt_
    void VisitStmt_(const BlockNode* block) override {
  File "/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 32, in VisitStmt_
    ICHECK(it != self_->stmt2ref.end());
tvm.error.InternalError: Traceback (most recent call last):
  5: tvm::tir::TracedScheduleNode::GetBlock(tvm::runtime::String const&, tvm::runtime::Optional const&)
        at /home/dev/tvm_upstream/src/tir/schedule/traced_schedule.cc:128
  4: tvm::tir::ConcreteScheduleNode::GetBlock(tvm::runtime::String const&, tvm::runtime::Optional const&)
        at /home/dev/tvm_upstream/src/tir/schedule/concrete_schedule.cc:321
  3: tvm::tir::GetBlocks(tvm::tir::ScheduleState const&, tvm::runtime::String const&, tvm::GlobalVar const&)
        at /home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:46
  2: tvm::tir::StmtVisitor::VisitStmt_(tvm::tir::BlockNode const*)
        at /home/dev/tvm_upstream/src/tir/ir/stmt_functor.cc:142
  1: VisitStmt_
        at /home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:29
  0: VisitStmt_
        at /home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:32
  File "/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 32
```

### Environment
Reproducible on main (d4056ca79571d4265a12beeedd1b1565953df936)

### Steps to reproduce
```python
import tvm
from tvm.script import ir as I
from tvm.script import tir as T


@I.ir_module
class Module:
    @T.prim_func
    def main():
        # with T.block("root"):
        A_sum = T.alloc_buffer((1,), "float32")
        A = T.alloc_buffer((1, 16), "float32")
        for nn, ff in T.grid(1, 16):
            with T.block("A"):
                v_nn, v_ff = T.axis.remap("SR", [nn, ff])
                T.reads(A[v_nn, v_ff])
                T.writes(A_sum[v_nn])
                with T.init():
                    A_sum[v_nn] = T.float32(0)
                A_sum[v_nn] = A_sum[v_nn] + A[v_nn, v_ff]


sch = tvm.tir.Schedule(Module)
a = sch.get_block("A")
loop_n, loop_f = sch.get_loops(a)
sch.blockize(loop_f)
print(sch.mod)  # <-- A_init exists
a_init = sch.get_block("A_init")  # <-- fails with InternalError: Check failed: (it != self_->stmt2ref.end()) is false
```

### Triage
* tir:schedule
Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]
gyenesvi commented on PR #108: URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058579469

Hi,

> as a community we recently moves towards the relax IR for latest genAI workloads

Thanks for directing us towards Relax. I guess that means that new frontends should convert their representations into Relax IR instead of Relay? The documentation on tvm.apache.org refers to Relay, but not Relax. Is that documentation obsolete in this area? Is Relay going to be superseded by Relax? We only see frontend examples in tvm.relax that we can use as reference. Is there further documentation on tvm.relax?

It is interesting to hear that there's more focus on dynamic graphs / shape inference, as one of the key goals of the next version of NNEF, under development, is support for dynamic graphs and shape inference.

> it is unclear how much adoption NNEF have as of now versus ONNX and other formats

One of the goals of integration into compiler stacks like TVM would be exactly to drive more adoption, as adoption requires public tooling to be able to demonstrate the capabilities / usage of NNEF in end-to-end workflows. As the next version of NNEF will focus on dynamic graphs, custom operations and lowering to tensor IR level, TVM seems like a good option to demonstrate its potential in compilation-based inference engines. But first we would like to start with integrating the currently publicly available version of NNEF.

Also, TVM has backends to multiple Khronos formats, such as SPIR-V (Vulkan) and OpenCL, which is why TVM could provide us with an end-to-end workflow starting from a Khronos-defined input format and resulting in Khronos-defined outputs. Furthermore, some Khronos members may be interested in implementing their own (proprietary) hardware backends to TVM, with which an NNEF frontend could also provide an end-to-end workflow.
Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]
quic-sanirudh commented on code in PR #16876: URL: https://github.com/apache/tvm/pull/16876#discussion_r1566946710

## python/tvm/script/highlight.py: ##
(quoted hunk omitted — the same `_get_formatter` diff quoted earlier in this thread, flagging the `ruff_install_cmd` line)

Review Comment: Should we restrict to a specific version of ruff, similar to the specific black version we've specified below?
[PR] [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC [tvm]
ibsidorenko opened a new pull request, #16888: URL: https://github.com/apache/tvm/pull/16888

This is an attempt to bring [PR#63](https://github.com/octoml/tvm/pull/63) from `octoml/tvm` to `apache/tvm`. This commit enables FP8 e4m3 matmul offloading to cuBLAS.
[PR] [Contrib] Enable fp16 for thrust [tvm]
Hzfengsy opened a new pull request, #16887: URL: https://github.com/apache/tvm/pull/16887

Enable fp16 for thrust to support LLM cases

cc @vinx13