(tvm) branch main updated: [TVMScript] Optionally use `ruff format` instead of `black` (#16876)

2024-04-16 Thread sanirudh
This is an automated email from the ASF dual-hosted git repository.

sanirudh pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d030ce27a1 [TVMScript] Optionally use `ruff format` instead of `black` 
(#16876)
d030ce27a1 is described below

commit d030ce27a197e0a3e819b311dca5c5421d1cf5ba
Author: Eric Lunderberg 
AuthorDate: Wed Apr 17 00:04:10 2024 -0500

[TVMScript] Optionally use `ruff format` instead of `black` (#16876)

* [TVMScript] Optionally use `ruff format` instead of `black`

The `ruff format` tool is significantly faster than the `black`
formatter.  For some particularly long TVMScript modules, using it can
reduce the time required to show a formatted module from ~5 minutes to
~1 minute.  This commit updates the `.show()` function to optionally
apply formatting using `ruff format` if available, falling back
to `black` otherwise.

* Fix lint error
---
 python/tvm/script/highlight.py | 95 ++
 1 file changed, 77 insertions(+), 18 deletions(-)

diff --git a/python/tvm/script/highlight.py b/python/tvm/script/highlight.py
index be0de5a6bf..e017c1e6ca 100644
--- a/python/tvm/script/highlight.py
+++ b/python/tvm/script/highlight.py
@@ -17,7 +17,10 @@
 """Highlight printed TVM script.
 """
 
+import functools
 import os
+import shutil
+import subprocess
 import sys
 import warnings
 from typing import Any, Optional, Union
@@ -92,7 +95,73 @@ def cprint(
     print(highlight(printable, Python3Lexer(), Terminal256Formatter(style=style)))
 
 
-def _format(code_str: str) -> str:
+@functools.lru_cache
+def _get_formatter(formatter: Optional[str] = None):
+    def get_ruff_formatter():
+        if shutil.which("ruff") is None:
+            return None
+
+        def formatter(code_str):
+            proc = subprocess.Popen(
+                ["ruff", "format", "--stdin-filename=TVMScript"],
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                encoding="utf-8",
+            )
+            stdout, _stderr = proc.communicate(code_str)
+            return stdout
+
+        return formatter
+
+    def get_black_formatter():
+        try:
+            # pylint: disable=import-outside-toplevel
+            import black
+        except ImportError:
+            return None
+
+        def formatter(code_str):
+            return black.format_str(code_str, mode=black.FileMode())
+
+        return formatter
+
+    def get_fallback_formatter():
+        def formatter(code_str):
+            with warnings.catch_warnings():
+                warnings.simplefilter("once", UserWarning)
+                ruff_install_cmd = sys.executable + " -m pip install ruff"
+                black_install_cmd = (
+                    sys.executable + ' -m pip install "black==22.3.0" --upgrade --user'
+                )
+                warnings.warn(
+                    f"Neither the 'ruff' formatter nor the 'black' formatter is available.  "
+                    f"To print formatted TVM script, please install a formatter.\n"
+                    f"To install ruff: {ruff_install_cmd}\n"
+                    f"To install black: {black_install_cmd}",
+                    category=UserWarning,
+                )
+            return code_str
+
+        return formatter
+
+    # formatter = "black"
+    if formatter is None:
+        options = [get_ruff_formatter, get_black_formatter]
+    elif formatter == "ruff":
+        options = [get_ruff_formatter]
+    elif formatter == "black":
+        options = [get_black_formatter]
+    else:
+        raise ValueError(f"Unknown formatter: {formatter}")
+
+    for option in options:
+        func = option()
+        if func is not None:
+            return func
+    return get_fallback_formatter()
+
+
+def _format(code_str: str, formatter: Optional[str] = None) -> str:
     """Format a code string using Black.
 
     Parameters
@@ -101,29 +170,19 @@ def _format(code_str: str) -> str:
 
         The string containing Python/TVMScript code to format
 
+    formatter: Optional[str]
+
+        The formatter to use.  Can specify `ruff`, `black`, or
+        auto-select by passing `None`.
+
     Returns
     -------
     formatted: str
 
         The formatted Python/TVMScript code
+
     """
-    try:
-        # pylint: disable=import-outside-toplevel
-        import black
-    except ImportError as err:
-        with warnings.catch_warnings():
-            warnings.simplefilter("once", UserWarning)
-            install_cmd = sys.executable + ' -m pip install "black==22.3.0" --upgrade --user'
-            warnings.warn(
-                str(err)
-                + "\n"
-                + "To print formatted TVM script, please install the formatter 'Black':\n"
-                + install_cmd,
-                category=UserWarning,
-            )
-
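
A minimal usage sketch of the new behavior (the input string here is hypothetical, and this assumes the merged `_format` dispatches through `_get_formatter` as shown above; the tail of the diff is truncated in this archive):

```python
from tvm.script.highlight import _format

code = "def main():\n            return   1"
print(_format(code, formatter="ruff"))   # requires `ruff` on PATH
print(_format(code, formatter="black"))  # requires the `black` package
print(_format(code))                     # auto-select: ruff, then black, then a warning fallback
```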

Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]

2024-04-16 Thread via GitHub


quic-sanirudh merged PR #16876:
URL: https://github.com/apache/tvm/pull/16876


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@tvm.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



(tvm) branch nightly updated (d4056ca795 -> 460f6f1d3e)

2024-04-16 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git


from d4056ca795 [SVE] Support splitting by vscale in `tir::split` and 
`te::split` (#16862)
 add cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887)
 add e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module 
(#16886)
 add 95d6778908 [dlight] Add check for matmul dtype and fix reduction rule 
(#16884)
 add d1ac73ca2d [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888)
 add 3680a0d5a2 [RUNTIME][VULKAN] Support total_global_memory (#16890)
 add 08965f08ff [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul 
(#16892)
 add 4cb4605ba3 [TVMScript][Bug] Add test case for missing symbolic bounds 
(#16877)
 add 94a44d7d62 [QoL][Relax] Return well-formed IR from 
relax::Function::CreateEmpty (#16861)
 add 460f6f1d3e [QoL][Relax] Infer StructInfo for relax::Tuple on 
construction (#16860)

No new revisions were added by this update.

Summary of changes:
 include/tvm/relax/expr.h   |  2 +
 include/tvm/runtime/data_type.h|  3 ++
 python/tvm/contrib/tvmjs.py| 19 +
 python/tvm/dlight/gpu/matmul.py|  3 +-
 python/tvm/dlight/gpu/reduction.py | 16 +++
 python/tvm/relax/backend/contrib/cublas.py | 16 ++-
 python/tvm/relax/frontend/nn/op.py |  6 +--
 python/tvm/relax/transform/legalize_ops/qdq.py | 27 +++-
 src/relax/backend/contrib/utils.h  |  4 ++
 src/relax/ir/expr.cc   | 40 ++---
 src/relax/op/tensor/qdq.cc | 18 +---
 src/runtime/contrib/cublas/cublas.cc   |  9 ++--
 src/runtime/contrib/thrust/thrust.cu   | 14 +-
 src/runtime/vulkan/vulkan_device.cc|  7 ++-
 src/runtime/vulkan/vulkan_device.h |  2 +
 src/runtime/vulkan/vulkan_device_api.cc|  1 +
 src/tir/op/op.cc   |  2 +
 tests/python/relax/test_codegen_cublas.py  | 59 ++
 tests/python/relax/test_expr.py| 19 +
 tests/python/relax/test_frontend_nn_op.py  | 29 +
 tests/python/relax/test_op_qdq.py  | 37 
 tests/python/relax/test_tvmscript_parser.py| 24 +++
 22 files changed, 312 insertions(+), 45 deletions(-)



Re: [PR] [relay][feature] save relay IR as onnx for visualize [tvm]

2024-04-16 Thread via GitHub


ShawnZhuang closed pull request #16847: [relay][feature] save relay IR as onnx 
for visualize
URL: https://github.com/apache/tvm/pull/16847





Re: [PR] [Relax] Handle binary operations between Tensor and PrimValue [tvm]

2024-04-16 Thread via GitHub


tqchen commented on PR #16827:
URL: https://github.com/apache/tvm/pull/16827#issuecomment-2060169629

   thanks @Lunderberg should be good to go after ci





(tvm) branch main updated: [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860)

2024-04-16 Thread wuwei
This is an automated email from the ASF dual-hosted git repository.

wuwei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 460f6f1d3e [QoL][Relax] Infer StructInfo for relax::Tuple on 
construction (#16860)
460f6f1d3e is described below

commit 460f6f1d3e1625882df701252234350f83aa6da1
Author: Eric Lunderberg 
AuthorDate: Tue Apr 16 16:28:00 2024 -0500

[QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860)

Prior to this commit, the `relax::Tuple` constructor left the
`struct_info_` field undefined.  This is inconsistent with other Relax
leaf nodes, such as `relax::PrimValue`, `relax::Constant`, and
`relax::ExternFunc`, which initialize their struct info on
construction.

This commit updates the `relax::Tuple` constructor to define
`struct_info_` as `TupleStructInfo`, if all fields have a known struct
info.  If any field does not have a known struct info, the current
behavior is kept, where `struct_info_` is constructed as `NullOpt`,
and is later populated by the `relax::BlockBuilder`.
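
A short sketch of the resulting behavior at the Python level, mirroring the unit test added in this commit (assumes a TVM build that includes this change):

```python
from tvm import relax as rx

v0 = rx.Var("v0", rx.ObjectStructInfo())
v1 = rx.Var("v1", rx.ObjectStructInfo())
tup = rx.Tuple([v0, v1])

# struct_info_ is now populated at construction, rather than left as NullOpt:
assert tup.struct_info_ is not None
```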
---
 src/relax/ir/expr.cc| 16 
 tests/python/relax/test_expr.py | 19 +++
 2 files changed, 35 insertions(+)

diff --git a/src/relax/ir/expr.cc b/src/relax/ir/expr.cc
index 0530bb770b..dd0f68dca4 100644
--- a/src/relax/ir/expr.cc
+++ b/src/relax/ir/expr.cc
@@ -137,9 +137,25 @@ TVM_REGISTER_GLOBAL("relax.If")
 });
 
 Tuple::Tuple(tvm::Array<Expr> fields, Span span) {
+  Optional<StructInfo> tuple_sinfo = [&]() -> Optional<StructInfo> {
+    Array<StructInfo> field_sinfo;
+    for (const auto& field : fields) {
+      if (field->struct_info_.defined()) {
+        field_sinfo.push_back(GetStructInfo(field));
+      } else {
+        return NullOpt;
+      }
+    }
+    return TupleStructInfo(field_sinfo);
+  }();
+
   ObjectPtr<TupleNode> n = make_object<TupleNode>();
   n->fields = std::move(fields);
   n->span = std::move(span);
+  if (tuple_sinfo) {
+    n->checked_type_ = GetStaticType(tuple_sinfo.value());
+  }
+  n->struct_info_ = tuple_sinfo;
   data_ = std::move(n);
 }
 
diff --git a/tests/python/relax/test_expr.py b/tests/python/relax/test_expr.py
index af1bc851be..b20c9ef2d9 100644
--- a/tests/python/relax/test_expr.py
+++ b/tests/python/relax/test_expr.py
@@ -86,6 +86,25 @@ def test_tuple() -> None:
         t[-3]
 
 
+def test_tuple_sinfo_inferred_on_construction():
+    v0 = rx.Var("v0", rx.ObjectStructInfo())
+    v1 = rx.Var("v1", rx.ObjectStructInfo())
+    tup = rx.Tuple((v0, v1))
+
+    assert tup.struct_info_ is not None
+    tvm.ir.assert_structural_equal(
+        tup.struct_info, rx.TupleStructInfo([rx.ObjectStructInfo(), rx.ObjectStructInfo()])
+    )
+
+
+def test_tuple_sinfo_requires_fields_with_known_sinfo():
+    v0 = rx.Var("v0", rx.ObjectStructInfo())
+    v1 = rx.Var("v1")
+    tup = rx.Tuple((v0, v1))
+
+    assert tup.struct_info_ is None
+
+
 def test_match_cast() -> None:
     # match_cast([16, 8], [m, n])
     m = tir.Var("m", dtype="int64")



Re: [PR] [QoL][Relax] Infer StructInfo for relax::Tuple on construction [tvm]

2024-04-16 Thread via GitHub


vinx13 merged PR #16860:
URL: https://github.com/apache/tvm/pull/16860





(tvm) branch main updated: [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty (#16861)

2024-04-16 Thread wuwei
This is an automated email from the ASF dual-hosted git repository.

wuwei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 94a44d7d62 [QoL][Relax] Return well-formed IR from 
relax::Function::CreateEmpty (#16861)
94a44d7d62 is described below

commit 94a44d7d62206849b891c1c262843d88bfb54c3b
Author: Eric Lunderberg 
AuthorDate: Tue Apr 16 16:26:54 2024 -0500

[QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty 
(#16861)

Prior to this commit, the static method `relax::Function::CreateEmpty`
returned a function with a nullptr as the body.  While only intended for use in
bookkeeping for TVMScript, allowing nullptr in this location can cause
unexpected segfaults while debugging (for example, when adding a print
statement).

This commit updates the `relax::Function::CreateEmpty` function to
contain a placeholder body, consistent with the `ret_struct_info`
argument provided.
---
 include/tvm/relax/expr.h |  2 ++
 src/relax/ir/expr.cc | 24 +++-
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/tvm/relax/expr.h b/include/tvm/relax/expr.h
index 40707675fe..e2176cf720 100644
--- a/include/tvm/relax/expr.h
+++ b/include/tvm/relax/expr.h
@@ -1045,6 +1045,8 @@ class ExternFuncNode : public BaseFuncNode {
 class ExternFunc : public BaseFunc {
  public:
   TVM_DLL ExternFunc(String global_symbol, Span span = Span());
+  TVM_DLL ExternFunc(String global_symbol, StructInfo struct_info, Span span = Span());
+
   TVM_DEFINE_OBJECT_REF_METHODS(ExternFunc, BaseFunc, ExternFuncNode);
   TVM_DEFINE_OBJECT_REF_COW_METHOD(ExternFuncNode);
 };
diff --git a/src/relax/ir/expr.cc b/src/relax/ir/expr.cc
index 1b5551e509..0530bb770b 100644
--- a/src/relax/ir/expr.cc
+++ b/src/relax/ir/expr.cc
@@ -559,10 +559,18 @@ Function Function::CreateEmpty(Array<Var> params, StructInfo ret_struct_info, bo
 
   FuncStructInfo finfo(param_sinfo, ret_struct_info, is_pure);
 
+  // A dummy body, to ensure that the empty function is still well-formed.
+  Expr body = [&]() -> Expr {
+    Var output("output", ret_struct_info);
+    Call expr(ExternFunc("_dummy_function", FuncStructInfo({}, ret_struct_info)), {});
+
+    return SeqExpr({BindingBlock({VarBinding(output, expr)})}, output);
+  }();
+
   // set the fields
   ObjectPtr<FunctionNode> n = make_object<FunctionNode>();
   n->params = std::move(params);
-  n->body = Expr();
+  n->body = std::move(body);
   n->is_pure = is_pure;
   n->checked_type_ = GetStaticType(finfo);
   n->struct_info_ = std::move(finfo);
@@ -602,13 +610,19 @@ FuncStructInfo GetExternFuncStructInfo() {
 
 TVM_REGISTER_NODE_TYPE(ExternFuncNode);
 
-ExternFunc::ExternFunc(String global_symbol, Span span) {
+ExternFunc::ExternFunc(String global_symbol, Span span)
+    : ExternFunc(global_symbol, GetExternFuncStructInfo(), span) {}
+
+ExternFunc::ExternFunc(String global_symbol, StructInfo struct_info, Span span) {
+  CHECK(struct_info.as<FuncStructInfoNode>())
+      << "ExternFunc must have FuncStructInfo, "
+      << "but declaration of '" << global_symbol << "' received " << struct_info;
+
   ObjectPtr<ExternFuncNode> n = make_object<ExternFuncNode>();
   n->global_symbol = std::move(global_symbol);
   n->span = span;
-  static auto sinfo = GetExternFuncStructInfo();
-  n->struct_info_ = sinfo;
-  n->checked_type_ = GetStaticType(sinfo);
+  n->struct_info_ = struct_info;
+  n->checked_type_ = GetStaticType(struct_info);
   data_ = std::move(n);
 }
 



Re: [PR] [QoL][Relax] Return well-formed IR from relax::Function::CreateEmpty [tvm]

2024-04-16 Thread via GitHub


vinx13 merged PR #16861:
URL: https://github.com/apache/tvm/pull/16861





(tvm) branch main updated: [TVMScript][Bug] Add test case for missing symbolic bounds (#16877)

2024-04-16 Thread wuwei
This is an automated email from the ASF dual-hosted git repository.

wuwei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 4cb4605ba3 [TVMScript][Bug] Add test case for missing symbolic bounds 
(#16877)
4cb4605ba3 is described below

commit 4cb4605ba3cb8e083aa0678515bac76ea66471f9
Author: Eric Lunderberg 
AuthorDate: Tue Apr 16 16:25:59 2024 -0500

[TVMScript][Bug] Add test case for missing symbolic bounds (#16877)

Because Relax struct inference is performed while the function is
being built, all constraints on symbolic variables that are used
for simplifications must be provided to the analyzer.  This is not
currently the case, nor is there a clear way to fix this issue.
---
 tests/python/relax/test_tvmscript_parser.py | 24 
 1 file changed, 24 insertions(+)

diff --git a/tests/python/relax/test_tvmscript_parser.py b/tests/python/relax/test_tvmscript_parser.py
index e692768a12..64014d1c49 100644
--- a/tests/python/relax/test_tvmscript_parser.py
+++ b/tests/python/relax/test_tvmscript_parser.py
@@ -2293,5 +2293,29 @@ def test_function_attributes_are_defined():
     assert func.attrs is not None
 
 
+@pytest.mark.xfail(reason="Bug: Implicit bounds not provided when parsing")
+def test_function_symbolic_variables_are_annotated():
+    """Symbolic variables must be exposed for struct inference
+
+    Because Relax struct inference is performed while the function is
+    being built, all constraints on symbolic variables that are used
+    for simplifications must be provided to the analyzer.
+    """
+
+    @R.function(private=True)
+    def inferred_sinfo(A: R.Tensor(["extent"])):
+        extent = T.int64()
+        output = R.strided_slice(A, [0], [0], [extent - 1])
+        return output
+
+    @R.function(private=True)
+    def expected(A: R.Tensor(["extent"])) -> R.Tensor(["extent-1"]):
+        extent = T.int64()
+        output: R.Tensor([extent - 1]) = R.strided_slice(A, [0], [0], [extent - 1])
+        return output
+
+    tvm.ir.assert_structural_equal(inferred_sinfo, expected)
+
+
 if __name__ == "__main__":
     tvm.testing.main()



Re: [PR] [TVMScript][Bug] Add test case for missing symbolic bounds [tvm]

2024-04-16 Thread via GitHub


vinx13 merged PR #16877:
URL: https://github.com/apache/tvm/pull/16877





[PR] [BYOC] Add layout check and update shape check for cublas FP8 BYOC [tvm]

2024-04-16 Thread via GitHub


vinx13 opened a new pull request, #16895:
URL: https://github.com/apache/tvm/pull/16895

   This adds an additional check for the cuBLAS BYOC, as it only supports the TN layout. It also changes the shape requirement to a multiple of 16 bytes (see the sketch below).
   
   cc @tqchen @ibsidorenko 
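
A hedged sketch of the shape rule described above, under the assumption that "multiple of 16 bytes" constrains a matrix dimension of the 1-byte fp8 operands:

```python
elem_bytes = 1  # e4m3_float8 is one byte per element (assumed interpretation)
for dim in (48, 50):
    ok = (dim * elem_bytes) % 16 == 0
    print(f"dim={dim}: {'accepted' if ok else 'rejected'} by the new check")
```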





(tvm) branch main updated: [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892)

2024-04-16 Thread wuwei
This is an automated email from the ASF dual-hosted git repository.

wuwei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 08965f08ff [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul 
(#16892)
08965f08ff is described below

commit 08965f08ff0f6a8d34d45f8275c4aa78b04c90ee
Author: Ivan Sidorenko <98739392+ibsidore...@users.noreply.github.com>
AuthorDate: Tue Apr 16 23:11:01 2024 +0300

[CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul (#16892)

This commit replaces the fp16 compute and scale dtypes with fp32 in
cublas matmul.
---
 src/runtime/contrib/cublas/cublas.cc | 6 --
 1 file changed, 6 deletions(-)

diff --git a/src/runtime/contrib/cublas/cublas.cc b/src/runtime/contrib/cublas/cublas.cc
index 49aa35a7e0..553d4014c0 100644
--- a/src/runtime/contrib/cublas/cublas.cc
+++ b/src/runtime/contrib/cublas/cublas.cc
@@ -150,8 +150,6 @@ void CallCublasLt(cublasLtHandle_t hdl, cudaStream_t stream,
   cudaDataType_t c_type = CUDA_R_32F;
   float one_fp32 = 1.0;
   float zero_fp32 = 0.0;
-  auto one_fp16 = __truncXfYf2__<float, uint32_t, 23, uint16_t, uint16_t, 10>(1.0);
-  auto zero_fp16 = __truncXfYf2__<float, uint32_t, 23, uint16_t, uint16_t, 10>(0.0);
   int32_t one_i32 = 1;
   int32_t zero_i32 = 0;
   void* alpha = &one_fp32;
@@ -168,10 +166,6 @@ void CallCublasLt(cublasLtHandle_t hdl, cudaStream_t stream,
 
   if (TypeMatch(C->dtype, kDLFloat, 16)) {
     c_type = CUDA_R_16F;
-    compute_type = CUBLAS_COMPUTE_16F;
-    scale_type = CUDA_R_16F;
-    alpha = &one_fp16;
-    beta = &zero_fp16;
   } else if (TypeMatch(C->dtype, kDLInt, 32)) {
     c_type = CUDA_R_32I;
     compute_type = CUBLAS_COMPUTE_32I;



Re: [PR] [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul [tvm]

2024-04-16 Thread via GitHub


vinx13 merged PR #16892:
URL: https://github.com/apache/tvm/pull/16892





[PR] [Dlight] Enhance vectorization for gpu matmul [tvm]

2024-04-16 Thread via GitHub


vinx13 opened a new pull request, #16894:
URL: https://github.com/apache/tvm/pull/16894

   This fixes the dlight matmul rule to make sure vectorization is always applied to the innermost loop
   
   cc @Hzfengsy @tqchen 





Re: [PR] [RUNTIME][VULKAN] Support total_global_memory [tvm]

2024-04-16 Thread via GitHub


MasterJH5574 merged PR #16890:
URL: https://github.com/apache/tvm/pull/16890





(tvm) branch main updated: [RUNTIME][VULKAN] Support total_global_memory (#16890)

2024-04-16 Thread ruihangl
This is an automated email from the ASF dual-hosted git repository.

ruihangl pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new 3680a0d5a2 [RUNTIME][VULKAN] Support total_global_memory (#16890)
3680a0d5a2 is described below

commit 3680a0d5a23da22124c17a845a39f3ae36b70ca3
Author: Tianqi Chen 
AuthorDate: Tue Apr 16 15:48:41 2024 -0400

[RUNTIME][VULKAN] Support total_global_memory (#16890)

This PR supports total_global_memory query for vulkan devices.
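
A minimal usage sketch for the new query (assumes a Vulkan-enabled TVM build with at least one Vulkan device, and that the device attribute is exposed in Python as `total_global_memory`):

```python
import tvm

dev = tvm.vulkan(0)
print(dev.total_global_memory)  # total device-local memory, in bytes
```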
---
 src/runtime/vulkan/vulkan_device.cc | 7 +--
 src/runtime/vulkan/vulkan_device.h  | 2 ++
 src/runtime/vulkan/vulkan_device_api.cc | 1 +
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/runtime/vulkan/vulkan_device.cc b/src/runtime/vulkan/vulkan_device.cc
index 7c5ac55f0b..cc39972432 100644
--- a/src/runtime/vulkan/vulkan_device.cc
+++ b/src/runtime/vulkan/vulkan_device.cc
@@ -293,7 +293,7 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_
 
   for (uint32_t k = 0; k < prop.memoryTypeCount; ++k) {
     VkMemoryType ty = prop.memoryTypes[k];
-    size_t heap_size = prop.memoryHeaps[ty.heapIndex].size;
+    int64_t heap_size = static_cast<int64_t>(prop.memoryHeaps[ty.heapIndex].size);
     // host visible
     if (!(ty.propertyFlags & VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT)) continue;
     // match copy requirment
@@ -312,7 +312,7 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_
   win_rank = -1;
   for (uint32_t k = 0; k < prop.memoryTypeCount; ++k) {
     VkMemoryType ty = prop.memoryTypes[k];
-    size_t heap_size = prop.memoryHeaps[ty.heapIndex].size;
+    int64_t heap_size = static_cast<int64_t>(prop.memoryHeaps[ty.heapIndex].size);
     // host visible
     if (!(ty.propertyFlags & VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT)) continue;
     // match copy requirment
@@ -324,8 +324,10 @@ VulkanDevice::VulkanDevice(const VulkanInstance& instance, VkPhysicalDevice phy_
     if (rank > win_rank) {
       win_rank = rank;
       compute_mtype_index = k;
+      compute_memory_size = heap_size;
     }
   }
+
   ICHECK_GE(win_rank, 0) << "Cannot find suitable local memory on device.";
 
   if (device_properties.supports_push_descriptor) {
@@ -383,6 +385,7 @@ void VulkanDevice::do_swap(VulkanDevice&& other) {
   std::swap(queue_insert_debug_utils_label_functions,
             other.queue_insert_debug_utils_label_functions);
   std::swap(compute_mtype_index, other.compute_mtype_index);
+  std::swap(compute_memory_size, other.compute_memory_size);
   std::swap(queue, other.queue);
   std::swap(queue_family_index, other.queue_family_index);
   std::swap(physical_device_, other.physical_device_);
diff --git a/src/runtime/vulkan/vulkan_device.h b/src/runtime/vulkan/vulkan_device.h
index 296483a6b1..0573a00e5c 100644
--- a/src/runtime/vulkan/vulkan_device.h
+++ b/src/runtime/vulkan/vulkan_device.h
@@ -223,6 +223,8 @@ class VulkanDevice {
       queue_insert_debug_utils_label_functions{nullptr};
   // Memory type index for compute
   uint32_t compute_mtype_index{0};
+  // maximum memory size for compute
+  int64_t compute_memory_size{0};
 
   // queue family_index;
   uint32_t queue_family_index{uint32_t(-1)};
diff --git a/src/runtime/vulkan/vulkan_device_api.cc b/src/runtime/vulkan/vulkan_device_api.cc
index 18a40bf54f..4b337dd524 100644
--- a/src/runtime/vulkan/vulkan_device_api.cc
+++ b/src/runtime/vulkan/vulkan_device_api.cc
@@ -165,6 +165,7 @@ void VulkanDeviceAPI::GetAttr(Device dev, DeviceAttrKind kind, TVMRetValue* rv)
       break;
 
     case kTotalGlobalMemory: {
+      *rv = device(index).compute_memory_size;
       return;
     }
   }



Re: [PR] [Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features [tvm]

2024-04-16 Thread via GitHub


tqchen commented on PR #16425:
URL: https://github.com/apache/tvm/pull/16425#issuecomment-2059781680

   I am also getting additional errors like
   
   ```
   python test.py 
   [15:23:28] /home/tqchen/github/tvm/src/target/parsers/aprofile.cc:97: 
Warning: Cannot parse target features. LLVM was not compiled with support for 
Arm(R)-based targets.
   [15:23:28] /home/tqchen/github/tvm/src/target/parsers/aprofile.cc:97: 
Warning: Cannot parse target features. LLVM was not compiled with support for 
Arm(R)-based targets.
   ```
   
   this is on an LLVM that was built for ROCm. We should not send out an error message at static loading time if the ARM target is not used, and should only emit such an error message when we attempt to use tags in aprofile
   





Re: [PR] [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC [tvm]

2024-04-16 Thread via GitHub


vinx13 merged PR #16888:
URL: https://github.com/apache/tvm/pull/16888





(tvm) branch main updated: [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888)

2024-04-16 Thread wuwei
This is an automated email from the ASF dual-hosted git repository.

wuwei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new d1ac73ca2d [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888)
d1ac73ca2d is described below

commit d1ac73ca2d3c14dc69e47818871478e8b0f295aa
Author: Ivan Sidorenko <98739392+ibsidore...@users.noreply.github.com>
AuthorDate: Tue Apr 16 21:55:11 2024 +0300

[CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#16888)

[CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC (#63)

Co-authored-by: Andrey Malyshev 
---
 include/tvm/runtime/data_type.h|  3 ++
 python/tvm/contrib/tvmjs.py| 19 +
 python/tvm/relax/backend/contrib/cublas.py | 16 ++-
 python/tvm/relax/transform/legalize_ops/qdq.py | 27 +++-
 src/relax/backend/contrib/utils.h  |  4 ++
 src/relax/op/tensor/qdq.cc | 18 +---
 src/runtime/contrib/cublas/cublas.cc   |  3 ++
 src/tir/op/op.cc   |  2 +
 tests/python/relax/test_codegen_cublas.py  | 59 ++
 tests/python/relax/test_op_qdq.py  | 37 
 10 files changed, 169 insertions(+), 19 deletions(-)

diff --git a/include/tvm/runtime/data_type.h b/include/tvm/runtime/data_type.h
index f7284ec690..a330ccbbdf 100644
--- a/include/tvm/runtime/data_type.h
+++ b/include/tvm/runtime/data_type.h
@@ -126,6 +126,9 @@ class DataType {
             code() == DataType::kE5M2Float) &&
            bits() == 8;
   }
+  bool is_e4m3_float8() const { return (code() == DataType::kE4M3Float && bits() == 8); }
+
+  bool is_e5m2_float8() const { return (code() == DataType::kE5M2Float && bits() == 8); }
   /*! \return whether type is a float16 type. */
   bool is_float16() const { return is_float() && bits() == 16; }
   /*! \return whether type is a bfloat16 type. */
diff --git a/python/tvm/contrib/tvmjs.py b/python/tvm/contrib/tvmjs.py
index 8d8bd1b051..923301a1f5 100644
--- a/python/tvm/contrib/tvmjs.py
+++ b/python/tvm/contrib/tvmjs.py
@@ -28,6 +28,11 @@ from typing import Iterator, Mapping, Tuple, Union
 
 import numpy as np
 
+try:
+    import ml_dtypes
+except ImportError:
+    ml_dtypes = None
+
 import tvm
 from tvm._ffi.libinfo import find_lib_path
 
@@ -295,6 +300,20 @@ def load_ndarray_cache(cachepath: str, device: tvm.runtime.Device):
         arr = tvm.nd.empty(shape, dtype, device=device)
         assert offset + nbytes <= len(raw_data)
         buffer_source = raw_data[offset : offset + nbytes]
+        if dtype == "e4m3_float8":
+            if ml_dtypes is not None:
+                dtype = ml_dtypes.float8_e4m3fn
+            else:
+                raise RuntimeError(
+                    "ml_dtypes is not installed, cannot convert e4m3_float8 array to numpy."
+                )
+        if dtype == "e5m2_float8":
+            if ml_dtypes is not None:
+                dtype = ml_dtypes.float8_e5m2
+            else:
+                raise RuntimeError(
+                    "ml_dtypes is not installed, cannot convert e5m2_float8 array to numpy."
+                )
         if encode_format == "f32-to-bf16" and dtype == "float32":
             data = np.frombuffer(buffer_source, dtype="uint16").reshape(shape)
             arr.copyfrom(_convert_bf16_to_f32(data))
diff --git a/python/tvm/relax/backend/contrib/cublas.py b/python/tvm/relax/backend/contrib/cublas.py
index eecd531e74..f66001d0e8 100644
--- a/python/tvm/relax/backend/contrib/cublas.py
+++ b/python/tvm/relax/backend/contrib/cublas.py
@@ -28,8 +28,11 @@ from ..patterns import make_matmul_pattern
 from ..utils import has_leaking_intermediate_variables
 
 
-def _is_supported_dtype(lhs_dtype, rhs_dtype):
+def _is_supported_dtype(lhs_dtype, rhs_dtype, out_dtype):
     """Check if dtypes in the given workload are supported by cuBLAS BYOC."""
+    if lhs_dtype == "e4m3_float8" and rhs_dtype == "e4m3_float8":
+        # The output cannot be 'e5m2_float8' if inputs are 'e4m3_float8'
+        return out_dtype != "e5m2_float8"
     return (
         (lhs_dtype == "float16" and rhs_dtype == "float16")
         or (lhs_dtype == "float32" and rhs_dtype == "float32")
@@ -42,10 +45,12 @@ def _check_matmul(context: PatternCheckContext) -> bool:
         return False
     lhs = context.annotated_expr["lhs"]
     rhs = context.annotated_expr["rhs"]
+    matmul_call = context.annotated_expr["root"]
 
     lhs_dtype = lhs.struct_info.dtype
     rhs_dtype = rhs.struct_info.dtype
-    if not _is_supported_dtype(lhs_dtype, rhs_dtype):
+    out_dtype = matmul_call.struct_info.dtype
+    if not _is_supported_dtype(lhs_dtype, rhs_dtype, out_dtype):
         return False
 
     lhs_shape = lhs.struct_info.shape.values
@@ -62,6 +67,13 @@ def _check_matmul(context: 
Re: [I] [Bug] Inconsistent Results between Direct Optimization and Sequential Optimization in TVM [tvm]

2024-04-16 Thread via GitHub


vinx13 commented on issue #16870:
URL: https://github.com/apache/tvm/issues/16870#issuecomment-2059600023

   If you call a pass directly (instead of using `Sequential`), it will bypass the checks for `opt_level`, `required_pass`, etc.
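
A sketch of the distinction described above (Relay pass infrastructure; `FoldConstant` is used here only as an example pass):

```python
import tvm
from tvm import relay

x = relay.add(relay.const(1.0), relay.const(2.0))
mod = tvm.IRModule.from_expr(relay.Function([], x))

# Direct invocation: runs unconditionally, bypassing the opt_level gating.
mod_direct = relay.transform.FoldConstant()(mod)

# Through Sequential: gated by the PassContext, so at opt_level=0 the pass is skipped.
seq = tvm.transform.Sequential([relay.transform.FoldConstant()])
with tvm.transform.PassContext(opt_level=0):
    mod_seq = seq(mod)
```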





Re: [PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]

2024-04-16 Thread via GitHub


ekalda commented on PR #16893:
URL: https://github.com/apache/tvm/pull/16893#issuecomment-2059426390

   @Lunderberg @lhutton1 
   
   This is a follow up from https://github.com/apache/tvm/pull/16782, it makes 
sure that scalable vectors are created only for SVE target. In this 
implementation we check for target in function attributes, so we'd rely on 
`BindTarget` having run before. Alternatively we could access the "global" 
target via `Target::Current()`, which would be simpler, but would make it 
harder to mix SVE and non-SVE targets. I'm happy to go with either option :) 





Re: [PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]

2024-04-16 Thread via GitHub


ekalda commented on code in PR #16893:
URL: https://github.com/apache/tvm/pull/16893#discussion_r1567592059


##
src/tir/transforms/vectorize_loop.cc:
##
@@ -725,17 +728,33 @@ class Vectorizer : public StmtMutator, public ExprFunctor<PrimExpr(const PrimExpr&)>
+    auto target = func->attrs.GetAttr<Target>(tvm::attr::kTarget);
+    if (target.defined()) {
+      target_ = Downcast<Target>(target);
+      has_sve_ = target_->GetFeature<Bool>("has_sve").value_or(Bool(false));
+    }
+  }
+
   Stmt VisitStmt_(const ForNode* op) final {
     if (op->kind == ForKind::kVectorized) {
+      auto* extent_as_int = op->extent.as<IntImmNode>();
+      if (!extent_as_int || extent_as_int->value < 1) {
+        bool is_scalable_expr = CheckContains::ExprContains(op->extent, arith::IsVScaleCall);
+        ICHECK(is_scalable_expr && has_sve_)
+            << "Failed to vectorize loop with extent " << op->extent << " for target " << target_;
+      }
       ICHECK(is_zero(op->min));
       return Vectorizer(op->loop_var, op->extent)(op->body);
     } else {
       return StmtMutator::VisitStmt_(op);
     }
   }
-};
 
-Stmt VectorizeLoop(Stmt stmt) { return LoopVectorizer()(std::move(stmt)); }

Review Comment:
   Is this function necessary? 






[PR] [SVE] Check for SVE target in func_attr from VectorizeLoop [tvm]

2024-04-16 Thread via GitHub


ekalda opened a new pull request, #16893:
URL: https://github.com/apache/tvm/pull/16893

   Check that we are compiling for an SVE-enabled target when the extent of a loop marked for vectorization is a vscale-dependent expression.
   
   1. Add a call to BindTarget in the Graph Executor pipeline (to enable running the tvmc flow)
   
   2. Check in LoopVectorize that the extent is either a positive integer or a vscale-dependent expression, in which case we'd expect an SVE-enabled target to be present in func_attr (see the sketch below)
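
For reference, a sketch of the kind of loop point 2 is guarding, assuming TVMScript exposes the vscale intrinsic as `T.vscale()` as in the SVE test suite:

```python
from tvm.script import tir as T

@T.prim_func
def scalable_copy(A: T.Buffer((256,), "float32"), B: T.Buffer((256,), "float32")):
    # extent 4 * vscale is not a positive IntImm, so the new check requires SVE
    for i in T.vectorized(4 * T.vscale()):
        B[i] = A[i]
```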





[PR] [CUBLAS] Set fp32 compute and scale dtypes in fp16 matmul [tvm]

2024-04-16 Thread via GitHub


ibsidorenko opened a new pull request, #16892:
URL: https://github.com/apache/tvm/pull/16892

   This is an attempt to bring this 
[commit](https://github.com/octoml/tvm/commit/f21b9c9c561e7bcb7a81ae12d71568c6e7c1fc49) 
over and align `octoml/tvm` with `apache/tvm`.
   This commit replaces the fp16 compute and scale dtypes with fp32 in cublas 
matmul.
   
   According to the cuBLAS 
[docs](https://docs.nvidia.com/cuda/cublas/index.html#cublasltmatmul) there are 
two possible options for the compute/scale dtype when the input/output dtype is fp16:
   1. compute dtype is `fp16` and scale dtype is `fp16`
   2. compute dtype is `fp32` and scale dtype is `fp32`
   
   By default, we use 1) in apache/tvm and 2) in octoml/tvm. This commit aligns 
the differing behaviour and sets `fp32` as the default.
   
   cc @vinx13 @masahi 





[I] [Bug] `MatMul` operator in TVM seems fragile [tvm]

2024-04-16 Thread via GitHub


shaoyuyoung opened a new issue, #16891:
URL: https://github.com/apache/tvm/issues/16891

   TVM seems to have strict restrictions on the `MatMul` operator, which means that **it cannot multiply tensors with different shapes**.
   
   Look at this simple graph. In Pytorch, the model is **correctly defined** 
and the input and output shapes are exactly as shown below.
   The evidence is here: 
[https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul](https://pytorch.org/docs/stable/generated/torch.matmul.html#torch.matmul)
   
![image](https://github.com/apache/tvm/assets/100203773/b4c812e8-5fb5-4b33-90cf-380a03e26b6c)
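
For reference, the broadcasting behaviour the issue relies on can be checked directly in PyTorch (shapes taken from the error message below):

```python
import torch

x = torch.randn(1, 5, 5)  # batch dim 1 broadcasts against 5
y = torch.randn(5, 5, 4)
print(torch.matmul(x, y).shape)  # torch.Size([5, 5, 4])
```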
   
   
   When I try to convert `ONNX` to `TVM`, I get an error indicating that the tensor shapes are inconsistent. However, when converting `PyTorch` to `TVM`, everything is OK!
   
   I guess one possible reason is that `TorchScript` plays a role in this but 
`ONNX` does not.

   Moreover, look at the last line of the error message. I wonder why 
**T.int64(1)** is used here. It seems that TVM has a pretty fragile system of 
`int64`.
   
   
![image](https://github.com/apache/tvm/assets/100203773/e62e8462-212b-4d0b-ac5f-826be20e4557)
   
   ### Expected behavior
   Compilation should pass, since both ONNX and PyTorch can produce results for this model.
   
   ### Actual behavior
   **Compilation failure**
   ```
   Traceback (most recent call last):
 18: 
tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::transform::__mk_TVM9::{lambda(tvm::transform::Pass, 
tvm::IRModule)#1}, std::__cxx11::basic_string, 
std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, 
tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, 
std::__cxx11::basic_string, std::allocator 
>, tvm::runtime::TVMRetValue)
 17: tvm::transform::Pass::operator()(tvm::IRModule) const
 16: tvm::transform::Pass::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 15: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 14: _ZN3tvm7runtime13PackedFun
 13: tvm::runtime::TypedPackedFunc::AssignTypedLambda(tvm::relay::transform::DynamicToStatic()::{lambda(tvm::relay::Function,
 tvm::IRModule, tvm::transform::PassContext)#1})::{lambda(tvm::runtime::TVMArgs 
const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs 
const&, tvm::runtime::TVMRetValue*) const
 12: tvm::relay::DynamicToStatic(tvm::relay::Function, tvm::IRModule)
 11: tvm::relay::DynamicToStaticMutator::PrepareInput(tvm::RelayExpr const&)
 10: tvm::transform::Pass::operator()(tvm::IRModule) const
 9: tvm::transform::Pass::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 8: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 7: tvm::transform::Pass::operator()(tvm::IRModule) const
 6: tvm::transform::Pass::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 5: tvm::transform::ModulePassNode::operator()(tvm::IRModule, 
tvm::transform::PassContext const&) const
 4: 
tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::relay::transform::InferType()::{lambda(tvm::IRModule, 
tvm::transform::PassContext const&)#1})::{lambda(tvm::runtime::TVMArgs const&, 
tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, 
tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
 3: tvm::relay::TypeInferencer::Infer(tvm::GlobalVar, tvm::relay::Function)
 2: tvm::relay::TypeSolver::Solve()
 1: 
tvm::runtime::PackedFuncObj::Extractor const&, int, tvm::Attrs const&, 
tvm::TypeReporter const&)>::AssignTypedLambda const&, int, tvm::Attrs const&, 
tvm::TypeReporter const&)>(bool (*)(tvm::runtime::Array 
const&, int, tvm::Attrs const&, tvm::TypeReporter 
const&))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> 
>::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, 
tvm::runtime::TVMRetValue*)
 0: bool 
tvm::relay::BatchMatmulRel(tvm::runtime::Array const&, int, tvm::Attrs const&, tvm::TypeReporter const&)
 File 
"/root/anaconda3/conda-bld/tvm-package_1701590675822/work/src/relay/op/nn/nn.h",
 line 212
   InternalError: Check failed: (reporter->AssertEQ(xk, yk)) is false: 
BatchDot: shapes of x and y is inconsistent,  x shape=[T.int64(1), 5, 5], y 
shape=[5, 5, 4]
   
   ```
   
   
   ### Environment
   
   Operating System: Ubuntu 18
   TVM:0.15
   Torch: 2.1.1
   ONNX: 1.15.0
   
   ### Steps to reproduce
   Here is the script:
   ```python
   import torch
   import torch.nn as nn
   import tvm
   from tvm import relay
   import onnx
   
   class DirectMatMulModel(nn.Module):
       def __init__(self):
           super(DirectMatMulModel, self).__init__()
   
       def forward(self, x1, x2, y1, y2):
           result1 = torch.matmul(x1, x2)
           result2 = torch.matmul(y1, y2)
           final_result = torch.matmul(result1, result2)
   

Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]

2024-04-16 Thread via GitHub


quic-sanirudh commented on code in PR #16876:
URL: https://github.com/apache/tvm/pull/16876#discussion_r1567479484


##
python/tvm/script/highlight.py:
##
@@ -92,7 +95,73 @@ def cprint(
     print(highlight(printable, Python3Lexer(), Terminal256Formatter(style=style)))
 
 
-def _format(code_str: str) -> str:
+@functools.lru_cache
+def _get_formatter(formatter: Optional[str] = None):
+    def get_ruff_formatter():
+        if shutil.which("ruff") is None:
+            return None
+
+        def formatter(code_str):
+            proc = subprocess.Popen(
+                ["ruff", "format", "--stdin-filename=TVMScript"],
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                encoding="utf-8",
+            )
+            stdout, _stderr = proc.communicate(code_str)
+            return stdout
+
+        return formatter
+
+    def get_black_formatter():
+        try:
+            # pylint: disable=import-outside-toplevel
+            import black
+        except ImportError:
+            return None
+
+        def formatter(code_str):
+            return black.format_str(code_str, mode=black.FileMode())
+
+        return formatter
+
+    def get_fallback_formatter():
+        def formatter(code_str):
+            with warnings.catch_warnings():
+                warnings.simplefilter("once", UserWarning)
+                ruff_install_cmd = sys.executable + " -m pip install ruff"

Review Comment:
   Right yeah, that makes more sense. It is very hard to keep them in sync. 
Perhaps after we remove the version info from the warning message, we can just 
add a line saying something like **"For version compatibility with CI, please 
check docker/Dockerfile.ci_lint"** if the version info has to be specified.
   
   Thanks for the response.






Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]

2024-04-16 Thread via GitHub


Lunderberg commented on code in PR #16876:
URL: https://github.com/apache/tvm/pull/16876#discussion_r1567426092


##
python/tvm/script/highlight.py:
##
@@ -92,7 +95,73 @@ def cprint(
     print(highlight(printable, Python3Lexer(), Terminal256Formatter(style=style)))
 
 
-def _format(code_str: str) -> str:
+@functools.lru_cache
+def _get_formatter(formatter: Optional[str] = None):
+    def get_ruff_formatter():
+        if shutil.which("ruff") is None:
+            return None
+
+        def formatter(code_str):
+            proc = subprocess.Popen(
+                ["ruff", "format", "--stdin-filename=TVMScript"],
+                stdin=subprocess.PIPE,
+                stdout=subprocess.PIPE,
+                encoding="utf-8",
+            )
+            stdout, _stderr = proc.communicate(code_str)
+            return stdout
+
+        return formatter
+
+    def get_black_formatter():
+        try:
+            # pylint: disable=import-outside-toplevel
+            import black
+        except ImportError:
+            return None
+
+        def formatter(code_str):
+            return black.format_str(code_str, mode=black.FileMode())
+
+        return formatter
+
+    def get_fallback_formatter():
+        def formatter(code_str):
+            with warnings.catch_warnings():
+                warnings.simplefilter("once", UserWarning)
+                ruff_install_cmd = sys.executable + " -m pip install ruff"

Review Comment:
   I'm uncertain, but would lean instead toward removing the specific version 
of `black`, if we want consistency.  Specifying a specific version of `black` 
is useful, since then it can be matched to the CI's version, in case a 
developer also uses `black` either for format-on-save or as a pre-commit hook.  
However, the version number in the warning has become out of sync with the CI's 
version (warning message was added in November 2022, but the CI was bumped from 
`22.3.0` to `22.12.0` in March 2023).
   
   Since there isn't really a good way to keep them in sync (nor any `ruff` 
version in the CI with which to be in sync), I'd lean toward removing the 
versions from the warning message altogether.






Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]

2024-04-16 Thread via GitHub


tqchen commented on PR #108:
URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058952581

   Thanks for the note. We are in the process of revamping docs. The latest set 
of emerging model optimizations like LLMs will be based on relax. 
https://github.com/apache/tvm/tree/main/python/tvm/relax/frontend/onnx likely 
is a good reference there





Re: [PR] [dlight] Add check for matmul dtype and fix reduction rule [tvm]

2024-04-16 Thread via GitHub


tqchen merged PR #16884:
URL: https://github.com/apache/tvm/pull/16884





(tvm) branch main updated: [Contrib] Enable fp16 for thrust sort (#16887)

2024-04-16 Thread tqchen
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


The following commit(s) were added to refs/heads/main by this push:
 new cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887)
cdfdd0e4ec is described below

commit cdfdd0e4ec7452bedf4e79ba0ff474d2de70bbbf
Author: Siyuan Feng 
AuthorDate: Tue Apr 16 20:13:21 2024 +0800

[Contrib] Enable fp16 for thrust sort (#16887)

[Contrib] Enable fp16 for thrust

Enable fp16 for thrust to support LLM cases
---
 src/runtime/contrib/thrust/thrust.cu | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/runtime/contrib/thrust/thrust.cu b/src/runtime/contrib/thrust/thrust.cu
index 28edba64aa..048df518e3 100644
--- a/src/runtime/contrib/thrust/thrust.cu
+++ b/src/runtime/contrib/thrust/thrust.cu
@@ -167,7 +167,19 @@ void thrust_sort(DLTensor* input, DLTensor* out_values, DLTensor* out_indices, b
 void thrust_sort_common(DLTensor* input, DLTensor* values_out, DLTensor* indices_out,
                         bool is_ascend, int sort_len, std::string data_dtype, std::string out_dtype,
                         DLTensor* workspace) {
-  if (data_dtype == "float32") {
+  if (data_dtype == "float16") {
+    if (out_dtype == "int32") {
+      thrust_sort<half, int32_t>(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "int64") {
+      thrust_sort<half, int64_t>(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "float32") {
+      thrust_sort<half, float>(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else if (out_dtype == "float64") {
+      thrust_sort<half, double>(input, values_out, indices_out, is_ascend, sort_len, workspace);
+    } else {
+      LOG(FATAL) << "Unsupported output dtype: " << out_dtype;
+    }
+  } else if (data_dtype == "float32") {
     if (out_dtype == "int32") {
       thrust_sort<float, int32_t>(input, values_out, indices_out, is_ascend, sort_len, workspace);
     } else if (out_dtype == "int64") {


(tvm) branch main updated (e738f1d4f1 -> 95d6778908)

2024-04-16 Thread tqchen
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module 
(#16886)
 add 95d6778908 [dlight] Add check for matmul dtype and fix reduction rule 
(#16884)

No new revisions were added by this update.

Summary of changes:
 python/tvm/dlight/gpu/matmul.py|  3 ++-
 python/tvm/dlight/gpu/reduction.py | 16 
 2 files changed, 10 insertions(+), 9 deletions(-)



(tvm) branch main updated (cdfdd0e4ec -> e738f1d4f1)

2024-04-16 Thread tqchen
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git


from cdfdd0e4ec [Contrib] Enable fp16 for thrust sort (#16887)
 add e738f1d4f1 [Relax][Frontend] Fix sort, argsort and topk in nn module 
(#16886)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relax/frontend/nn/op.py|  6 +++---
 tests/python/relax/test_frontend_nn_op.py | 29 +
 2 files changed, 32 insertions(+), 3 deletions(-)



Re: [PR] [Relax][Frontend] Fix sort, argsort and topk in nn module [tvm]

2024-04-16 Thread via GitHub


tqchen merged PR #16886:
URL: https://github.com/apache/tvm/pull/16886





Re: [PR] [Contrib] Enable fp16 for thrust sort [tvm]

2024-04-16 Thread via GitHub


tqchen merged PR #16887:
URL: https://github.com/apache/tvm/pull/16887





[I] [Bug] Init block not discoverable after sch.blockize [tvm]

2024-04-16 Thread via GitHub


nautasolva opened a new issue, #16889:
URL: https://github.com/apache/tvm/issues/16889

   When used on a block with an init statement, blockize creates a separate init 
block that is not discoverable by any means. This hinders further scheduling, 
like tensorizing the init block.
   
   ### Expected behavior
   
   When using `blockize` on a loop that contains an init statement, the init is 
moved to a new block `<block_name>_init` that should be discoverable with 
`get_block` or `get_children_blocks` on the newly created outer block.
   
   ### Actual behavior
   
   The init block exists in the TIR module but does not seem to be registered by 
the schedule: `get_block("<block_name>_init")` fails with `InternalError: Check 
failed: (it != self_->stmt2ref.end()) is false`
   
   Stacktrace
   
   
   Traceback (most recent call last):
 File "/home/dev/tvm_upstream/../tvm/playground/blockize_init_bug.py", line 
31, in 
   a_init = sch.get_block("A_init")
 File "/home/dev/tvm_upstream/python/tvm/tir/schedule/_type_checker.py", 
line 340, in wrap
   return func(*args, **kwargs)
 File "/home/dev/tvm_upstream/python/tvm/tir/schedule/schedule.py", line 
499, in get_block
   return _ffi_api.ScheduleGetBlock(  # type: ignore # pylint: 
disable=no-member
 File "/home/dev/tvm_upstream/python/tvm/_ffi/_ctypes/packed_func.py", line 
239, in __call__
   raise_last_ffi_error()
 File "/home/dev/tvm_upstream/python/tvm/_ffi/base.py", line 481, in 
raise_last_ffi_error
   raise py_err
 File "/home/dev/tvm_upstream/src/tir/schedule/traced_schedule.cc", line 
128, in tvm::tir::TracedScheduleNode::GetBlock(tvm::runtime::String const&, 
tvm::runtime::Optional const&)
   BlockRV result = ConcreteScheduleNode::GetBlock(name, func_name);
 File "/home/dev/tvm_upstream/src/tir/schedule/concrete_schedule.cc", line 
321, in tvm::tir::ConcreteScheduleNode::GetBlock(tvm::runtime::String const&, 
tvm::runtime::Optional const&)
   Array blocks = tir::GetBlocks(this->state_, name, gv);
 File 
"/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 46, 
in tvm::tir::GetBlocks(tvm::tir::ScheduleState const&, tvm::runtime::String 
const&, tvm::GlobalVar const&)
   finder(prim_func->body);
 File "/home/dev/tvm_upstream/src/tir/ir/stmt_functor.cc", line 142, in 
tvm::tir::StmtVisitor::VisitStmt_(tvm::tir::BlockNode const*)
   this->VisitStmt(op->init.value());
 File 
"/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 29, 
in VisitStmt_
   void VisitStmt_(const BlockNode* block) override {
 File 
"/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 32, 
in VisitStmt_
   ICHECK(it != self_->stmt2ref.end());
   tvm.error.InternalError: Traceback (most recent call last):
 5: tvm::tir::TracedScheduleNode::GetBlock(tvm::runtime::String const&, 
tvm::runtime::Optional const&)
   at /home/dev/tvm_upstream/src/tir/schedule/traced_schedule.cc:128
 4: tvm::tir::ConcreteScheduleNode::GetBlock(tvm::runtime::String const&, 
tvm::runtime::Optional const&)
   at /home/dev/tvm_upstream/src/tir/schedule/concrete_schedule.cc:321
 3: tvm::tir::GetBlocks(tvm::tir::ScheduleState const&, 
tvm::runtime::String const&, tvm::GlobalVar const&)
   at 
/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:46
 2: tvm::tir::StmtVisitor::VisitStmt_(tvm::tir::BlockNode const*)
   at /home/dev/tvm_upstream/src/tir/ir/stmt_functor.cc:142
 1: VisitStmt_
   at 
/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:29
 0: VisitStmt_
   at 
/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc:32
 File 
"/home/dev/tvm_upstream/src/tir/schedule/primitive/get_block_loop.cc", line 32
   
   

   
   ### Environment
   
   Reproducible on main (d4056ca79571d4265a12beeedd1b1565953df936)
   
   ### Steps to reproduce
   
   ```python
   import tvm
   
   from tvm.script import ir as I
   from tvm.script import tir as T
   
   @I.ir_module
   class Module:
       @T.prim_func
       def main():
           # with T.block("root"):
           A_sum = T.alloc_buffer((1,), "float32")
           A = T.alloc_buffer((1, 16), "float32")
           for nn, ff in T.grid(1, 16):
               with T.block("A"):
                   v_nn, v_ff = T.axis.remap("SR", [nn, ff])
                   T.reads(A[v_nn, v_ff])
                   T.writes(A_sum[v_nn])
                   with T.init():
                       A_sum[v_nn] = T.float32(0)
                   A_sum[v_nn] = A_sum[v_nn] + A[v_nn, v_ff]
   
   sch = tvm.tir.Schedule(Module)
   
   a = sch.get_block("A")
   
   loop_n, loop_f = sch.get_loops(a)
   sch.blockize(loop_f)
   
   print(sch.mod) # <-- A_init exists
   
   a_init = sch.get_block("A_init") # <-- fails with InternalError: Check failed: (it != self_->stmt2ref.end()) is false
   
   ``` 
   
   ### Triage
   
   * tir:schedule
   



Re: [PR] [RFC] Add NNEF frontend [tvm-rfcs]

2024-04-16 Thread via GitHub


gyenesvi commented on PR #108:
URL: https://github.com/apache/tvm-rfcs/pull/108#issuecomment-2058579469

   Hi,
   
   > as a community we recently moves towards the relax IR for latest genAI 
workloads
   
   Thanks for directing us towards Relax. I guess that means that new frontends 
should convert their representations into Relax IR instead of Relay? The 
documentation on tvm.apache.org refers to Relay, but not Relax. Is that 
documentation obsolete in this area? Is Relay going to be superseded by Relax?
   
   We only see frontend examples in tvm.relax that we can use as reference. Is 
there further documentation on tvm.relax?
   
   It is interesting to hear that there's more focus on dynamic graphs / shape 
inference, as one of the key goals of the next version of NNEF, under 
development, is support for dynamic graphs and shape inference.
   
   > it is unclear how much adoption NNEF have as of now versus ONNX and other 
formats
   
   One of the goals of integration into compiler stacks like TVM would be 
exactly to drive more adoption, as adoption requires public tooling to be able 
to demonstrate the capabilities / usage of NNEF in end-to-end workflows. As the 
next version of NNEF will focus on dynamic graphs, custom operations and 
lowering to tensor IR level, TVM seems like a good option to demonstrate its 
potential in compilation based inference engines. But first we would like to 
start with integrating the currently publicly available version of NNEF.
   
   Also, TVM has backends to multiple Khronos formats, such as SPIR-V (Vulkan) 
and OpenCL, that is why TVM could provide us with an end-to-end workflow 
starting from a Khronos defined input format, and resulting in Khronos defined 
outputs. Furthermore, some Khronos members may be interested in implementing 
their own (proprietary) hardware backends to TVM, with which an NNEF frontend 
could also provide an end-to-end workflow.
   
   





Re: [PR] [TVMScript] Optionally use `ruff format` instead of `black` [tvm]

2024-04-16 Thread via GitHub


quic-sanirudh commented on code in PR #16876:
URL: https://github.com/apache/tvm/pull/16876#discussion_r1566946710


##
python/tvm/script/highlight.py:
##
@@ -92,7 +95,73 @@ def cprint(
 print(highlight(printable, Python3Lexer(), 
Terminal256Formatter(style=style)))
 
 
-def _format(code_str: str) -> str:
+@functools.lru_cache
+def _get_formatter(formatter: Optional[str] = None):
+def get_ruff_formatter():
+if shutil.which("ruff") is None:
+return None
+
+def formatter(code_str):
+proc = subprocess.Popen(
+["ruff", "format", "--stdin-filename=TVMScript"],
+stdin=subprocess.PIPE,
+stdout=subprocess.PIPE,
+encoding="utf-8",
+)
+stdout, _stderr = proc.communicate(code_str)
+return stdout
+
+return formatter
+
+def get_black_formatter():
+try:
+# pylint: disable=import-outside-toplevel
+import black
+except ImportError:
+return None
+
+def formatter(code_str):
+return black.format_str(code_str, mode=black.FileMode())
+
+return formatter
+
+def get_fallback_formatter():
+def formatter(code_str):
+with warnings.catch_warnings():
+warnings.simplefilter("once", UserWarning)
+ruff_install_cmd = sys.executable + " -m pip install ruff"

Review Comment:
   Should we restrict to a specific version of ruff similar to a specific black 
version we've specified below?






[PR] [CUBLAS][FP8] Support e4m3 gemm in cuBLAS BYOC [tvm]

2024-04-16 Thread via GitHub


ibsidorenko opened a new pull request, #16888:
URL: https://github.com/apache/tvm/pull/16888

   This is an attempt to bring [PR#63](https://github.com/octoml/tvm/pull/63) from 
`octoml/tvm` to `apache/tvm`.
   This commit enables FP8 e4m3 matmul offloading to cuBLAS.





[PR] [Contrib] Enable fp16 for thrust [tvm]

2024-04-16 Thread via GitHub


Hzfengsy opened a new pull request, #16887:
URL: https://github.com/apache/tvm/pull/16887

   Enable fp16 for thrust to support LLM cases
   
   cc @vinx13 

