(tvm) branch nightly updated (460f6f1d3e -> de91c5ca94)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 460f6f1d3e [QoL][Relax] Infer StructInfo for relax::Tuple on construction (#16860)
     add d030ce27a1 [TVMScript] Optionally use `ruff format` instead of `black` (#16876)
     add 857fe614ab [Target] Don't register AArch64 target tags without LLVM compiler support (#16897)
     add b3ffd97569 [BYOC] Add layout check and update shape check for cublas FP8 BYOC (#16895)
     add da56c89f32 [Dlight] Enhance vectorization for gpu matmul (#16894)
     add de91c5ca94 [Bugfix] rocm shared memory issue on MI250 (#16901)

No new revisions were added by this update.

Summary of changes:
 cmake/modules/LLVM.cmake                         |  1 +
 cmake/utils/FindLLVM.cmake                       | 18 +
 python/tvm/dlight/gpu/gemv.py                    |  5 +-
 python/tvm/dlight/gpu/matmul.py                  |  7 +-
 python/tvm/relax/backend/contrib/cublas.py       | 28 ++-
 python/tvm/script/highlight.py                   | 95 +++-
 src/target/parsers/aprofile.cc                   |  7 +-
 src/target/tag.cc                                |  6 +-
 tests/python/dlight/test_gpu_matmul.py           | 81 ++--
 tests/python/dlight/test_gpu_matmul_tensorize.py | 18 ++---
 tests/python/relax/test_codegen_cublas.py        | 20 +++-
 11 files changed, 199 insertions(+), 87 deletions(-)
[PR] [Bugfix] CudaDeviceAPI::GetAttr may check kExist when GPUs absent [tvm]
Lunderberg opened a new pull request, #16903:
URL: https://github.com/apache/tvm/pull/16903

This commit resolves a bug introduced in https://github.com/apache/tvm/pull/16377. If no CUDA-capable GPUs are present, the call to `cudaGetDeviceCount` returns an error, which the `CUDA_CALL` macro raises as an exception. However, checking the `kExist` flag is valid even when no GPUs are present. This commit removes the use of `CUDA_CALL` and instead returns `false` in that case.
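For reference, a minimal Python-level sketch of the behavior this restores (assuming `Device.exist` routes through `GetAttr(kExist)`, as the runtime does today):

```python
import tvm

# On a host with no CUDA-capable GPU, cudaGetDeviceCount reports an error;
# with this fix the kExist query returns False instead of raising.
dev = tvm.cuda(0)
print(dev.exist)  # False on a CUDA-less machine, True otherwise
```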
(tvm) branch main updated (da56c89f32 -> de91c5ca94)
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from da56c89f32 [Dlight] Enhance vectorization for gpu matmul (#16894)
     add de91c5ca94 [Bugfix] rocm shared memory issue on MI250 (#16901)

No new revisions were added by this update.

Summary of changes:
 python/tvm/dlight/gpu/gemv.py | 5 +-
 1 file changed, 4 insertions(+), 1 deletion(-)
Re: [PR] [Bugfix] rocm shared memory issue on MI250 [tvm]
tqchen merged PR #16901:
URL: https://github.com/apache/tvm/pull/16901
(tvm) branch main updated (b3ffd97569 -> da56c89f32)
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from b3ffd97569 [BYOC] Add layout check and update shape check for cublas FP8 BYOC (#16895)
     add da56c89f32 [Dlight] Enhance vectorization for gpu matmul (#16894)

No new revisions were added by this update.

Summary of changes:
 python/tvm/dlight/gpu/matmul.py                  |  7 +-
 tests/python/dlight/test_gpu_matmul.py           | 81 ++--
 tests/python/dlight/test_gpu_matmul_tensorize.py | 18 +++---
 3 files changed, 54 insertions(+), 52 deletions(-)
Re: [PR] [Dlight] Enhance vectorization for gpu matmul [tvm]
tqchen merged PR #16894:
URL: https://github.com/apache/tvm/pull/16894
[PR] [Relax] Allow specifying entry_funcs for BYOC [tvm]
vinx13 opened a new pull request, #16902:
URL: https://github.com/apache/tvm/pull/16902

cc @tqchen
Re: [PR] [BYOC] Add layout check and update shape check for cublas FP8 BYOC [tvm]
tqchen merged PR #16895:
URL: https://github.com/apache/tvm/pull/16895
(tvm) branch main updated (857fe614ab -> b3ffd97569)
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from 857fe614ab [Target] Don't register AArch64 target tags without LLVM compiler support (#16897)
     add b3ffd97569 [BYOC] Add layout check and update shape check for cublas FP8 BYOC (#16895)

No new revisions were added by this update.

Summary of changes:
 python/tvm/relax/backend/contrib/cublas.py | 28 ++-
 tests/python/relax/test_codegen_cublas.py  | 20 +++-
 2 files changed, 36 insertions(+), 12 deletions(-)
Re: [PR] [Target] Don't register AArch64 target tags without LLVM compiler support [tvm]
tqchen merged PR #16897:
URL: https://github.com/apache/tvm/pull/16897
(tvm) branch main updated (d030ce27a1 -> 857fe614ab)
This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from d030ce27a1 [TVMScript] Optionally use `ruff format` instead of `black` (#16876)
     add 857fe614ab [Target] Don't register AArch64 target tags without LLVM compiler support (#16897)

No new revisions were added by this update.

Summary of changes:
 cmake/modules/LLVM.cmake       |  1 +
 cmake/utils/FindLLVM.cmake     | 18 ++
 src/target/parsers/aprofile.cc |  7 ---
 src/target/tag.cc              |  6 +-
 4 files changed, 28 insertions(+), 4 deletions(-)
[PR] [CMAKE] Misc improvement of Util [tvm]
tqchen opened a new pull request, #16900:
URL: https://github.com/apache/tvm/pull/16900

This PR updates the utils so `tvm_option` can take in a list argument. It also introduces a flag for MSCCLPP.
Re: [PR] [BYOC] Add layout check and update shape check for cublas FP8 BYOC [tvm]
vinx13 commented on code in PR #16895:
URL: https://github.com/apache/tvm/pull/16895#discussion_r1569209004

python/tvm/relax/backend/contrib/cublas.py:

```diff
@@ -68,11 +69,30 @@ def _check_matmul(context: PatternCheckContext) -> bool:
             # Rows number must be multiples of 4 for IGEMM
             return False
     elif lhs_dtype == "e4m3_float8" and rhs_dtype == "e4m3_float8":
-        # Matrix dimensions must be multiples of 16. This requirement is missing from the cuBLAS
-        # docs, but it was observed during testing.
-        if not isinstance(rhs_shape[-1], (tvm.tir.expr.IntImm, int)) or rhs_shape[-1] % 16 != 0:
+        matmul_rhs_var = matmul_call.args[1]
+        rhs_transposed = False
+        if matmul_rhs_var in context.matched_bindings:
+            matmul_rhs_call = context.matched_bindings[matmul_rhs_var]
+            assert (
+                isinstance(matmul_rhs_call, tvm.relax.Call)
+                and matmul_rhs_call.op.name == "relax.permute_dims"
+            )
```

Review Comment:
`if matmul_rhs_var in context.matched_bindings:` implies that rhs is transposed (it's the only pattern where rhs is another binding being matched), so I added an assertion here; it won't crash if we have a non-transposed rhs.
Re: [PR] [Target] Don't register AArch64 target tags without LLVM compiler support [tvm]
mvermeulen commented on PR #16897:
URL: https://github.com/apache/tvm/pull/16897#issuecomment-2061698191

I retested with the ROCm build and no longer see warning messages about a missing ARM configuration when running ROCm.
Re: [PR] Restore "pytest.mark.gpu" for RELAX tests [tvm]
Lunderberg commented on code in PR #16741:
URL: https://github.com/apache/tvm/pull/16741#discussion_r1569089772

tests/python/relax/test_codegen_cudnn.py:

```diff
@@ -36,12 +36,8 @@ def reset_seed():

 has_cudnn = tvm.get_global_func("relax.ext.cudnn", True)

-cudnn_enabled = pytest.mark.skipif(
-    not has_cudnn,
-    reason="cuDNN not enabled.",
-)
-pytestmark = [cudnn_enabled]
+pytestmark = [*tvm.testing.requires_cudnn.marks()]
```

Review Comment:
Good point. I thought I had made `Feature` a subclass of `Mark`, but I guess I never did. In that case, I like your simplification.
Re: [PR] [Dlight] Enhance vectorization for gpu matmul [tvm]
tqchen commented on PR #16894:
URL: https://github.com/apache/tvm/pull/16894#issuecomment-2061427828

@vinx13 please fix the testcase
Re: [PR] [relay][feature] save relay IR as onnx for visualization [tvm]
tqchen commented on PR #16847:
URL: https://github.com/apache/tvm/pull/16847#issuecomment-2061421527

It seems a better approach would be to write something like https://github.com/lutzroeder/netron/blob/main/source/caffe.js in netron: that file would parse the JSON exported by TVM Relax and construct the graph. Alternatively, we can add a visualization spec JSON for TVM and add netron support for it. This would remove the need for protobuf and allow a more direct export.
[PR] [SVE][TOPI] Add conv2d NHWC hybrid SVE schedule for `arm_cpu` [tvm]
Anndrey24 opened a new pull request, #16899:
URL: https://github.com/apache/tvm/pull/16899

This commit adds an `arm_cpu` conv2d NHWC schedule which generates SVE instructions by extending the hybrid GeMM approach implemented in #16106 to use scalable expressions as splitting factors. Various vscale-related fixes needed to implement the schedule are also included, such as:

- adding vscale bounds in the `ConstIntBoundAnalyzer` and `IntervalSetEvaluator`
- simplifying `MinNode` and `MaxNode` that have scalable expression operands in `RewriteSimplifier`, which would appear when defining the shape of a buffer padded to be a multiple of vscale and in its respective buffer access indices (e.g. `C_1 = T.Buffer((1024 * (T.vscale() * 16 + 256 - 16 % T.vscale() * 16),), data=C)` instead of `C_1 = T.Buffer((1024 * (T.max(255, T.vscale() * 16 + 255 - 16 % T.vscale() * 16) + 1),), data=C)`)

The correctness of the new schedule is checked using a TOPI test, while the presence of generated SVE instructions is verified by a codegen_aarch64 test. The new rewrite_simplify rules are also covered by additional test cases.

cc @ekalda @lhutton1 @Lunderberg
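For illustration, a rough sketch (hypothetical, assuming the new `RewriteSimplifier` rules and the vscale lower bound from `ConstIntBoundAnalyzer` are in place) of the kind of fold described above:

```python
import tvm
from tvm import tir

ana = tvm.arith.Analyzer()
x = tir.Var("x", "int32")
scalable = tir.vscale() * 16  # strictly positive once vscale >= 1 is known

# With the new bounds, max(x, x + vscale*16) can fold to x + vscale*16,
# which is how the padded-buffer shape above loses its T.max wrapper.
print(ana.simplify(tir.max(x, x + scalable)))
```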
[I] [Bug] Graph optimization model compilation error involving `Pad` operator [tvm]
shaoyuyoung opened a new issue, #16898:
URL: https://github.com/apache/tvm/issues/16898

I am trying to compile an ONNX model (graph below) using TVM.

![5618520ff3e8817d39d4422c547eaf9](https://github.com/apache/tvm/assets/100203773/0a00c2d4-70e1-4e66-b7ae-96737e30d5b3)

Of course, this is a complicated graph, but we can simplify it as below.

![image](https://github.com/apache/tvm/assets/100203773/10fd68b9-168f-4db7-b593-a88bd410e612)

These two graphs are equivalent, yet when I try to compile them with TVM, the original ONNX model fails while the simplified ONNX model passes. It is very strange! This seems to involve a shape-checking problem in the `Pad` operator. In theory, I think **TVM should have strong compatibility with the native ONNX model**. However, the reality is not satisfactory: it seems that only simplified, simple models are acceptable to TVM.

### Expected behavior

ONNX compilation passes.

### Actual behavior

```
onnx fail
Traceback (most recent call last):
  18: tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::transform::__mk_TVM9::{lambda(tvm::transform::Pass, tvm::IRModule)#1}, std::__cxx11::basic_string, std::allocator >)::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, std::__cxx11::basic_string, std::allocator >, tvm::runtime::TVMRetValue)
  17: tvm::transform::Pass::operator()(tvm::IRModule) const
  16: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  15: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  14: _ZN3tvm7runtime13PackedFun
  13: tvm::runtime::TypedPackedFunc::AssignTypedLambda(tvm::relay::transform::DynamicToStatic()::{lambda(tvm::relay::Function, tvm::IRModule, tvm::transform::PassContext)#1})::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}::operator()(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*) const
  12: tvm::relay::DynamicToStatic(tvm::relay::Function, tvm::IRModule)
  11: tvm::relay::DynamicToStaticMutator::PrepareInput(tvm::RelayExpr const&)
  10: tvm::transform::Pass::operator()(tvm::IRModule) const
  9: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  8: tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  7: tvm::transform::Pass::operator()(tvm::IRModule) const
  6: tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  5: tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const
  4: tvm::runtime::PackedFuncObj::Extractor::AssignTypedLambda(tvm::relay::transform::InferType()::{lambda(tvm::IRModule, tvm::transform::PassContext const&)#1})::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  3: tvm::relay::TypeInferencer::Infer(tvm::GlobalVar, tvm::relay::Function)
  2: tvm::relay::TypeSolver::Solve()
  1: tvm::runtime::PackedFuncObj::Extractor const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::AssignTypedLambda const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>(bool (*)(tvm::runtime::Array const&, int, tvm::Attrs const&, tvm::TypeReporter const&))::{lambda(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)#1}> >::Call(tvm::runtime::PackedFuncObj const*, tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)
  0: tvm::relay::PadRel(tvm::runtime::Array const&, int, tvm::Attrs const&, tvm::TypeReporter const&)
  File "/root/anaconda3/conda-bld/tvm-package_1701590675822/work/src/relay/op/nn/pad.cc", line 131
InternalError: Check failed: (data->shape.size() == param->pad_width.size()) is false: There should be as many pad width pairs as shape dimensions but the shape has 5 dimensions and there are 4 pad width pairs.
```

### Environment

- Operating System: Ubuntu 18
- TVM: 0.15
- Torch: 2.1.1
- ONNX: 1.15.0

### Steps to reproduce

ONNX file is here: [onnx.zip](https://github.com/apache/tvm/files/15011753/onnx.zip)

Here is the script:

```python
from onnxsim import simplify
import tvm
from tvm import relay
import onnx
import traceback


def compile_onnx(onnx_model, shape):
    mod_from_onnx, params_onnx = relay.frontend.from_onnx(onnx_model, shape=shape)
    with tvm.transform.PassContext(opt_level=4):
        executor = relay.build_module.create_executor(
            'graph', mod_from_onnx, tvm.cpu(), 'llvm', params_onnx
        ).evaluate()


model = onnx.load('./model.onnx')
try:
    compile_onnx(model, {'v0_0': [], 'v6_0': [5, 5, 4, 2, 1]})
except Exception:
    # the tail of the script was cut off in the report; a minimal handler
    # matching the "onnx fail" output above is assumed here
    print('onnx fail')
    traceback.print_exc()
```
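The failing check is the usual pad invariant: one `(before, after)` pair per input dimension. A NumPy analogue (illustrative only, not TVM code) trips over the same mismatch:

```python
import numpy as np

x = np.zeros((5, 5, 4, 2, 1))   # 5-D input, like the shape in the repro
pad_width = [(0, 0)] * 4        # only 4 pad pairs: one per dim is missing
try:
    np.pad(x, pad_width)
except ValueError as e:
    print(e)  # NumPy rejects the rank mismatch, just as TVM's PadRel does
```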
Re: [PR] Restore "pytest.mark.gpu" for RELAX tests [tvm]
apeskov commented on code in PR #16741:
URL: https://github.com/apache/tvm/pull/16741#discussion_r1568848341

tests/python/relax/test_codegen_cudnn.py:

```diff
@@ -36,12 +36,8 @@ def reset_seed():

 has_cudnn = tvm.get_global_func("relax.ext.cudnn", True)
```

Review Comment:
You're absolutely right. Will remove it.
[PR] [CUBLAS] Enable offloading of R.matmul + R.dequantize [tvm]
ibsidorenko opened a new pull request, #16896:
URL: https://github.com/apache/tvm/pull/16896

This commit enables offloading of `R.matmul` + `R.dequantize` to the cuBLAS codegen. The dequantization scale is passed to the runtime function and set as the alpha parameter. If there is no dequantization, then alpha == 1.0.
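A quick numeric check of the identity the PR relies on (a NumPy stand-in with made-up shapes; the real kernel is cuBLAS):

```python
import numpy as np

A = np.random.randn(4, 8).astype("float32")
B = np.random.randn(8, 16).astype("float32")
scale = np.float32(0.05)  # dequantization scale

# dequantize(matmul(A, B), scale) == scale * (A @ B), which is exactly
# the GEMM form D = alpha * (A @ B) with alpha = scale
# (and alpha = 1.0 when no dequantize is fused).
fused = scale * (A @ B)
unfused = (A @ B) * scale
assert np.allclose(fused, unfused)
```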
Re: [PR] Restore "pytest.mark.gpu" for RELAX tests [tvm]
apeskov commented on code in PR #16741:
URL: https://github.com/apache/tvm/pull/16741#discussion_r1568846620

tests/python/relax/test_codegen_cudnn.py:

```diff
@@ -36,12 +36,8 @@ def reset_seed():

 has_cudnn = tvm.get_global_func("relax.ext.cudnn", True)

-cudnn_enabled = pytest.mark.skipif(
-    not has_cudnn,
-    reason="cuDNN not enabled.",
-)
-pytestmark = [cudnn_enabled]
+pytestmark = [*tvm.testing.requires_cudnn.marks()]
```

Review Comment:
I'm not sure that it's possible. The global object `pytestmark` should have type `Mark` or `List[Mark]`, but our TVM markers have type `tvm.testing.utils.Feature`. So a direct assignment, like you suggest above, will lead to a type-check error:

> TypeError: got <...> instead of Mark

Moreover, the idea of this change was to utilise the hierarchical structure of TVM testing features. The `pytest.mark.gpu` mark is the root of this hierarchy, so I have to extract all markers from the current feature and all its parents. That is exactly what the `marks()` method does.

But you are right that this line can be slightly simplified by removing the explicit list expansion:

> pytestmark = tvm.testing.requires_cudnn.marks()
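For context, a short sketch of how a test module consumes this (a hypothetical test, assuming `marks()` returns a plain list of pytest marks including the inherited `pytest.mark.gpu`):

```python
import tvm.testing

# Module-level marks: the cuDNN skipif guards plus the inherited gpu mark.
pytestmark = tvm.testing.requires_cudnn.marks()


def test_conv2d_offload():
    ...  # collected, but skipped unless a GPU with cuDNN is available
```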
Re: [PR] [BYOC] Add layout check and update shape check for cublas FP8 BYOC [tvm]
ibsidorenko commented on code in PR #16895:
URL: https://github.com/apache/tvm/pull/16895#discussion_r1568429222

python/tvm/relax/backend/contrib/cublas.py:

```diff
@@ -68,11 +69,30 @@ def _check_matmul(context: PatternCheckContext) -> bool:
             # Rows number must be multiples of 4 for IGEMM
             return False
     elif lhs_dtype == "e4m3_float8" and rhs_dtype == "e4m3_float8":
-        # Matrix dimensions must be multiples of 16. This requirement is missing from the cuBLAS
-        # docs, but it was observed during testing.
-        if not isinstance(rhs_shape[-1], (tvm.tir.expr.IntImm, int)) or rhs_shape[-1] % 16 != 0:
+        matmul_rhs_var = matmul_call.args[1]
+        rhs_transposed = False
+        if matmul_rhs_var in context.matched_bindings:
+            matmul_rhs_call = context.matched_bindings[matmul_rhs_var]
+            assert (
+                isinstance(matmul_rhs_call, tvm.relax.Call)
+                and matmul_rhs_call.op.name == "relax.permute_dims"
+            )
```

Review Comment:
I am OK with it, thank you! Just a nit question: do we need the assert here for the case when rhs_call is something other than `permute_dims`? We could just leave `rhs_transposed == False` and return `False` in the next `if` (without a crash):

```python
if not rhs_transposed:
    # cuBLAS FP8 operations require rhs being transposed
    return False
```
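For what it's worth, a sketch of the non-asserting variant the question describes, as a drop-in fragment for `_check_matmul` above (same names as the diff; untested):

```python
# Only mark rhs as transposed when the matched binding really is a
# relax.permute_dims call; anything else simply falls through to the
# rejection below instead of tripping an assert.
if matmul_rhs_var in context.matched_bindings:
    matmul_rhs_call = context.matched_bindings[matmul_rhs_var]
    if (
        isinstance(matmul_rhs_call, tvm.relax.Call)
        and matmul_rhs_call.op.name == "relax.permute_dims"
    ):
        rhs_transposed = True

if not rhs_transposed:
    # cuBLAS FP8 operations require rhs being transposed
    return False
```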