This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git
from 0607484dbe [ARITH] Canonicalize mul-coefficient to rhs (#18031)
add aa312a5278 [BugFix][CUDA] Fix: Update settings for rerun on Increase FloatImm precision when printing 64 bit values in CUDA codegen (#18035)
add 2dce84f343 [Dtype] Low-precision Blackwell Datatype Support (#18027)
add 44eda85bf2 [Metal] Fix `GetFunction` of metal runtime (#18034)
No new revisions were added by this update.
Summary of changes:
CMakeLists.txt                                     |   1 +
ffi/cmake/Utils/Library.cmake                      |   2 +-
include/tvm/runtime/data_type.h                    | 128 ++++++++++++++++---
include/tvm/script/ir_builder/tir/ir.h             |  17 ++-
include/tvm/tir/op.h                               |   2 +-
python/tvm/runtime/ndarray.py                      |  70 +++++------
python/tvm/script/ir_builder/tir/ir.py             | 139 +++++++++++++++------
src/ir/expr.cc                                     |  75 +++++++++--
src/relax/op/tensor/qdq.cc                         |   7 +-
src/runtime/device_api.cc                          |   3 +-
src/runtime/metal/metal_module.mm                  |   2 +-
src/runtime/ndarray.cc                             |   4 +
src/script/ir_builder/tir/ir.cc                    |  34 ++++-
src/support/scalars.h                              |  30 ++++-
src/target/llvm/codegen_llvm.cc                    |   9 +-
src/target/source/codegen_cuda.cc                  | 118 +++++++++++++++--
src/target/source/codegen_cuda.h                   |   6 +-
src/target/source/literal/cuda_half_t.h            |  46 +++++++
src/tir/op/op.cc                                   |  41 ++++++
src/tir/transforms/dtype_conversion.cc             |   3 +-
src/tir/transforms/dtype_conversion.h              |  38 +++++-
src/tir/transforms/vectorize_loop.cc               |   4 +-
.../python/codegen/test_target_codegen_cuda_fp4.py | 130 +------------------
.../python/codegen/test_target_codegen_cuda_fp8.py | 133 +++++++++++---------
tests/python/ffi/test_dtype.py                     |  31 ++++-
.../test_datatype_nv_fp4.py}                       |  35 ++++--
tests/python/ir/test_datatype_nv_fp8.py            |  31 ++++-
tests/python/ir/test_dtype.py                      |   9 +-
.../python/tvmscript/test_tvmscript_printer_tir.py |  43 ++++---
29 files changed, 825 insertions(+), 366 deletions(-)
copy tests/python/{arith/test_arith_detect_cse.py => ir/test_datatype_nv_fp4.py} (55%)