This is an automated email from the ASF dual-hosted git repository.

MasterJH5574 pushed a change to branch v0.25.0
in repository https://gitbox.apache.org/repos/asf/tvm.git

from daf90cbd79 [CI] Merge PR against its target branch instead of main
(#19712) (#19775)
add 4ff7d7c53a [Relax][PyTorch] Cast non-bool inputs to bool in
logical_and converter (#19679)
add 20257b39b4 [CI] Remove PyPI-only tag ref guard from wheel publishing
(#19685)
add 97955aab1d [Web] Bump tvmjs version to 0.25.0-dev1 (#19687)
add 977c1972ba [Fix] CommReduce could handle 0-dim data (#19683)
add f57af59d79 [CI] Pin actions by version tag, trim wheel perms (#19703)
add 923d7356ba [Tests] Fix s_tir tests using removed T.block API in TIRx
script (#19706)
add 8b33fa2cd1 [CI] Fix release verification script (#19700)
add fa921bf27c [Refactor][Meta-schedule] Remove meta-schedule as_string
mechanism in favor of default representation (#19709)
add b21c57fe69 [Relax][CoreML] Fix CoreML partition pass (#19711)
add 5fe0afd617 [Tests] Skip test modules cleanly when optional deps are
missing (#19704)
add df7f6d92fe [CI] Fix CI script test subprocess environment (#19713)
add 74e11b1741 [Codegen][LLVM] Accept splat form in VLA broadcast test
(#19716)
add 1beb148c8f [DOCS] Clarify loading serialized artifacts requires a
trusted source (#19720)
add 1d6fdd874c [REFACTOR][PYTHON] Slim tvm.libinfo to info-only helpers
(#19719)
add 2cb23f2ad0 [Codegen][NVPTX] Skip runtime execution in Vulkan codegen
tests (#19717)
add ae6c8ca5fb [REFACTOR][PYTHON] Remove tvm.ffi shim; import tvm_ffi
directly (#19721)
add e19276b0f9 [Runtime][Tests] Fix contrib wheel tests (#19714)
add ed9a4fd304 [Tests][Disco] Skip CCL tests when runtime support is
absent (#19724)
add b843c2bcf6 [Tests][Relax] Gate multi-GPU VM test on three devices
(#19725)
add d3558aaa2d [REFACTOR][IR] Phase out diagnostic.h for
visit-context-aware pass errors (#19722)
add 339af97275 [Tests][Hexagon] Lazily import pytest plugin dependencies
(#19726)
add 3a12fa5bb8 [Python] Refactor pyproject.toml dependencies (#19723)
add 2f8a918a96 [Tests][NNAPI] Skip tests cleanly when remote environment
is unavailable (#19730)
add c13b26bbff [Tests][S-TIR] Fix stale MetaSchedule sketch expectations
and migrate let binds to T.let (#19729)
add 3522d7116e [Tests] Remove test_runtime_ndarray (covered by tvm-ffi)
(#19715)
add 19f383d563 [TIRx] Preserve Triton call_kernel compile options (#19728)
add d2d9e3d6bf [Relax][PyTorch][DLight] Fix exported-program CUDA test
failures (#19732)
add c3398c291f [PYTHON] Autoload backends; simplify library loading;
remove TVMError for native errors (#19727)
add 3eb663eb82 [Script][Tests] Fix dialect redirect module re-execution
and stray category-less tirx.intrin_test op (#19731)
add 3c4fae0a2c [S-TIR][Tests] Fix transform test failures after TIRx
bringup (#19735)
add b1f5b55a11 [TIRx] Use canonical PTX async script API in s_tir test
(#19739)
add 848c32c753 [Tests] Check WebGPU volatile allreduce annotation
structurally (#19740)
add 46cad583b6 [S-TIR] Fix software pipeline offsets for legacy MMA
intrinsics (#19742)
add c3c78308c4 [Tests] Fix flaky popen pool executor test (#19746)
add aaa33de4f3 [Hexagon][Tests] Clean up stale hexagon tests (#19747)
add e769db6d63 [Runtime][Disco] Fix session attribute storage, NVSHMEM
build, and test gating (#19736)
add fca2149e68 [CI] Align cuda-python with PyTorch cuda-bindings (#19738)
add cc9cb5b511 [Codegen][LLVM][Tests] Gate +v9a vscale_range expectation
on LLVM version (#19744)
add 7754f239ea [Runtime][Tests] Drop int4 from random_fill test, fix dtype
error message (#19748)
add e14c66a298 [Tests][LLVM] Gate stepvector intrinsic rename on LLVM 20
(#19745)
add 1cda498368 [S-TIR][Tests] Mark test_cp_async_in_if_then_else as xfail
(#19751)
add 388ff5dd98 [CI] Run s_tir/transform tests in the python-unittest stage
(#19737)
add 6de8f4445d [CI] Updated cibw to 4.1.0 (#19754)
add 56ee5eaeb6 [TIRX][Tests] Fix LLVM version gate for vectorized lround
(#19753)
add 831f6a1621 [S-TIR][CUDA] Fix legacy predicated cp.async zero fill
(#19741)
add f5f2a0fb03 [Tests][AArch64] Make SVE codegen assertions robust across
LLVM versions (#19752)
add 27579ccc7b [Web] Restore tvmjs version to 0.25.0 for the v0.25.0
release
add 90d1ec472e [Relax][PyTorch] Add logical_or and logical_xor converters
(#19756)
add 1ee6c2cd21 [TIRx] Post-bringup follow-ups: op-dispatch, namespaces,
launch bounds, gemm-async, backend reorg (#19757)
add 4420027947 [REFACTOR][VM] Move CUDA graph VM builtin back under VM
runtime (#19758)
add fe5997ac84 [Runtime][CoreML] Fix FFI casts in CoreML runtime (#19762)
add 8cee7d218d [CI] Drop redundant cmake/ninja install from the Linux
wheel CUDA sidecar (#19761)
add c42bf81cae [REFACTOR][DataType] Phase out target custom datatype
support (#19760)
add 8ab09285d3 [REFACTOR][TARGET] Cleanup backend target registration
(#19759)
add ad5cb40b3c [MetaScheduler] Improve print info about builder/runner
state (#19767)
add 9f4572a66b [REFACTOR][CUDA] Phase out l2 cache flush preproc test
(#19768)
add 4ee9e0495b [Relax][ONNX] Fix LayerNormalization no-bias zero tensor
shape and dtype (#19772)
add cad3a0f0f9 [Relax][ONNX] Support exclusive option in CumSum (#19773)
add 606f532b17 [CPP_RPC] Bugfix race conditions and enhance print infos
(#19778)
add d0c04050d4 [CMAKE] Upgrade TVM build baseline to C++20 (#19734)
add 7ae1ae74e2 [REFACTOR][CUDA] Phase out cuda_common.h (#19770)
add a9a9fc57ba [REFACTOR][PYTHON] Consolidate backend autoload infra
(#19769)

No new revisions were added by this update.

Summary of changes:
.github/actions/build-wheel-for-publish/action.yml | 8 +-
.github/actions/setup/action.yml | 6 +-
.github/workflows/cc_bot.yml | 2 +-
.github/workflows/lint.yml | 4 +-
.github/workflows/main.yml | 4 +-
.github/workflows/nightly_docker_update.yml | 2 +-
.github/workflows/ping_reviewers.yml | 2 +-
.github/workflows/publish_wheel.yml | 43 +-
.github/workflows/tag_teams.yml | 2 +-
.github/workflows/tvmbot.yml | 2 +-
.../workflows/update_last_successful_branch.yml | 2 +-
.github/workflows/update_nightly_branch.yml | 2 +-
3rdparty/nvbench/l2_cache_flush.h | 74 -
CMakeLists.txt | 32 +-
LICENSE | 1 -
apps/cpp_rpc/rpc_env.cc | 53 +-
apps/cpp_rpc/rpc_env.h | 33 +
apps/cpp_rpc/rpc_server.cc | 24 +-
apps/cpp_rpc/rpc_tracker_client.h | 2 +-
ci/scripts/package/README.md | 6 +-
.../package/manylinux_build_libtvm_runtime_cuda.sh | 28 +-
.../package/windows_build_libtvm_runtime_cuda.bat | 9 +-
cmake/modules/CUDA.cmake | 2 +-
cmake/modules/Hexagon.cmake | 63 +-
cmake/modules/LLVM.cmake | 6 +-
cmake/modules/Metal.cmake | 4 +-
cmake/modules/OpenCL.cmake | 4 +-
cmake/modules/ROCM.cmake | 2 +-
cmake/modules/Vulkan.cmake | 14 +-
cmake/modules/contrib/Posit.cmake | 26 -
cmake/utils/FindLLVM.cmake | 4 +-
docker/Dockerfile.ci_cpu | 4 -
docker/Dockerfile.ci_gpu | 4 -
docker/install/ubuntu_install_cuda_python.sh | 2 +-
docs/contribute/error_handling.rst | 2 +-
docs/install/from_source.rst | 18 +-
docs/reference/api/python/ir.rst | 6 -
docs/reference/api/python/tirx/backend.rst | 6 -
docs/reference/security.rst | 3 +
include/tvm/arith/iter_affine_map.h | 1 -
include/tvm/ir/diagnostic.h | 268 -
include/tvm/ir/transform.h | 31 +-
include/tvm/relax/analysis.h | 25 +-
include/tvm/relax/block_builder.h | 7 -
include/tvm/s_tir/meta_schedule/cost_model.h | 22 +-
.../tvm/s_tir/meta_schedule/feature_extractor.h | 13 +-
include/tvm/s_tir/meta_schedule/measure_callback.h | 13 +-
include/tvm/s_tir/meta_schedule/mutator.h | 12 +-
include/tvm/s_tir/meta_schedule/postproc.h | 13 +-
include/tvm/s_tir/meta_schedule/schedule_rule.h | 13 +-
include/tvm/tirx/builtin.h | 270 -
include/tvm/tirx/op.h | 12 +-
include/tvm/tirx/target_builtin/cuda.h | 745 ---
include/tvm/tirx/target_builtin/trn.h | 156 -
include/tvm/tirx/transform.h | 9 -
include/tvm/topi/reduction.h | 5 +-
licenses/LICENSE.l2_cache_flush.txt | 218 -
pyproject.toml | 15 +-
python/tvm/__init__.py | 21 +-
.../tvm/{topi/cpp/vision => backend}/__init__.py | 13 +-
python/tvm/backend/_autoload_backends.py | 88 +
.../default.py => backend/adreno/__init__.py} | 30 +-
.../adreno.py => backend/adreno/target_tags.py} | 2 +-
python/tvm/backend/cuda/__init__.py | 103 +
python/tvm/backend/cuda/lang/__init__.py | 70 +
.../tvm/{tirx => backend/cuda}/lang/alloc_pool.py | 10 +-
python/tvm/{tirx => backend/cuda}/lang/pipeline.py | 0
.../tvm/{tirx => backend/cuda}/lang/smem_desc.py | 2 +-
.../{tirx => backend/cuda}/lang/tile_scheduler.py | 0
.../tvm/{tirx => backend/cuda}/lang/warp_role.py | 0
python/tvm/{tirx => backend/cuda}/op.py | 4071 +--------------
.../hexagon => backend/cuda/operator}/__init__.py | 4 +-
.../cuda/operator/intrinsics}/__init__.py | 0
.../cuda}/operator/intrinsics/_schema.py | 4 +-
.../cuda/operator/intrinsics}/cp_async.py | 21 +-
.../cuda/operator/intrinsics}/header.py | 0
.../cuda/operator/intrinsics}/math.py | 4 +-
.../cuda/operator/intrinsics}/memory.py | 4 +-
.../cuda/operator/intrinsics}/misc.py | 4 +-
.../cuda/operator/intrinsics}/mma.py | 2 +-
.../cuda/operator/intrinsics}/nvshmem.py | 2 +-
.../cuda/operator/intrinsics}/registry.py | 0
.../cuda/operator/intrinsics}/sync.py | 10 +-
.../cuda/operator/intrinsics}/tcgen05.py | 2 +-
.../cuda/operator/intrinsics}/types.py | 0
.../cuda/operator/intrinsics}/utils.py | 0
.../cuda/operator/intrinsics}/wgmma.py | 2 +-
.../cuda/operator/tile_primitive}/__init__.py | 4 +
.../cuda/operator/tile_primitive}/common.py | 0
.../cuda/operator/tile_primitive}/copy/__init__.py | 0
.../cuda/operator/tile_primitive}/copy/_common.py | 0
.../operator/tile_primitive}/copy/_swizzle_iter.py | 0
.../cuda/operator/tile_primitive}/copy/fallback.py | 0
.../operator/tile_primitive}/copy/gmem_smem.py | 2 +-
.../operator/tile_primitive}/copy/ld_stmatrix.py | 2 +-
.../cuda/operator/tile_primitive}/copy/reg.py | 2 +-
.../cuda/operator/tile_primitive}/copy/utils.py | 0
.../tile_primitive}/copy_async/__init__.py | 0
.../operator/tile_primitive}/copy_async/dsmem.py | 0
.../operator/tile_primitive}/copy_async/ldgsts.py | 2 +-
.../tile_primitive}/copy_async/tcgen05_cp.py | 2 +-
.../tile_primitive}/copy_async/tcgen05_ldst.py | 0
.../operator/tile_primitive}/copy_async/tma.py | 0
.../operator/tile_primitive}/copy_async/utils.py | 0
.../tile_primitive}/elementwise/__init__.py | 2 +-
.../tile_primitive}/elementwise/_common.py | 0
.../tile_primitive}/elementwise/ops/__init__.py | 0
.../tile_primitive}/elementwise/ops/binary.py | 0
.../tile_primitive}/elementwise/ops/cast.py | 0
.../tile_primitive}/elementwise/ops/fma.py | 0
.../tile_primitive}/elementwise/ops/unary.py | 0
.../operator/tile_primitive}/elementwise/reg.py | 2 +-
.../tile_primitive}/elementwise/register.py | 0
.../operator/tile_primitive}/elementwise/smem.py | 2 +-
.../elementwise/vec_emit/__init__.py | 0
.../elementwise/vec_emit/binary_f32x2.py | 0
.../elementwise/vec_emit/cast_vec2.py | 0
.../elementwise/vec_emit/fma_f32x2.py | 0
.../operator/tile_primitive}/exec_scope_utils.py | 0
.../cuda/operator/tile_primitive}/gemm/__init__.py | 0
.../operator/tile_primitive}/gemm/mma_m16n8k_.py | 0
.../tile_primitive}/gemm_async/__init__.py | 0
.../operator/tile_primitive}/gemm_async/tcgen05.py | 211 +-
.../cuda/operator/tile_primitive}/gemm_utils.py | 0
.../cuda/operator/tile_primitive}/layout_utils.py | 0
.../tile_primitive}/permute_layout/__init__.py | 0
.../permute_layout/warp_xor_swizzle.py | 0
.../operator/tile_primitive}/reduction/__init__.py | 0
.../operator/tile_primitive}/reduction/local.py | 4 +-
.../operator/tile_primitive}/reduction/shared.py | 2 +-
.../tile_primitive}/reduction/sm100_packed.py | 2 +-
.../operator/tile_primitive}/reduction/utils.py | 9 +-
.../cuda/operator/tile_primitive}/tma_utils.py | 0
python/tvm/backend/cuda/script.py | 571 +++
.../cuda.py => backend/cuda/target_tags.py} | 2 +-
python/tvm/backend/hexagon/__init__.py | 51 +
.../hexagon.py => backend/hexagon/target_tags.py} | 2 +-
python/tvm/backend/loader.py | 187 +
python/tvm/backend/metal/__init__.py | 89 +
python/tvm/backend/metal/op.py | 84 +
python/tvm/backend/metal/script.py | 55 +
.../metal.py => backend/metal/target_tags.py} | 4 +-
python/tvm/backend/opencl/__init__.py | 58 +
python/tvm/backend/rocm/__init__.py | 59 +
python/tvm/backend/trn/__init__.py | 68 +
python/tvm/backend/trn/layout.py | 123 +
python/tvm/backend/trn/op.py | 153 +
.../onnx => backend/trn/operator}/__init__.py | 8 +-
.../trn/operator/tile_primitive}/__init__.py | 0
.../operator/tile_primitive}/binary/__init__.py | 0
.../trn/operator/tile_primitive}/binary/default.py | 4 +-
.../trn/operator/tile_primitive}/binary/utils.py | 7 +-
.../trn/operator/tile_primitive}/common.py | 0
.../tile_primitive}/compose_op/__init__.py | 0
.../tile_primitive}/compose_op/binary_chain.py | 0
.../tile_primitive}/compose_op/binary_reduce.py | 0
.../tile_primitive}/compose_op/compose_op.py | 0
.../tile_primitive}/compose_op/reduce_negate.py | 0
.../tile_primitive}/compose_op/unary_reduce.py | 0
.../operator/tile_primitive}/compose_op/utils.py | 3 +-
.../trn/operator/tile_primitive}/copy/__init__.py | 0
.../trn/operator/tile_primitive}/copy/default.py | 11 +-
.../trn/operator/tile_primitive}/dim_utils.py | 0
.../trn/operator/tile_primitive}/gemm/__init__.py | 0
.../trn/operator/tile_primitive}/gemm/default.py | 11 +-
.../tile_primitive}/instruction_generator.py | 11 +-
.../trn/operator/tile_primitive}/private_alloc.py | 6 +-
.../operator/tile_primitive}/reduction/__init__.py | 0
.../operator/tile_primitive}/reduction/default.py | 2 +-
.../operator/tile_primitive}/reduction/utils.py | 11 +-
.../operator/tile_primitive}/select/__init__.py | 0
.../trn/operator/tile_primitive}/select/default.py | 5 +-
.../trn/operator/tile_primitive}/unary/__init__.py | 0
.../trn/operator/tile_primitive}/unary/default.py | 4 +-
.../trn/operator/tile_primitive}/unary/utils.py | 9 +-
.../tile_primitive}/unary/with_bias_scale.py | 4 +-
.../operator/tile_primitive}/workspace_utils.py | 0
python/tvm/backend/trn/pipeline.py | 58 +
python/tvm/backend/trn/script.py | 58 +
.../trn/target_tags.py} | 31 +-
.../trn => backend/trn/transform}/__init__.py | 27 +-
.../trn/transform}/naive_allocator.py | 3 +-
.../trn/transform}/private_buffer_alloc.py | 0
python/tvm/backend/vulkan/__init__.py | 69 +
.../_ffi_api.py => backend/webgpu/__init__.py} | 10 +-
python/tvm/base.py | 85 +-
python/tvm/contrib/hexagon/build.py | 11 +-
python/tvm/contrib/hexagon/pytest_plugin.py | 44 +-
python/tvm/contrib/hexagon/session.py | 2 +-
python/tvm/contrib/tvmjs.py | 7 +-
python/tvm/error.py | 12 +-
python/tvm/ffi.py | 21 -
python/tvm/ir/__init__.py | 2 +-
python/tvm/ir/base.py | 2 +-
python/tvm/ir/diagnostics/__init__.py | 121 -
python/tvm/ir/diagnostics/_ffi_api.py | 21 -
python/tvm/ir/module.py | 2 +-
python/tvm/ir/utils.py | 6 +-
python/tvm/libinfo.py | 311 +-
python/tvm/relax/analysis/__init__.py | 1 +
python/tvm/relax/analysis/analysis.py | 37 +-
.../tvm/relax/backend/adreno/transform/_ffi_api.py | 4 +-
.../relax/backend/contrib/example_npu/README.md | 2 +-
.../relax/backend/contrib/example_npu/patterns.py | 11 +-
python/tvm/relax/backend/metal/coreml.py | 12 +-
python/tvm/relax/binding_rewrite.py | 2 +-
python/tvm/relax/distributed/struct_info.py | 3 +-
python/tvm/relax/expr.py | 4 +-
python/tvm/relax/frontend/nn/exporter.py | 2 +-
python/tvm/relax/frontend/onnx/onnx_frontend.py | 24 +-
.../tvm/relax/frontend/tflite/tflite_frontend.py | 4 +
.../frontend/torch/base_fx_graph_translator.py | 58 +
.../frontend/torch/exported_program_translator.py | 28 +-
python/tvm/relax/frontend/torch/fx_translator.py | 3 +
python/tvm/relax/ir/instrument.py | 2 +-
python/tvm/relax/op/_op_gradient.py | 5 +-
python/tvm/relax/training/setup_trainer.py | 7 +-
python/tvm/relax/training/trainer.py | 6 +-
python/tvm/rpc/base.py | 4 +-
python/tvm/rpc/client.py | 5 +-
python/tvm/rpc/minrpc.py | 14 +-
python/tvm/rpc/proxy.py | 3 +-
python/tvm/rpc/server.py | 16 +-
python/tvm/rpc/tracker.py | 3 +-
python/tvm/runtime/disco/session.py | 15 +-
python/tvm/runtime/module.py | 10 +-
python/tvm/s_tir/dlight/gpu/fallback.py | 33 +-
.../s_tir/meta_schedule/cost_model/cost_model.py | 15 +-
.../s_tir/meta_schedule/cost_model/mlp_model.py | 3 +-
.../feature_extractor/feature_extractor.py | 9 +-
.../measure_callback/measure_callback.py | 9 +-
python/tvm/s_tir/meta_schedule/mutator/mutator.py | 15 +-
.../tvm/s_tir/meta_schedule/postproc/postproc.py | 15 +-
.../meta_schedule/schedule_rule/schedule_rule.py | 15 +-
.../meta_schedule/space_generator/__init__.py | 2 -
python/tvm/s_tir/meta_schedule/utils.py | 34 +-
python/tvm/s_tir/pipeline.py | 2 -
python/tvm/s_tir/schedule/schedule.py | 4 +-
python/tvm/s_tir/tensor_intrin/cuda.py | 4 +-
python/tvm/s_tir/tensor_intrin/metal.py | 8 +-
python/tvm/s_tir/transform/transform.py | 3 +-
python/tvm/script/__init__.py | 33 +-
python/tvm/script/parser/core/diagnostics.py | 126 +-
python/tvm/script/parser/core/entry.py | 6 +-
python/tvm/script/parser/core/parser.py | 3 +-
python/tvm/support/cc.py | 15 +-
python/tvm/support/clang.py | 3 +-
python/tvm/support/emcc.py | 49 +-
python/tvm/support/ndk.py | 7 +-
python/tvm/support/nvcc.py | 9 +-
python/tvm/support/rocm.py | 5 +-
python/tvm/support/tar.py | 5 +-
python/tvm/support/xcode.py | 5 +-
python/tvm/target/__init__.py | 1 -
python/tvm/target/datatype.py | 379 --
python/tvm/target/detect_target.py | 100 +-
python/tvm/target/tag_registry/__init__.py | 4 -
python/tvm/target/x86.py | 39 -
python/tvm/testing/utils.py | 18 +-
python/tvm/tirx/__init__.py | 17 +-
python/tvm/tirx/backend/__init__.py | 4 +-
python/tvm/tirx/backend/adreno/__init__.py | 17 -
python/tvm/tirx/bench.py | 8 +-
python/tvm/tirx/compilation_pipeline.py | 41 +-
python/tvm/tirx/lang/alloc_pool.py | 527 +-
python/tvm/tirx/lang/pipeline.py | 242 +-
python/tvm/tirx/lang/smem_desc.py | 53 +-
python/tvm/tirx/lang/tile_scheduler.py | 814 +--
python/tvm/tirx/lang/warp_role.py | 142 +-
python/tvm/tirx/op.py | 5285 +-------------------
python/tvm/tirx/operator/intrinsics/_common.py | 2 +-
.../tvm/tirx/operator/tile_primitive/__init__.py | 11 +-
.../operator/tile_primitive/dispatch_context.py | 4 +
python/tvm/tirx/script/builder/ir.py | 656 +--
python/tvm/tirx/script/builder/triton.py | 25 +-
python/tvm/tirx/transform/__init__.py | 1 -
python/tvm/tirx/transform/transform.py | 27 +-
src/arith/const_int_bound.cc | 7 +-
src/arith/ir_mutator_with_analyzer.cc | 9 +-
src/arith/ir_visitor_with_analyzer.cc | 5 +-
src/arith/rewrite_simplify.cc | 12 +-
.../cuda => backend/cuda/codegen}/codegen_cuda.cc | 74 +-
.../cuda => backend/cuda/codegen}/codegen_cuda.h | 2 +-
.../cuda/codegen}/cuda_fallback_module.cc | 2 +-
.../cuda/codegen}/cuda_fallback_module.h | 4 +-
.../cuda/codegen}/intrin_rule_cuda.cc | 49 +-
.../cuda/codegen}/literal/cuda_half_t.h | 0
.../cuda/codegen}/literal/cuda_int8_t.h | 0
.../cuda/codegen}/llvm/codegen_nvptx.cc | 14 +-
.../cuda/codegen}/llvm/intrin_rule_nvptx.cc | 3 +-
src/{target/cuda => backend/cuda/codegen}/ptx.cc | 28 +-
src/{target/cuda => backend/cuda/codegen}/ptx.h | 0
src/backend/cuda/codegen/target_kind.cc | 143 +
.../cuda.cc => backend/cuda/op/target_builtin.cc} | 73 +-
.../cuda/runtime}/cuda_device_api.cc | 140 +-
.../cuda => backend/cuda/runtime}/cuda_module.cc | 24 +-
.../hexagon/codegen}/hexagon_fallback_module.cc | 4 +-
.../hexagon/codegen}/hexagon_fallback_module.h | 4 +-
.../hexagon/codegen}/llvm/codegen_hexagon.cc | 16 +-
.../hexagon/codegen}/llvm/intrin_rule_hexagon.cc | 35 +-
src/backend/hexagon/codegen/target_kind.cc | 64 +
.../hexagon => backend/hexagon/runtime}/README.md | 0
.../hexagon/runtime}/hexagon_buffer.cc | 0
.../hexagon/runtime}/hexagon_buffer.h | 0
.../hexagon/runtime}/hexagon_buffer_manager.h | 0
.../hexagon/runtime}/hexagon_common.cc | 0
.../hexagon/runtime}/hexagon_common.h | 0
.../hexagon/runtime}/hexagon_device_api.cc | 2 +-
.../hexagon/runtime}/hexagon_device_api.h | 0
.../hexagon/runtime}/hexagon_htp.cc | 0
.../hexagon/runtime}/hexagon_htp.h | 0
.../hexagon/runtime}/hexagon_hvx.cc | 0
.../hexagon/runtime}/hexagon_hvx.h | 0
.../hexagon/runtime}/hexagon_module.cc | 10 +-
.../hexagon/runtime}/hexagon_power_manager.cc | 0
.../hexagon/runtime}/hexagon_power_manager.h | 0
.../hexagon/runtime}/hexagon_thread_manager.cc | 0
.../hexagon/runtime}/hexagon_thread_manager.h | 0
.../hexagon/runtime}/hexagon_user_dma.cc | 0
.../hexagon/runtime}/hexagon_user_dma.h | 0
.../runtime}/hexagon_user_dma_descriptors.h | 0
.../runtime}/hexagon_user_dma_instructions.h | 0
.../hexagon/runtime}/hexagon_user_dma_registers.h | 0
.../hexagon/runtime}/hexagon_vtcm_pool.cc | 0
.../hexagon/runtime}/hexagon_vtcm_pool.h | 0
.../hexagon/runtime}/ops/conv2d.h | 0
.../hexagon/runtime}/ops/conv2d_fp16_hvx.cc | 0
.../hexagon/runtime}/ops/conv2d_quant_hvx.cc | 0
.../hexagon/runtime}/ops/conv_utils.cc | 0
.../hexagon/runtime}/profiler/README.md | 0
.../hexagon/runtime}/profiler/lwp_handler.S | 0
.../hexagon/runtime}/profiler/prof_utils.cc | 0
.../hexagon/runtime}/profiler/prof_utils.h | 0
.../hexagon/runtime}/qhl/qhl_wrapper.cc | 0
.../hexagon/runtime}/ring_buffer.h | 0
.../hexagon/runtime}/rpc/android/session.cc | 6 +-
.../hexagon/runtime}/rpc/android_bash.sh.template | 0
.../hexagon/runtime}/rpc/hexagon/rpc_server.cc | 6 +-
.../hexagon/runtime}/rpc/hexagon_rpc.idl | 0
.../runtime}/rpc/simulator/hexagon_sim_proto.h | 0
.../hexagon/runtime}/rpc/simulator/rpc_server.cc | 2 +-
.../hexagon/runtime}/rpc/simulator/session.cc | 6 +-
.../metal/codegen}/codegen_metal.cc | 24 +-
.../metal/codegen}/codegen_metal.h | 2 +-
.../metal/codegen}/intrin_rule_metal.cc | 22 +-
.../metal/codegen}/metal_fallback_module.cc | 2 +-
.../metal/codegen}/metal_fallback_module.h | 4 +-
src/backend/metal/codegen/target_kind.cc | 63 +
src/backend/metal/op/target_builtin.cc | 66 +
.../metal => backend/metal/runtime}/metal_common.h | 2 +-
.../metal/runtime}/metal_device_api.mm | 0
.../metal/runtime}/metal_module.mm | 10 +-
.../opencl/codegen}/codegen_opencl.cc | 18 +-
.../opencl/codegen}/codegen_opencl.h | 2 +-
.../opencl/codegen}/intrin_rule_opencl.cc | 38 +-
.../opencl/codegen}/opencl_fallback_module.cc | 4 +-
.../opencl/codegen}/opencl_fallback_module.h | 4 +-
src/backend/opencl/codegen/target_kind.cc | 67 +
.../opencl/runtime}/opencl_common.h | 8 +-
.../opencl/runtime}/opencl_device_api.cc | 4 +-
.../opencl/runtime}/opencl_module.cc | 13 +-
.../opencl/runtime}/opencl_wrapper/README.md | 0
.../runtime}/opencl_wrapper/opencl_wrapper.cc | 0
.../opencl/runtime}/source_utils.h | 0
.../opencl => backend/opencl/runtime}/texture.h | 0
.../rocm/codegen}/llvm/codegen_amdgpu.cc | 14 +-
.../rocm/codegen}/llvm/intrin_rule_rocm.cc | 8 +-
.../rocm/codegen}/rocm_fallback_module.cc | 2 +-
.../rocm/codegen}/rocm_fallback_module.h | 4 +-
src/backend/rocm/codegen/target_kind.cc | 148 +
.../rocm => backend/rocm/runtime}/rocm_common.h | 2 +-
.../rocm/runtime}/rocm_device_api.cc | 0
.../rocm => backend/rocm/runtime}/rocm_module.cc | 8 +-
.../source => backend/trn/codegen}/codegen_trn.cc | 59 +-
.../source => backend/trn/codegen}/codegen_trn.h | 2 +-
.../trn/codegen/target_kind.cc} | 59 +-
.../trn.cc => backend/trn/op/target_builtin.cc} | 38 +-
.../trn/transform/lower_trainium_layout.cc} | 91 +-
.../vulkan/codegen}/build_vulkan.cc | 14 +-
.../vulkan/codegen}/codegen_spirv.cc | 21 +-
.../vulkan/codegen}/codegen_spirv.h | 4 +-
.../vulkan/codegen}/intrin_rule_spirv.cc | 26 +-
.../vulkan/codegen}/ir_builder.cc | 0
.../vulkan => backend/vulkan/codegen}/ir_builder.h | 0
.../vulkan/codegen}/spirv_support.cc | 0
.../vulkan/codegen}/spirv_support.h | 0
.../vulkan/codegen}/spirv_utils.cc | 4 +-
.../vulkan/codegen}/spirv_utils.h | 2 +-
src/backend/vulkan/codegen/target_kind.cc | 93 +
.../vulkan/codegen}/vulkan_fallback_module.cc | 8 +-
.../vulkan/codegen}/vulkan_fallback_module.h | 4 +-
.../vulkan => backend/vulkan/runtime}/README.md | 0
.../vulkan/runtime}/spirv_shader.h | 0
.../vulkan => backend/vulkan/runtime}/thread_map.h | 0
.../vulkan/runtime}/vulkan_amdrgp.cc | 0
.../vulkan/runtime}/vulkan_amdrgp.h | 0
.../vulkan/runtime}/vulkan_buffer.cc | 0
.../vulkan/runtime}/vulkan_buffer.h | 0
.../vulkan/runtime}/vulkan_common.cc | 0
.../vulkan/runtime}/vulkan_common.h | 0
.../vulkan/runtime}/vulkan_device.cc | 2 +-
.../vulkan/runtime}/vulkan_device.h | 0
.../vulkan/runtime}/vulkan_device_api.cc | 0
.../vulkan/runtime}/vulkan_device_api.h | 2 +-
.../vulkan/runtime}/vulkan_instance.cc | 2 +-
.../vulkan/runtime}/vulkan_instance.h | 0
.../vulkan/runtime}/vulkan_module.cc | 8 +-
.../vulkan/runtime}/vulkan_stream.cc | 2 +-
.../vulkan/runtime}/vulkan_stream.h | 0
.../vulkan/runtime}/vulkan_wrapped_func.cc | 4 +-
.../vulkan/runtime}/vulkan_wrapped_func.h | 6 +-
.../webgpu/codegen}/codegen_webgpu.cc | 20 +-
.../webgpu/codegen}/codegen_webgpu.h | 2 +-
.../webgpu/codegen}/intrin_rule_webgpu.cc | 20 +-
src/backend/webgpu/codegen/target_kind.cc | 80 +
.../webgpu/codegen}/webgpu_fallback_module.cc | 2 +-
.../webgpu/codegen}/webgpu_fallback_module.h | 4 +-
src/ir/diagnostic.cc | 368 --
src/ir/transform.cc | 115 +-
src/relax/analysis/struct_info_analysis.cc | 22 +-
src/relax/analysis/well_formed.cc | 207 +-
src/relax/ir/block_builder.cc | 11 -
src/relax/ir/transform.cc | 58 +-
src/relax/op/ccl/ccl.cc | 10 +-
src/relax/op/distributed/binary.h | 5 +-
src/relax/op/distributed/distributed.cc | 33 +-
src/relax/op/distributed/linear_algebra.cc | 24 +-
src/relax/op/distributed/manipulate.cc | 40 +-
src/relax/op/distributed/nn.cc | 11 +-
src/relax/op/distributed/op.cc | 8 +-
src/relax/op/distributed/statistical.cc | 7 +-
src/relax/op/distributed/unary.h | 7 +-
src/relax/op/image/resize.cc | 107 +-
src/relax/op/memory/view.cc | 13 +-
src/relax/op/nn/attention.cc | 38 +-
src/relax/op/nn/convolution.cc | 107 +-
src/relax/op/nn/nn.cc | 210 +-
src/relax/op/op.cc | 106 +-
src/relax/op/op_common.cc | 49 +-
src/relax/op/op_common.h | 82 +-
src/relax/op/tensor/binary.cc | 5 +-
src/relax/op/tensor/create.cc | 88 +-
src/relax/op/tensor/grad.cc | 9 +-
src/relax/op/tensor/index.cc | 25 +-
src/relax/op/tensor/linear_algebra.cc | 49 +-
src/relax/op/tensor/manipulate.cc | 522 +-
src/relax/op/tensor/qdq.cc | 69 +-
src/relax/op/tensor/sampling.cc | 71 +-
src/relax/op/tensor/search.cc | 14 +-
src/relax/op/tensor/ternary.cc | 18 +-
src/relax/op/vision/multibox_transform_loc.cc | 80 +-
src/relax/op/vision/nms.cc | 99 +-
src/relax/op/vision/roi_align.cc | 47 +-
src/relax/op/vision/roi_pool.cc | 43 +-
src/relax/transform/run_codegen.cc | 2 +-
src/relax/transform/static_plan_block_memory.cc | 2 +-
src/runtime/cuda/cuda_common.h | 69 -
src/runtime/cuda/l2_cache_flush.cc | 49 -
src/runtime/extra/contrib/clml/clml_runtime.h | 2 +-
src/runtime/extra/contrib/coreml/coreml_runtime.mm | 30 +-
.../extra/contrib/cublas/cublas_json_runtime.cc | 4 +-
src/runtime/extra/contrib/cublas/cublas_utils.cc | 5 +-
src/runtime/extra/contrib/cudnn/conv_backward.cc | 4 +-
src/runtime/extra/contrib/cudnn/conv_forward.cc | 2 +-
.../contrib/cudnn/cudnn_frontend/attention.cc | 4 +-
.../extra/contrib/cudnn/cudnn_json_runtime.cc | 8 +-
src/runtime/extra/contrib/cudnn/cudnn_utils.cc | 2 +-
src/runtime/extra/contrib/cudnn/cudnn_utils.h | 3 +-
src/runtime/extra/contrib/cudnn/softmax.cc | 2 +-
src/runtime/extra/contrib/curand/curand.cc | 2 +-
.../cutlass/fp16_group_gemm_runner_sm100.cuh | 3 +-
.../cutlass/fp16_group_gemm_runner_sm90.cuh | 3 +-
.../fp8_groupwise_scaled_gemm_runner_sm100.cuh | 4 +-
.../fp8_groupwise_scaled_gemm_runner_sm90.cuh | 4 +-
...p8_groupwise_scaled_group_gemm_runner_sm100.cuh | 4 +-
src/runtime/extra/contrib/cutlass/gemm_runner.cuh | 3 +-
.../extra/contrib/hipblas/hipblas_json_runtime.cc | 2 +-
src/runtime/extra/contrib/hipblas/hipblas_utils.cc | 2 +-
src/runtime/extra/contrib/nvshmem/dist_gemm.cu | 25 +-
src/runtime/extra/contrib/nvshmem/init.cc | 7 +-
.../extra/contrib/nvshmem/memory_allocator.cc | 2 +-
.../extra/contrib/random/mt_random_engine.cc | 28 +-
.../extra/contrib/tensorrt/tensorrt_calibrator.h | 19 +-
src/runtime/extra/contrib/thrust/thrust.cu | 5 +-
.../extra/disco/cuda_ipc/cuda_ipc_memory.cc | 39 +-
src/runtime/extra/disco/nccl/nccl_context.h | 19 +-
src/runtime/extra/disco/protocol.h | 4 +-
src/runtime/rpc/rpc_endpoint.cc | 4 +-
src/runtime/vm/attn_utils.h | 2 +-
src/runtime/vm/cuda/cuda_graph_builtin.cc | 20 +-
src/runtime/vm/hexagon/builtin.cc | 2 +-
src/s_tir/backend/adreno/inject_texture_alloc.cc | 2 +-
src/s_tir/backend/adreno/texture_flatten.cc | 2 +-
src/s_tir/meta_schedule/cost_model/cost_model.cc | 10 +-
.../feature_extractor/feature_extractor.cc | 4 +-
.../measure_callback/measure_callback.cc | 6 +-
src/s_tir/meta_schedule/mutator/mutator.cc | 4 +-
src/s_tir/meta_schedule/postproc/postproc.cc | 4 +-
.../postproc/rewrite_cooperative_fetch.cc | 7 +-
.../meta_schedule/schedule_rule/schedule_rule.cc | 4 +-
.../meta_schedule/task_scheduler/task_scheduler.cc | 13 +-
src/s_tir/schedule/analysis/analysis.cc | 5 +-
src/s_tir/transform/compact_buffer_region.cc | 2 +
src/s_tir/transform/inject_permuted_layout.cc | 11 +-
src/s_tir/transform/inject_ptx_async_copy.cc | 10 +-
src/s_tir/transform/inject_ptx_ldg32.cc | 4 +-
src/s_tir/transform/inject_software_pipeline.cc | 17 +-
src/s_tir/transform/memhammer_coalesce.cc | 1 -
src/s_tir/transform/memhammer_lower_auto_copy.cc | 6 +-
.../transform/memhammer_tensorcore_rewrite.cc | 7 +-
.../transform/merge_shared_memory_allocations.cc | 4 +-
src/s_tir/transform/tensorcore_infer_fragment.cc | 13 +-
src/target/datatype/myfloat/myfloat.cc | 144 -
src/target/datatype/posit/posit-wrapper.cc | 242 -
src/target/datatype/registry.cc | 138 -
src/target/datatype/registry.h | 182 -
src/target/llvm/codegen_llvm.cc | 7 +-
src/target/llvm/codegen_params.cc | 3 +-
src/target/opt/README | 1 -
src/target/tag.cc | 13 -
src/target/target_kind.cc | 318 --
src/tirx/analysis/filter_canonical.cc | 5 +-
src/tirx/ir/data_type_rewriter.cc | 9 +-
src/tirx/ir/exec_scope.cc | 4 +-
src/tirx/ir/stmt.cc | 5 +-
src/tirx/op/builtin.cc | 153 -
src/tirx/op/op.cc | 72 +-
src/tirx/script/builder/frame.cc | 3 +-
src/tirx/transform/common_subexpr_elim.cc | 26 +-
src/tirx/transform/lower_custom_datatypes.cc | 266 -
src/tirx/transform/lower_intrin.cc | 4 +-
src/tirx/transform/lower_tvm_builtin.cc | 24 +-
src/tirx/transform/lower_warp_memory.cc | 51 +-
src/tirx/transform/remove_no_op.cc | 11 +-
src/tirx/transform/tile_primitive_dispatch.cc | 5 +-
.../test_runtime_ndarray.py | 77 -
tests/python/arith/test_arith_analyzer_object.py | 2 +-
tests/python/arith/test_arith_rewrite_simplify.py | 8 +-
tests/python/arith/test_arith_simplify.py | 4 +-
tests/python/ci/test_utils.py | 4 +
tests/python/codegen/test_target_codegen.py | 8 +-
.../python/codegen/test_target_codegen_aarch64.py | 27 +-
tests/python/codegen/test_target_codegen_blob.py | 3 +
.../codegen/test_target_codegen_cross_llvm.py | 2 +-
tests/python/codegen/test_target_codegen_cuda.py | 2 +-
tests/python/codegen/test_target_codegen_llvm.py | 8 +-
.../python/codegen/test_target_codegen_llvm_vla.py | 7 +-
tests/python/codegen/test_target_codegen_vulkan.py | 76 +-
.../contrib/test_android/test_meta_schedule.py | 2 +
tests/python/contrib/test_cblas.py | 2 +
tests/python/contrib/test_hexagon/conftest.py | 12 +-
.../test_hexagon/test_2d_physical_buffers.py | 355 --
.../test_hexagon/test_benchmark_maxpool2d.py | 2 +
.../test_hexagon/test_fixed_point_conversion.py | 70 -
.../contrib/test_hexagon/test_meta_schedule.py | 2 +
.../contrib/test_hexagon/test_relax_integration.py | 2 +
tests/python/contrib/test_hexagon/test_take.py | 6 +-
tests/python/contrib/test_hexagon/test_vtcm.py | 2 +-
tests/python/contrib/test_memoize.py | 8 +-
tests/python/contrib/test_random.py | 3 +-
tests/python/contrib/test_rpc_tracker.py | 2 +-
.../python/contrib/test_tir_triton_integration.py | 10 +-
tests/python/disco/test_ccl.py | 5 +-
tests/python/disco/test_custom_allreduce.py | 5 +-
tests/python/disco/test_loader.py | 10 +
tests/python/disco/test_nvshmem.py | 206 +-
tests/python/disco/test_session.py | 109 +-
tests/python/ir/test_roundtrip_runtime_module.py | 1 -
tests/python/nightly/test_nnapi/conftest.py | 24 +-
tests/python/nightly/test_nnapi/test_network.py | 5 +-
.../relax/backend/adreno/test_texture_network.py | 5 +-
.../distributed/test_distributed_dtensor_sinfo.py | 3 +-
.../test_runtime_builtin_kv_cache_transfer.py | 3 +
.../relax/test_analysis_struct_info_analysis.py | 8 +-
.../test_analysis_suggest_layout_transforms.py | 4 +-
tests/python/relax/test_analysis_well_formed.py | 142 +-
tests/python/relax/test_bind_params.py | 4 +-
tests/python/relax/test_bind_symbolic_vars.py | 10 +-
tests/python/relax/test_binding_rewrite.py | 3 +-
tests/python/relax/test_blockbuilder_core.py | 14 +-
tests/python/relax/test_codegen_coreml.py | 48 +-
tests/python/relax/test_codegen_cublas.py | 3 +
tests/python/relax/test_codegen_cudnn.py | 3 +
tests/python/relax/test_codegen_cutlass.py | 3 +
tests/python/relax/test_codegen_hipblas.py | 3 +
tests/python/relax/test_e2e_op_dynamic.py | 3 +
tests/python/relax/test_expr.py | 2 +-
tests/python/relax/test_fast_math_transform.py | 2 +-
.../relax/test_frontend_from_exported_program.py | 330 +-
tests/python/relax/test_frontend_from_fx.py | 78 +-
.../python/relax/test_frontend_nn_extern_module.py | 16 +-
tests/python/relax/test_frontend_onnx.py | 74 +-
tests/python/relax/test_frontend_onnx_backend.py | 4 +
.../relax/test_meta_schedule_relax_integration.py | 2 +
tests/python/relax/test_op_binary.py | 4 +-
tests/python/relax/test_op_ccl.py | 2 +-
tests/python/relax/test_op_create.py | 74 +-
tests/python/relax/test_op_datatype.py | 6 +-
tests/python/relax/test_op_distributed.py | 5 +-
tests/python/relax/test_op_grad.py | 9 +-
tests/python/relax/test_op_image.py | 75 +-
tests/python/relax/test_op_index.py | 54 +-
tests/python/relax/test_op_linear_algebra.py | 18 +-
tests/python/relax/test_op_manipulate.py | 362 +-
tests/python/relax/test_op_nn.py | 168 +-
tests/python/relax/test_op_nn_convolution.py | 176 +-
tests/python/relax/test_op_nn_pooling.py | 164 +-
tests/python/relax/test_op_search.py | 30 +-
tests/python/relax/test_op_set.py | 12 +-
tests/python/relax/test_op_sort.py | 6 +-
tests/python/relax/test_op_statistical.py | 24 +-
tests/python/relax/test_op_ternary.py | 14 +-
tests/python/relax/test_op_unary.py | 10 +-
tests/python/relax/test_op_view.py | 18 +-
tests/python/relax/test_op_vision.py | 69 +-
tests/python/relax/test_relax_operators.py | 3 +-
...runtime_builtin_paged_attention_kv_cache_cpu.py | 3 +
..._builtin_paged_attention_kv_cache_flashinfer.py | 3 +-
.../python/relax/test_runtime_builtin_rnn_state.py | 2 +-
tests/python/relax/test_struct_info.py | 6 +-
tests/python/relax/test_training_append_loss.py | 15 +-
tests/python/relax/test_training_setup_trainer.py | 4 +-
.../python/relax/test_training_trainer_numeric.py | 6 +-
.../relax/test_transform_bind_symbolic_vars.py | 2 +-
.../relax/test_transform_bundle_model_params.py | 4 +
tests/python/relax/test_transform_codegen_pass.py | 4 +-
.../relax/test_transform_error_enrichment.py | 103 +
tests/python/relax/test_transform_fold_constant.py | 2 +-
.../relax/test_transform_fuse_ops_by_pattern.py | 6 +-
tests/python/relax/test_transform_fuse_tir.py | 2 +-
tests/python/relax/test_transform_gradient.py | 31 +-
.../relax/test_transform_gradient_te_register.py | 2 +-
tests/python/relax/test_transform_lambda_lift.py | 2 +-
.../test_transform_legalize_ops_manipulate.py | 2 +-
...st_transform_legalize_ops_search_statistical.py | 32 +
.../relax/test_transform_lift_transform_params.py | 9 +-
.../relax/test_transform_normalize_global_var.py | 8 +-
...st_transform_operator_specific_normalization.py | 6 +-
.../relax/test_transform_rewrite_cuda_graph.py | 4 +-
.../test_transform_static_plan_block_memory.py | 6 +-
tests/python/relax/test_tvmscript_parser.py | 4 +-
tests/python/relax/test_vm_build.py | 9 +-
tests/python/relax/test_vm_builtin.py | 2 +-
tests/python/relax/test_vm_builtin_lower.py | 2 +-
tests/python/relax/test_vm_cuda_graph.py | 2 +-
tests/python/relax/test_vm_execbuilder.py | 4 +-
tests/python/relax/test_vm_multi_device.py | 4 +
.../python/runtime/test_evaluator_with_preproc.py | 60 -
tests/python/runtime/test_runtime_module_load.py | 4 +
tests/python/runtime/test_runtime_rpc.py | 4 +-
tests/python/s_tir/dlight/test_benchmark.py | 2 +
.../s_tir/dlight/test_gpu_general_reduction.py | 16 +-
.../s_tir/dlight/test_gpu_matmul_tensorize.py | 20 +-
.../meta_schedule/test_meta_schedule_builder.py | 1 +
.../meta_schedule/test_meta_schedule_cost_model.py | 37 +-
.../test_meta_schedule_feature_extractor.py | 17 -
.../test_meta_schedule_measure_callback.py | 20 -
.../test_meta_schedule_post_order_apply.py | 3 +-
.../test_meta_schedule_postproc_rewrite_layout.py | 4 +-
...e_postproc_rewrite_parallel_vectorize_unroll.py | 14 +-
.../meta_schedule/test_meta_schedule_runner.py | 2 +
...test_meta_schedule_schedule_rule_add_rfactor.py | 28 +-
...test_meta_schedule_schedule_rule_auto_inline.py | 4 +-
...chedule_schedule_rule_cross_thread_reduction.py | 36 +-
.../meta_schedule/test_meta_schedule_space_cuda.py | 18 +-
.../test_meta_schedule_space_cuda_async.py | 2 +-
.../test_meta_schedule_space_generator.py | 3 +-
.../meta_schedule/test_meta_schedule_tune_tir.py | 2 +
.../schedule/test_tir_schedule_compute_inline.py | 8 +-
.../s_tir/schedule/test_tir_schedule_rfactor.py | 2 +-
.../schedule/test_tir_schedule_transform_layout.py | 4 +-
.../s_tir/schedule/test_tir_schedule_utilities.py | 4 +-
.../test_s_tir_transform_compact_buffer_region.py | 18 +-
...est_s_tir_transform_convert_blocks_to_opaque.py | 2 +-
.../test_s_tir_transform_default_gpu_schedule.py | 4 +-
.../test_s_tir_transform_hoist_expression.py | 10 +-
.../test_s_tir_transform_inject_ptx_async_copy.py | 216 +-
...est_s_tir_transform_inject_software_pipeline.py | 2 +-
...s_tir_transform_lower_cross_thread_reduction.py | 30 +-
.../test_s_tir_transform_lower_match_buffer.py | 7 +-
.../test_s_tir_transform_lower_opaque_block.py | 41 +-
...test_s_tir_transform_lower_thread_all_reduce.py | 14 +-
.../transform/test_s_tir_transform_remove_undef.py | 9 +-
tests/python/support/test_popen_pool.py | 4 +-
tests/python/target/test_target_target.py | 2 +-
tests/python/te/test_te_verify_compute.py | 13 +-
tests/python/tirx-base/test_tir_base.py | 5 +-
tests/python/tirx-base/test_tir_constructor.py | 3 +-
tests/python/tirx-base/test_tir_imm_values.py | 33 +-
tests/python/tirx-base/test_tir_index_map.py | 2 +-
tests/python/tirx-base/test_tir_intrin.py | 6 +-
tests/python/tirx-base/test_tir_nodes.py | 22 +-
tests/python/tirx-base/test_tir_op_types.py | 15 +-
tests/python/tirx-base/test_tir_ops.py | 2 +-
.../python/tirx-base/test_tir_scalable_datatype.py | 5 +-
.../test_tir_unsafe_hide_buffer_access.py | 4 +-
...test_tir_transform_force_narrow_index_to_i32.py | 5 +-
.../test_tir_transform_lower_tvm_builtin.py | 4 +-
.../tirx-transform/test_tir_transform_vectorize.py | 7 +-
.../tile_primitive/cuda/copy/test_fallback.py | 4 +-
.../tile_primitive/cuda/copy/test_gmem_smem.py | 2 +-
.../tile_primitive/cuda/copy/test_swizzle_iter.py | 10 +-
.../tile_primitive/cuda/copy_async/test_dsmem.py | 2 +-
.../cuda/copy_async/test_smem_tmem.py | 2 +-
.../tile_primitive/cuda/copy_async/test_tma.py | 8 +-
.../tile_primitive/cuda/elementwise/test_unary.py | 10 +-
.../cuda/gemm_async/test_gemm_async.py | 92 +-
.../cuda/permute_layout/test_permute_layout.py | 4 +-
.../tile_primitive/trn/test_compose_op_trn.py | 14 +-
.../operator/tile_primitive/trn/test_copy_trn.py | 8 +-
.../operator/tile_primitive/trn/test_gemm_trn.py | 4 +-
.../tile_primitive/trn/test_private_alloc_trn.py | 2 +-
.../tile_primitive/trn/test_reduction_trn.py | 4 +-
.../operator/tile_primitive/trn/test_unary_trn.py | 2 +-
tests/python/tirx/test_alloc_pool.py | 6 +-
tests/python/tirx/test_bench_utils.py | 2 +
tests/python/tirx/test_layout.py | 10 +-
tests/python/tirx/test_op_namespace_cleanup.py | 121 +
tests/python/tirx/test_printer_tir_namespaces.py | 198 +-
...t_expr_functor.py => test_tirx_expr_functor.py} | 0
.../transform/test_transform_naive_allocator.py | 2 +-
.../tvmscript/test_tvmscript_error_report.py | 127 +-
.../python/tvmscript/test_tvmscript_parser_tir.py | 2 +-
.../python/tvmscript/test_tvmscript_printer_ir.py | 4 +-
tests/scripts/release/test_release_package.sh | 46 +-
tests/scripts/task_python_unittest.sh | 1 +
web/package-lock.json | 139 +-
727 files changed, 9025 insertions(+), 21605 deletions(-)
delete mode 100644 3rdparty/nvbench/l2_cache_flush.h
delete mode 100644 cmake/modules/contrib/Posit.cmake
delete mode 100644 include/tvm/ir/diagnostic.h
delete mode 100644 include/tvm/tirx/target_builtin/cuda.h
delete mode 100644 include/tvm/tirx/target_builtin/trn.h
delete mode 100644 licenses/LICENSE.l2_cache_flush.txt
copy python/tvm/{topi/cpp/vision => backend}/__init__.py (75%)
create mode 100644 python/tvm/backend/_autoload_backends.py
copy python/tvm/{tirx/operator/tile_primitive/trn/reduction/default.py =>
backend/adreno/__init__.py} (59%)
rename python/tvm/{target/tag_registry/adreno.py =>
backend/adreno/target_tags.py} (97%)
create mode 100644 python/tvm/backend/cuda/__init__.py
create mode 100644 python/tvm/backend/cuda/lang/__init__.py
copy python/tvm/{tirx => backend/cuda}/lang/alloc_pool.py (97%)
copy python/tvm/{tirx => backend/cuda}/lang/pipeline.py (100%)
copy python/tvm/{tirx => backend/cuda}/lang/smem_desc.py (95%)
copy python/tvm/{tirx => backend/cuda}/lang/tile_scheduler.py (100%)
copy python/tvm/{tirx => backend/cuda}/lang/warp_role.py (100%)
copy python/tvm/{tirx => backend/cuda}/op.py (56%)
copy python/tvm/{contrib/hexagon => backend/cuda/operator}/__init__.py (88%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/__init__.py (100%)
rename python/tvm/{tirx => backend/cuda}/operator/intrinsics/_schema.py (97%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/cp_async.py (97%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/header.py (100%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/math.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/memory.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/misc.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/mma.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/nvshmem.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/registry.py (100%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/sync.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/tcgen05.py (99%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/types.py (100%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/utils.py (100%)
rename python/tvm/{tirx/operator/intrinsics/cuda =>
backend/cuda/operator/intrinsics}/wgmma.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/__init__.py (89%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/common.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/_common.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/_swizzle_iter.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/fallback.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/gmem_smem.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/ld_stmatrix.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/reg.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy/utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/dsmem.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/ldgsts.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/tcgen05_cp.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/tcgen05_ldst.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/tma.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/copy_async/utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/__init__.py (95%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/_common.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/ops/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/ops/binary.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/ops/cast.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/ops/fma.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/ops/unary.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/reg.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/register.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/smem.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/vec_emit/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/vec_emit/binary_f32x2.py
(100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/vec_emit/cast_vec2.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/elementwise/vec_emit/fma_f32x2.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/exec_scope_utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/gemm/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/gemm/mma_m16n8k_.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/gemm_async/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/gemm_async/tcgen05.py (83%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/gemm_utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/layout_utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/permute_layout/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/permute_layout/warp_xor_swizzle.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/reduction/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/reduction/local.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/reduction/shared.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/reduction/sm100_packed.py (99%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/reduction/utils.py (97%)
rename python/tvm/{tirx/operator/tile_primitive/cuda =>
backend/cuda/operator/tile_primitive}/tma_utils.py (100%)
create mode 100644 python/tvm/backend/cuda/script.py
rename python/tvm/{target/tag_registry/cuda.py => backend/cuda/target_tags.py}
(99%)
create mode 100644 python/tvm/backend/hexagon/__init__.py
rename python/tvm/{target/tag_registry/hexagon.py =>
backend/hexagon/target_tags.py} (98%)
create mode 100644 python/tvm/backend/loader.py
create mode 100644 python/tvm/backend/metal/__init__.py
create mode 100644 python/tvm/backend/metal/op.py
create mode 100644 python/tvm/backend/metal/script.py
rename python/tvm/{target/tag_registry/metal.py =>
backend/metal/target_tags.py} (94%)
create mode 100644 python/tvm/backend/opencl/__init__.py
create mode 100644 python/tvm/backend/rocm/__init__.py
create mode 100644 python/tvm/backend/trn/__init__.py
create mode 100644 python/tvm/backend/trn/layout.py
create mode 100644 python/tvm/backend/trn/op.py
copy python/tvm/{relax/frontend/onnx => backend/trn/operator}/__init__.py (87%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/binary/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/binary/default.py (97%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/binary/utils.py (97%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/common.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/binary_chain.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/binary_reduce.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/compose_op.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/reduce_negate.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/unary_reduce.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/compose_op/utils.py (95%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/copy/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/copy/default.py (96%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/dim_utils.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/gemm/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/gemm/default.py (97%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/instruction_generator.py (98%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/private_alloc.py (96%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/reduction/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/reduction/default.py (94%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/reduction/utils.py (95%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/select/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/select/default.py (97%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/unary/__init__.py (100%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/unary/default.py (96%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/unary/utils.py (96%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/unary/with_bias_scale.py (96%)
rename python/tvm/{tirx/operator/tile_primitive/trn =>
backend/trn/operator/tile_primitive}/workspace_utils.py (100%)
create mode 100644 python/tvm/backend/trn/pipeline.py
create mode 100644 python/tvm/backend/trn/script.py
copy python/tvm/{relax/testing/runtime_builtin.py =>
backend/trn/target_tags.py} (62%)
rename python/tvm/{tirx/transform/trn => backend/trn/transform}/__init__.py
(68%)
rename python/tvm/{tirx/transform/trn =>
backend/trn/transform}/naive_allocator.py (98%)
rename python/tvm/{tirx/transform/trn =>
backend/trn/transform}/private_buffer_alloc.py (100%)
create mode 100644 python/tvm/backend/vulkan/__init__.py
copy python/tvm/{s_tir/backend/adreno/transform/_ffi_api.py =>
backend/webgpu/__init__.py} (83%)
delete mode 100644 python/tvm/ffi.py
delete mode 100644 python/tvm/ir/diagnostics/__init__.py
delete mode 100644 python/tvm/ir/diagnostics/_ffi_api.py
delete mode 100644 python/tvm/target/datatype.py
delete mode 100644 python/tvm/target/x86.py
delete mode 100644 python/tvm/tirx/backend/adreno/__init__.py
rename src/{target/cuda => backend/cuda/codegen}/codegen_cuda.cc (96%)
rename src/{target/cuda => backend/cuda/codegen}/codegen_cuda.h (99%)
rename src/{target/cuda => backend/cuda/codegen}/cuda_fallback_module.cc (99%)
rename src/{target/cuda => backend/cuda/codegen}/cuda_fallback_module.h (98%)
rename src/{target/cuda => backend/cuda/codegen}/intrin_rule_cuda.cc (89%)
rename src/{target/cuda => backend/cuda/codegen}/literal/cuda_half_t.h (100%)
rename src/{target/cuda => backend/cuda/codegen}/literal/cuda_int8_t.h (100%)
rename src/{target/cuda => backend/cuda/codegen}/llvm/codegen_nvptx.cc (97%)
rename src/{target => backend/cuda/codegen}/llvm/intrin_rule_nvptx.cc (98%)
rename src/{target/cuda => backend/cuda/codegen}/ptx.cc (97%)
rename src/{target/cuda => backend/cuda/codegen}/ptx.h (100%)
create mode 100644 src/backend/cuda/codegen/target_kind.cc
rename src/{tirx/op/target_builtin/cuda.cc =>
backend/cuda/op/target_builtin.cc} (94%)
rename src/{runtime/cuda => backend/cuda/runtime}/cuda_device_api.cc (84%)
rename src/{runtime/cuda => backend/cuda/runtime}/cuda_module.cc (94%)
rename src/{target/hexagon =>
backend/hexagon/codegen}/hexagon_fallback_module.cc (97%)
rename src/{target/hexagon =>
backend/hexagon/codegen}/hexagon_fallback_module.h (98%)
rename src/{target/hexagon => backend/hexagon/codegen}/llvm/codegen_hexagon.cc
(98%)
rename src/{target/hexagon =>
backend/hexagon/codegen}/llvm/intrin_rule_hexagon.cc (89%)
create mode 100644 src/backend/hexagon/codegen/target_kind.cc
rename src/{runtime/hexagon => backend/hexagon/runtime}/README.md (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_buffer.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_buffer.h (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_buffer_manager.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_common.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_common.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_device_api.cc
(99%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_device_api.h
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_htp.cc (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_htp.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_hvx.cc (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_hvx.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_module.cc (94%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_power_manager.cc (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_power_manager.h (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_thread_manager.cc (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_thread_manager.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_user_dma.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_user_dma.h
(100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_user_dma_descriptors.h (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_user_dma_instructions.h (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/hexagon_user_dma_registers.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_vtcm_pool.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/hexagon_vtcm_pool.h
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/ops/conv2d.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/ops/conv2d_fp16_hvx.cc
(100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/ops/conv2d_quant_hvx.cc (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/ops/conv_utils.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/profiler/README.md
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/profiler/lwp_handler.S
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/profiler/prof_utils.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/profiler/prof_utils.h
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/qhl/qhl_wrapper.cc
(100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/ring_buffer.h (100%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/rpc/android/session.cc
(96%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/rpc/android_bash.sh.template (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/rpc/hexagon/rpc_server.cc (98%)
rename src/{runtime/hexagon => backend/hexagon/runtime}/rpc/hexagon_rpc.idl
(100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/rpc/simulator/hexagon_sim_proto.h (100%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/rpc/simulator/rpc_server.cc (99%)
rename src/{runtime/hexagon =>
backend/hexagon/runtime}/rpc/simulator/session.cc (99%)
rename src/{target/metal => backend/metal/codegen}/codegen_metal.cc (95%)
rename src/{target/metal => backend/metal/codegen}/codegen_metal.h (98%)
rename src/{target/metal => backend/metal/codegen}/intrin_rule_metal.cc (91%)
rename src/{target/metal => backend/metal/codegen}/metal_fallback_module.cc
(99%)
rename src/{target/metal => backend/metal/codegen}/metal_fallback_module.h
(98%)
create mode 100644 src/backend/metal/codegen/target_kind.cc
create mode 100644 src/backend/metal/op/target_builtin.cc
rename src/{runtime/metal => backend/metal/runtime}/metal_common.h (99%)
rename src/{runtime/metal => backend/metal/runtime}/metal_device_api.mm (100%)
rename src/{runtime/metal => backend/metal/runtime}/metal_module.mm (98%)
rename src/{target/opencl => backend/opencl/codegen}/codegen_opencl.cc (98%)
rename src/{target/opencl => backend/opencl/codegen}/codegen_opencl.h (98%)
rename src/{target/opencl => backend/opencl/codegen}/intrin_rule_opencl.cc
(93%)
rename src/{target/opencl => backend/opencl/codegen}/opencl_fallback_module.cc
(97%)
rename src/{target/opencl => backend/opencl/codegen}/opencl_fallback_module.h
(98%)
create mode 100644 src/backend/opencl/codegen/target_kind.cc
rename src/{runtime/opencl => backend/opencl/runtime}/opencl_common.h (99%)
rename src/{runtime/opencl => backend/opencl/runtime}/opencl_device_api.cc
(99%)
rename src/{runtime/opencl => backend/opencl/runtime}/opencl_module.cc (97%)
rename src/{runtime/opencl => backend/opencl/runtime}/opencl_wrapper/README.md
(100%)
rename src/{runtime/opencl =>
backend/opencl/runtime}/opencl_wrapper/opencl_wrapper.cc (100%)
rename src/{runtime/opencl => backend/opencl/runtime}/source_utils.h (100%)
rename src/{runtime/opencl => backend/opencl/runtime}/texture.h (100%)
rename src/{target/rocm => backend/rocm/codegen}/llvm/codegen_amdgpu.cc (97%)
rename src/{target/rocm => backend/rocm/codegen}/llvm/intrin_rule_rocm.cc (97%)
rename src/{target/rocm => backend/rocm/codegen}/rocm_fallback_module.cc (99%)
rename src/{target/rocm => backend/rocm/codegen}/rocm_fallback_module.h (98%)
create mode 100644 src/backend/rocm/codegen/target_kind.cc
rename src/{runtime/rocm => backend/rocm/runtime}/rocm_common.h (98%)
rename src/{runtime/rocm => backend/rocm/runtime}/rocm_device_api.cc (100%)
rename src/{runtime/rocm => backend/rocm/runtime}/rocm_module.cc (98%)
rename src/{target/source => backend/trn/codegen}/codegen_trn.cc (91%)
rename src/{target/source => backend/trn/codegen}/codegen_trn.h (98%)
copy src/{tirx/transform/skip_assert.cc => backend/trn/codegen/target_kind.cc}
(54%)
rename src/{tirx/op/target_builtin/trn.cc => backend/trn/op/target_builtin.cc}
(84%)
copy src/{tirx/transform/lower_tirx_cleanup.cc =>
backend/trn/transform/lower_trainium_layout.cc} (83%)
rename src/{target/vulkan => backend/vulkan/codegen}/build_vulkan.cc (88%)
rename src/{target/vulkan => backend/vulkan/codegen}/codegen_spirv.cc (98%)
rename src/{target/vulkan => backend/vulkan/codegen}/codegen_spirv.h (98%)
rename src/{target/vulkan => backend/vulkan/codegen}/intrin_rule_spirv.cc (92%)
rename src/{target/vulkan => backend/vulkan/codegen}/ir_builder.cc (100%)
rename src/{target/vulkan => backend/vulkan/codegen}/ir_builder.h (100%)
rename src/{target/vulkan => backend/vulkan/codegen}/spirv_support.cc (100%)
rename src/{target/vulkan => backend/vulkan/codegen}/spirv_support.h (100%)
rename src/{target/vulkan => backend/vulkan/codegen}/spirv_utils.cc (98%)
rename src/{target/vulkan => backend/vulkan/codegen}/spirv_utils.h (97%)
create mode 100644 src/backend/vulkan/codegen/target_kind.cc
rename src/{target/vulkan => backend/vulkan/codegen}/vulkan_fallback_module.cc
(95%)
rename src/{target/vulkan => backend/vulkan/codegen}/vulkan_fallback_module.h
(98%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/README.md (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/spirv_shader.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/thread_map.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_amdrgp.cc (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_amdrgp.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_buffer.cc (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_buffer.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_common.cc (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_common.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_device.cc (99%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_device.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_device_api.cc
(100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_device_api.h (99%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_instance.cc (99%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_instance.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_module.cc (92%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_stream.cc (99%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_stream.h (100%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_wrapped_func.cc
(99%)
rename src/{runtime/vulkan => backend/vulkan/runtime}/vulkan_wrapped_func.h
(97%)
rename src/{target/webgpu => backend/webgpu/codegen}/codegen_webgpu.cc (98%)
rename src/{target/webgpu => backend/webgpu/codegen}/codegen_webgpu.h (98%)
rename src/{target/webgpu => backend/webgpu/codegen}/intrin_rule_webgpu.cc
(92%)
create mode 100644 src/backend/webgpu/codegen/target_kind.cc
rename src/{target/webgpu => backend/webgpu/codegen}/webgpu_fallback_module.cc
(99%)
rename src/{target/webgpu => backend/webgpu/codegen}/webgpu_fallback_module.h
(98%)
delete mode 100644 src/ir/diagnostic.cc
delete mode 100644 src/runtime/cuda/cuda_common.h
delete mode 100644 src/runtime/cuda/l2_cache_flush.cc
delete mode 100644 src/target/datatype/myfloat/myfloat.cc
delete mode 100644 src/target/datatype/posit/posit-wrapper.cc
delete mode 100644 src/target/datatype/registry.cc
delete mode 100644 src/target/datatype/registry.h
delete mode 100644 src/target/opt/README
delete mode 100644 src/tirx/transform/lower_custom_datatypes.cc
delete mode 100644 tests/python/all-platform-minimal-test/test_runtime_ndarray.py
delete mode 100644 tests/python/contrib/test_hexagon/test_2d_physical_buffers.py
delete mode 100644 tests/python/contrib/test_hexagon/test_fixed_point_conversion.py
create mode 100644 tests/python/relax/test_transform_error_enrichment.py
delete mode 100644 tests/python/runtime/test_evaluator_with_preproc.py
rename tests/python/tirx/transform/{test_expr_functor.py =>
test_tirx_expr_functor.py} (100%)