This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch unity-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git


    from 9a3bbc1d37 [MERGE] merge main into unity 2023-05-20
     add 7d2579e62e [Unity][WebGPU] Try F16 support for WebGPU Backend (#14904)
     add afa53ded93 [Unity][Transform] Fix scalar case in DefaultGPUSchedule (#14928)
     add 6bb531fcb1 [Unity] Allow eliminating only call nodes in CSE pass (#14895)
     add 17be2d71ad [Unity] Support causal mask for `R.nn.attention` (#14907)
     add 4b072e05f6 [Unity][WebGPU] Fix WebGPU adapter requiring features (#14933)
     add eb60d4cac1 [Unity] Cleanup Web runtime wasi (#14929)
     add f15f612aa7 [Unity] Improve caching logic in webruntime (#14940)
     add 286c6722e0 [Unity] Enhance web cache to add scopes (#14953)
     add 66863f53f6 [Unity] Fix MergeCompositeFunctions for non-CallNode dataflow inputs (#14959)
     add ec127cb0d6 [Unity][BYOC] Make CUTLASS attention rewriting aware of fp16 <-> f32 casting (#14957)
     add dd2f452e4d [Unity][Transform] Fix bug for tir expression in shape in fuse_tir (#14931)
     add 7d78b278d3 [Unity] support update KV cache (#14964)
     add 240111fe64 [Unity][CUTLASS] Attention hot fix (#14966)
     add be9f0781a9 [Unity][Frontend] Translate StableHLO to Relax (#14460)
     add cb3bf4014a [Unity][Training] Enhance op gradient (#14932)
     add c448c50cd1 [Unity][CUTLASS] Support more residual input shape (#14968)
     add 9c0b41bc74 [Unity] Allow filtering out unwanted branches in matmul combining pass (#14971)
     add a2bddcf3b2 [Unity] Add popn to kvcache (#14970)
     add f796b69a3b [Unity] Fix LazyTransformParams use-def analysis and binding emission (#14974)
     add 34e87ae7f8 [Unity] Fix ConvertLayout on binary elemwise ops involving scalar input (#14961)
     add 44c849f144 [Unity] Reset match state when backtracking (#14984)
     add 0a0b11982e [Unity][Training] Avoid problematic inputs to nll_loss in test_op_gradient_numeric (#14987)
     add f360556a0f [Unity][CUTLASS] Fix circular import bug in relax cutlass backend (#15001)
     add 68800fa810 [Contrib] Use f-strings for string formatting, NFC (#14893)
     add 5a094bce74 [TIR] Expand unit tests for ConvertSSA (#14892)
     add 53cee4bca3 [TVMScript] Round-trip DeclBuffer with undefined data pointer (#14900)
     add cd4551353b [LLVM] Codegen subroutine call when CallNode::op is GlobalVar (#14901)
     add dddc339d2f [Test Cases][BugFix] Add some version check to make test cases run in all PyTorch versions (#14903)
     add 4f99750611 [TypoFix] fix some typo problem in keras frontend (#14916)
     add d776bccc1e [Relay] add a dimension check to reject invalid input (#14925)
     add 35ac2262eb [Docker] Add polly package (#14912)
     add 172120a952 [TestCases] fix unreachable test cases due to outside the for-loop (#14934)
     add d9c1ba60f4 [microNPU][ETHOSU] Add offloading to the NPU the nn.avg_pool2d operator with a stride > 3 (#14861)
     add 1b9678cee0 [TVMScript] Allow T.target("device", host="host") to specify host (#14915)
     add dbcd198670 [BUILD] Enable log before throw message in windows (#14937)
     add e11913be06 [Target] Add target to all TVM callbacks (#14939)
     add 1c39613811 [LLVM] Expose Host CPU Feature Detection (#14946)
     add bcf7abba29 Fix pytorch axis (#14930)
     add 41a616ffba [TIR] Handle subroutine calls in MakePackedAPI (#14913)
     add 94f4e25a40 [TIR] Handle subroutine calls in MakeUnpackedAPI (#14914)
     add 5fd49f78ad [TIR] Restrict tir.transform.CombineContextCall to host functions (#14945)
     add 86ba26d854 [Bugfix][TIR] Avoid symbol conflicts in MakePackedAPI/MakeUnpackedAPI (#14950)
     add 81056cccd5 [TIR] Preserve existing kTarget function attribute in BindTarget (#14942)
     add 875217c79b [TIR] Restrict tir.transform.InstallDebugSpans to host functions (#14943)
     add 3a15eafd14 [TVMScript] Prevent bool to int conversion in T.Assert condition (#14941)
     add 1aeb34af33 [Codegen][LLVM] Allow void return type from PackedFunc (#14958)
     add 6198c7fd8a [METAL] Fix int8 vectorized cast (#14962)
     add 6eb0779442 [TIR] SplitHostDevice, handle subroutines (#14918)
     add 227c4e87f3 [CI] Update the expected CI jobs list in the update_branch script (#14908)
     add 4f041c96e7 [Runtime,RPC] Use f-strings for string formatting, NFC (#14967)
     add c47d36e77c [Hexagon] Add support for v73, make v68 default (#14965)
     add 5f1421dd0f [Bugfix][PyTorch] Support use_input_stats in instance_norm (#14963)
     add 94c1b89abc [TVMScript][TIR] Parse subroutine calls with no arguments (#14919)
     add 43f06ca42a [TIR] Avoid re-defining `var = arg_var` in ArgBinder (#14952)
     add 4267fbf6a1 [IR][SIBuilder] (#14574)
     add 7131411f0a [Bugfix][TIR][VTA] Update host-side target, even without device func (#14982)
     add c8f97aa877 [TIR] Fix incorrect construction of block frames (#14993)
     add c98e29bbf4 [Bugfix] Make ThreadAllReduce pass compatible with int64 (#14991)
     add ea57778087 [CMAKE] Add a dummy target to defer libtvm dep (#14975)
     add 4eb1a4fdf0 [Hexagon] Update instructions to compile hexagon runtime (#14948)
     add 7fe58a181c [OpenCL] Improve diagnostic message (#14995)
     add 443fa20e20 [RUNTIME] Update Module and Registry to use String Container (#14902)
     add 9da026194f [OpenCL][Adreno] Fix conv2d when output channels < 4 (#14996)
     add 8543cec133 [Hexagon] Remove "c" as aot_host_target tvm/contrib/hexagon/pytest_pl… (#14997)
     add b13be936a0 [DataType] Initial support of fp8 (e4m3/e5m2) (#14863)
     add 1608ca82ce [CMAKE] Introduce dummy build as an option (#15000)
     add ca30b13668 [DOCKER][ADRENO] Docker setup changes for multi user environments (#15003)
     add 7f02606af2 [OPENCL] Always use convert_T for type conversion (#14972)
     new 153f70aa60 [MERGE] Merge main into unity 2023-06-01

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CMakeLists.txt                                     |   21 +-
 apps/dso_plugin_module/plugin_module.cc            |    3 +-
 apps/hexagon_launcher/README.md                    |   14 +-
 apps/ios_rpc/tests/ios_rpc_mobilenet.py            |   27 +-
 apps/ios_rpc/tests/ios_rpc_test.py                 |   13 +-
 apps/topi_recipe/broadcast/test_broadcast_map.py   |   12 +-
 apps/topi_recipe/conv/depthwise_conv2d_test.py     |   13 +-
 apps/topi_recipe/conv/test_conv2d_hwcn_map.py      |    9 +-
 apps/topi_recipe/reduce/test_reduce_map.py         |   10 +-
 apps/topi_recipe/rnn/lstm.py                       |    9 +-
 apps/topi_recipe/rnn/matexp.py                     |   13 +-
 ci/scripts/github/update_branch.py                 |   19 +-
 cmake/config.cmake                                 |    4 +-
 cmake/modules/HexagonSDK.cmake                     |    9 +-
 cmake/modules/LibInfo.cmake                        |    1 +
 docker/install/ubuntu2204_install_llvm.sh          |    2 +-
 docker/install/ubuntu_install_python_package.sh    |    3 +-
 include/tvm/ir/si_builder.h                        |  103 +
 include/tvm/ir/source_map.h                        |   46 +-
 include/tvm/relax/attrs/nn.h                       |    3 +
 include/tvm/relax/transform.h                      |    3 +-
 include/tvm/runtime/data_type.h                    |   34 +
 include/tvm/runtime/logging.h                      |    6 +-
 include/tvm/runtime/module.h                       |   19 +-
 include/tvm/runtime/packed_func.h                  |    2 +-
 include/tvm/runtime/registry.h                     |   12 +-
 include/tvm/runtime/relax_vm/executable.h          |    6 +-
 include/tvm/runtime/vm/executable.h                |    4 +-
 include/tvm/runtime/vm/vm.h                        |    2 +-
 include/tvm/target/target.h                        |   13 +
 include/tvm/tir/op.h                               |    3 +-
 include/tvm/tir/transform.h                        |   52 +
 jvm/core/src/test/scripts/test_add_gpu.py          |    4 +-
 python/gen_requirements.py                         |    1 +
 python/tvm/_ffi/runtime_ctypes.py                  |   22 +
 python/tvm/contrib/clang.py                        |    6 +-
 python/tvm/contrib/cudnn.py                        |    6 +-
 python/tvm/contrib/cutlass/attention_operation.py  |    1 +
 python/tvm/contrib/cutlass/build.py                |   87 +-
 python/tvm/contrib/cutlass/conv2d_operation.py     |   36 +-
 python/tvm/contrib/cutlass/gemm_operation.py       |   22 +-
 python/tvm/contrib/cutlass/gen_conv2d.py           |    2 +-
 python/tvm/contrib/cutlass/gen_gemm.py             |    2 +-
 python/tvm/contrib/cutlass/gen_tensor_op.py        |   54 +-
 python/tvm/contrib/cutlass/library.py              |   23 +-
 python/tvm/contrib/graph_executor.py               |    2 +-
 .../contrib/hexagon/profiling/process_lwp_data.py  |    2 +-
 python/tvm/contrib/hexagon/pytest_plugin.py        |    2 +-
 python/tvm/contrib/hexagon/tools.py                |    8 +-
 python/tvm/contrib/nvcc.py                         |   29 +-
 python/tvm/contrib/peak.py                         |    4 +-
 python/tvm/contrib/pickle_memoize.py               |    4 +-
 python/tvm/contrib/pipeline_executor.py            |   20 +-
 python/tvm/contrib/pipeline_executor_build.py      |   22 +-
 python/tvm/contrib/rocm.py                         |    4 +-
 python/tvm/contrib/sdaccel.py                      |   10 +-
 python/tvm/contrib/sparse.py                       |    8 +-
 python/tvm/contrib/tar.py                          |    2 +-
 python/tvm/contrib/target/coreml.py                |   23 +-
 python/tvm/contrib/target/onnx.py                  |   79 +-
 python/tvm/contrib/target/vitis_ai.py              |    2 +-
 python/tvm/contrib/tf_op/module.py                 |    2 +-
 python/tvm/contrib/utils.py                        |    2 +-
 python/tvm/contrib/xcode.py                        |    6 +-
 python/tvm/ir/__init__.py                          |    1 +
 python/tvm/ir/base.py                              |   17 +
 python/tvm/relax/backend/contrib/__init__.py       |   20 -
 python/tvm/relax/backend/contrib/cutlass.py        |   23 +-
 python/tvm/relax/backend/patterns.py               |   41 +-
 python/tvm/relax/dpl/rewrite.py                    |    9 +-
 .../tvm/relax/frontend/stablehlo}/__init__.py      |    6 +-
 .../frontend/stablehlo/stablehlo_translator.py     |  445 +
 python/tvm/relax/frontend/torch/fx_translator.py   |    9 +-
 python/tvm/relax/op/_op_gradient.py                |   74 +-
 python/tvm/relax/op/nn/nn.py                       |   31 +-
 .../tvm/relax/transform/lazy_transform_params.py   |   72 +-
 python/tvm/relax/transform/legalize_ops/nn.py      |   26 +-
 python/tvm/relax/transform/transform.py            |   21 +-
 python/tvm/relay/__init__.py                       |    1 +
 .../tvm/relay/backend/contrib/ethosu/legalize.py   |   11 +-
 python/tvm/relay/base.py                           |    2 +-
 python/tvm/relay/frontend/keras.py                 |    4 +-
 python/tvm/relay/frontend/pytorch.py               |   26 +-
 python/tvm/relay/op/contrib/ethosu.py              |   27 +-
 python/tvm/relay/op/strategy/adreno.py             |   47 +-
 python/tvm/rpc/base.py                             |    6 +-
 python/tvm/rpc/client.py                           |   23 +-
 python/tvm/rpc/minrpc.py                           |    4 +-
 python/tvm/rpc/proxy.py                            |   22 +-
 python/tvm/rpc/server.py                           |   12 +-
 python/tvm/rpc/testing.py                          |    4 +-
 python/tvm/rpc/tracker.py                          |   16 +-
 python/tvm/runtime/container.py                    |   14 +-
 python/tvm/runtime/executor/aot_executor.py        |    2 +-
 python/tvm/runtime/module.py                       |   38 +-
 python/tvm/runtime/ndarray.py                      |   36 +-
 python/tvm/runtime/object.py                       |    7 +-
 python/tvm/runtime/object_generic.py               |    6 +-
 python/tvm/runtime/vm.py                           |   15 +-
 python/tvm/script/ir_builder/tir/ir.py             |   24 +-
 python/tvm/script/parser/tir/parser.py             |   14 +-
 python/tvm/target/target.py                        |    8 +-
 python/tvm/tir/op.py                               |    2 +-
 python/tvm/tir/transform/transform.py              |   84 +
 src/driver/driver_api.cc                           |    5 +
 src/ir/expr.cc                                     |   14 +-
 src/ir/si_builder.cc                               |  325 +
 src/ir/source_map.cc                               |   60 +
 src/relax/ir/dataflow_matcher.cc                   |   30 +-
 src/relax/op/nn/attention.cc                       |    7 +-
 src/relax/op/nn/attention.h                        |    3 +-
 src/relax/op/tensor/binary.cc                      |    9 +
 src/relax/transform/combine_parallel_matmul.cc     |   41 +-
 src/relax/transform/convert_layout.cc              |    2 +-
 src/relax/transform/eliminate_common_subexpr.cc    |   21 +-
 src/relax/transform/fuse_ops.cc                    |    3 +-
 src/relax/transform/fuse_tir.cc                    |    3 +-
 src/relax/transform/merge_composite_functions.cc   |   11 +
 src/relay/backend/aot_executor_codegen.cc          |    2 +-
 src/relay/backend/build_module.cc                  |    2 +-
 src/relay/backend/contrib/ethosu/source_module.cc  |    8 +-
 src/relay/backend/graph_executor_codegen.cc        |    2 +-
 src/relay/backend/vm/compiler.cc                   |    2 +-
 src/relay/backend/vm/compiler.h                    |    2 +-
 src/relay/printer/model_library_format_printer.cc  |    2 +-
 src/runtime/aot_executor/aot_executor.cc           |    3 +-
 src/runtime/aot_executor/aot_executor.h            |    2 +-
 src/runtime/aot_executor/aot_executor_factory.cc   |    2 +-
 src/runtime/aot_executor/aot_executor_factory.h    |    2 +-
 src/runtime/const_loader_module.cc                 |    2 +-
 src/runtime/contrib/coreml/coreml_runtime.h        |    2 +-
 src/runtime/contrib/coreml/coreml_runtime.mm       |    3 +-
 src/runtime/contrib/dnnl/dnnl_json_runtime.cc      |    2 +-
 src/runtime/contrib/ethosn/ethosn_runtime.cc       |    5 +-
 src/runtime/contrib/ethosn/ethosn_runtime.h        |    4 +-
 src/runtime/contrib/json/json_runtime.h            |    4 +-
 src/runtime/contrib/libtorch/libtorch_runtime.cc   |    4 +-
 src/runtime/contrib/onnx/onnx_module.cc            |    6 +-
 src/runtime/contrib/tflite/tflite_runtime.cc       |    3 +-
 src/runtime/contrib/tflite/tflite_runtime.h        |    2 +-
 src/runtime/contrib/vitis_ai/vitis_ai_runtime.cc   |    3 +-
 src/runtime/contrib/vitis_ai/vitis_ai_runtime.h    |    2 +-
 src/runtime/cuda/cuda_module.cc                    |   11 +-
 .../cuda_graph/graph_runtime_cuda_graph.cc         |    4 +-
 .../graph_executor/debug/graph_executor_debug.cc   |    2 +-
 .../graph_executor/debug/graph_executor_debug.h    |    2 +-
 src/runtime/graph_executor/graph_executor.cc       |    3 +-
 src/runtime/graph_executor/graph_executor.h        |    2 +-
 .../graph_executor/graph_executor_factory.cc       |    2 +-
 .../graph_executor/graph_executor_factory.h        |    2 +-
 src/runtime/hexagon/README.md                      |    9 +-
 src/runtime/hexagon/hexagon_module.cc              |    6 +-
 src/runtime/hexagon/hexagon_module.h               |    6 +-
 src/runtime/hexagon/rpc/simulator/session.cc       |   30 +
 src/runtime/library_module.cc                      |    5 +-
 src/runtime/metadata.cc                            |    2 +-
 src/runtime/metal/metal_module.mm                  |    9 +-
 src/runtime/module.cc                              |   15 +-
 src/runtime/opencl/opencl_common.h                 |    8 +-
 src/runtime/opencl/opencl_module.cc                |   14 +-
 src/runtime/opencl/opencl_module_spirv.cc          |    6 +-
 src/runtime/opencl/sdaccel/sdaccel_module.cc       |    2 +-
 src/runtime/pipeline/pipeline_executor.cc          |    2 +-
 src/runtime/pipeline/pipeline_executor.h           |    2 +-
 src/runtime/registry.cc                            |   18 +-
 src/runtime/relax_vm/executable.cc                 |    6 +-
 src/runtime/relax_vm/lm_support.cc                 |   35 +
 src/runtime/relax_vm/ndarray_cache_support.cc      |    4 +-
 src/runtime/relax_vm/vm.cc                         |    6 +-
 src/runtime/rocm/rocm_module.cc                    |    9 +-
 src/runtime/rpc/rpc_module.cc                      |    4 +-
 src/runtime/stackvm/stackvm_module.cc              |    9 +-
 src/runtime/static_library.cc                      |    4 +-
 src/runtime/vm/executable.cc                       |    6 +-
 src/runtime/vm/profiler/vm.cc                      |    2 +-
 src/runtime/vm/profiler/vm.h                       |    2 +-
 src/runtime/vm/vm.cc                               |    3 +-
 src/runtime/vulkan/vulkan_module.cc                |    2 +-
 src/runtime/vulkan/vulkan_wrapped_func.cc          |    6 +-
 src/runtime/vulkan/vulkan_wrapped_func.h           |    6 +-
 src/script/printer/tir/block.cc                    |    3 +-
 src/script/printer/tir/buffer.cc                   |   29 +-
 src/script/printer/tir/function.cc                 |    6 +-
 src/script/printer/tir/stmt.cc                     |   24 +-
 src/script/printer/tir/utils.h                     |   28 +-
 src/support/ffi_testing.cc                         |    4 +-
 src/support/libinfo.cc                             |    1 +
 src/support/scalars.h                              |   12 +
 src/target/llvm/codegen_amdgpu.cc                  |   10 +-
 src/target/llvm/codegen_cpu.cc                     |   51 +-
 src/target/llvm/codegen_cpu.h                      |    2 +-
 src/target/llvm/codegen_hexagon.cc                 |    4 +-
 src/target/llvm/codegen_llvm.cc                    |  110 +-
 src/target/llvm/codegen_llvm.h                     |   61 +-
 src/target/llvm/codegen_nvptx.cc                   |   14 +-
 src/target/llvm/llvm_module.cc                     |   33 +-
 src/target/opt/build_cuda_on.cc                    |    4 +-
 src/target/source/codegen_aocl.cc                  |    2 +-
 src/target/source/codegen_cuda.cc                  |   17 +
 src/target/source/codegen_cuda.h                   |    5 +-
 src/target/source/codegen_metal.cc                 |    7 +-
 src/target/source/codegen_opencl.cc                |   12 +-
 src/target/source/codegen_vhls.cc                  |    5 +-
 src/target/source/codegen_webgpu.cc                |   28 +-
 src/target/source/codegen_webgpu.h                 |    8 +
 src/target/source/interface_c.cc                   |    4 +-
 src/target/source/source_module.cc                 |   28 +-
 src/target/spirv/spirv_utils.cc                    |    2 +-
 src/target/target.cc                               |   15 +
 src/tir/ir/stmt.cc                                 |    3 +
 src/tir/op/op.cc                                   |    7 +
 src/tir/schedule/state.cc                          |    2 +-
 src/tir/transforms/annotate_device_regions.cc      |   81 +
 src/tir/transforms/combine_context_call.cc         |    7 +-
 src/tir/transforms/default_gpu_schedule.cc         |   10 +
 src/tir/transforms/dtype_conversion.cc             |  101 +
 src/tir/transforms/dtype_conversion.h              |  165 +
 src/tir/transforms/install_debug_spans.cc          |   36 +-
 src/tir/transforms/ir_utils.cc                     |   32 +-
 src/tir/transforms/ir_utils.h                      |   13 +
 src/tir/transforms/lower_device_kernel_launch.cc   |  305 +
 src/tir/transforms/lower_thread_allreduce.cc       |    3 +-
 src/tir/transforms/make_packed_api.cc              |  160 +-
 src/tir/transforms/make_unpacked_api.cc            |  113 +-
 src/tir/transforms/primfunc_utils.cc               |   30 +-
 src/tir/transforms/split_host_device.cc            |  272 +-
 ...6_legalize.cc => unsupported_dtype_legalize.cc} |  264 +-
 tests/cpp/si_builder_test.cc                       |  399 +
 tests/lint/check_file_type.py                      |    2 +
 tests/python/ci/test_ci.py                         |   49 +-
 tests/python/contrib/test_ethosu/test_codegen.py   |   32 +
 tests/python/contrib/test_ethosu/test_legalize.py  |  211 +
 tests/python/contrib/test_hexagon/README.md        |    2 +-
 tests/python/frontend/keras/test_forward.py        |    2 +-
 tests/python/frontend/pytorch/test_forward.py      |   29 +-
 tests/python/integration/test_ewise.py             |    5 +-
 tests/python/integration/test_ewise_fpga.py        |    3 +-
 tests/python/relax/test_codegen_cutlass.py         |  235 +-
 tests/python/relax/test_dataflow_pattern.py        |  112 +-
 tests/python/relax/test_frontend_from_fx.py        |   36 +
 tests/python/relax/test_frontend_stablehlo.py      |  355 +
 tests/python/relax/test_op_gradient_numeric.py     |  104 +-
 .../test_transform_combine_parallel_matmul.py      |   52 +
 .../python/relax/test_transform_convert_layout.py  |   45 +-
 tests/python/relax/test_transform_cse.py           |   32 +-
 tests/python/relax/test_transform_fuse_tir.py      |   76 +
 tests/python/relax/test_transform_gradient.py      |    6 +-
 .../relax/test_transform_lazy_transform_params.py  |   66 +-
 .../python/relax/test_transform_legalize_ops_nn.py |  105 +-
 .../test_transform_merge_composite_functions.py    |  110 +
 .../opencl_texture/test_conv2d_nchw_texture.py     |   30 +
 .../opencl_texture/test_conv2d_nhwc_texture.py     |   30 +
 .../test_depthwise_conv2d_nchw_texture.py          |   31 +
 .../test_depthwise_conv2d_nhwc_texture.py          |   31 +
 tests/python/tir/test_debug_info.py                |   50 +-
 tests/python/unittest/test_datatype_nv_fp8.py      |  104 +
 tests/python/unittest/test_target_codegen_llvm.py  |   89 +-
 tests/python/unittest/test_target_codegen_metal.py |   30 +-
 .../python/unittest/test_target_codegen_opencl.py  |   80 +-
 .../unittest/test_target_texture_codegen_opencl.py |   38 +-
 .../unittest/test_te_schedule_bound_inference.py   |   56 +-
 .../test_tir_transform_annotate_device_regions.py  |   58 +
 .../unittest/test_tir_transform_bf16_legalize.py   |    5 +-
 .../test_tir_transform_combine_context_call.py     |  102 +-
 .../unittest/test_tir_transform_convert_ssa.py     |  253 +
 .../test_tir_transform_device_kernel_launch.py     |  193 +
 .../unittest/test_tir_transform_fp8_legalize.py    |  224 +
 .../python/unittest/test_tir_transform_helpers.py  |  112 +
 .../test_tir_transform_inject_ptx_async_copy.py    |    4 +-
 .../python/unittest/test_tir_transform_ir_utils.py |   40 -
 .../test_tir_transform_lower_warp_memory.py        |   37 +-
 .../unittest/test_tir_transform_make_packed_api.py |  128 +-
 .../test_tir_transform_make_unpacked_api.py        |  158 +-
 .../test_tir_transform_split_host_device.py        |  129 +-
 .../unittest/test_tir_transform_thread_sync.py     |    5 +-
 .../test_transform_default_gpu_schedule.py         |   35 +
 .../python/unittest/test_tvmscript_printer_tir.py  |    7 +-
 tests/python/unittest/test_tvmscript_roundtrip.py  |   80 +
 .../python/unittest/test_tvmscript_syntax_sugar.py |   35 +
 tests/scripts/setup-adreno-env.sh                  |   13 +-
 tests/scripts/task_config_build_cpu.sh             |    3 -
 tests/scripts/task_config_build_i386.sh            |    3 -
 tests/scripts/task_python_adreno.sh                |   13 +-
 tests/scripts/task_python_vta_fsim.sh              |    3 +
 web/.gitignore                                     |    1 +
 web/Makefile                                       |   12 +-
 web/apps/node/example.js                           |    5 +-
 web/emcc/decorate_as_wasi.py                       |   20 +-
 web/emcc/webgpu_runtime.cc                         |    6 +-
 web/package-lock.json                              | 8609 +++-----------------
 web/package.json                                   |   37 +-
 web/{rollup.config.js => rollup.config.mjs}        |   22 +-
 web/src/compact.ts                                 |   10 +
 web/src/index.ts                                   |    8 +-
 web/src/memory.ts                                  |    3 +-
 web/src/runtime.ts                                 |   82 +-
 web/src/tvmjs_runtime_wasi.d.ts                    |    8 +
 web/src/webgpu.ts                                  |   10 +-
 web/tests/node/test_module_load.js                 |    3 +-
 web/tests/node/test_ndarray.js                     |    7 +-
 web/tests/node/test_object.js                      |    5 +-
 web/tests/node/test_packed_func.js                 |    7 +-
 web/tests/node/test_relax_vm.js                    |    4 +-
 web/tsconfig.json                                  |    5 +-
 web/typedoc.json                                   |    5 +-
 305 files changed, 9012 insertions(+), 9375 deletions(-)
 create mode 100644 include/tvm/ir/si_builder.h
 delete mode 100644 python/tvm/relax/backend/contrib/__init__.py
 copy {apps/uma/_template => python/tvm/relax/frontend/stablehlo}/__init__.py (86%)
 create mode 100644 python/tvm/relax/frontend/stablehlo/stablehlo_translator.py
 create mode 100644 src/ir/si_builder.cc
 create mode 100644 src/tir/transforms/annotate_device_regions.cc
 create mode 100644 src/tir/transforms/dtype_conversion.cc
 create mode 100644 src/tir/transforms/dtype_conversion.h
 create mode 100644 src/tir/transforms/lower_device_kernel_launch.cc
 rename src/tir/transforms/{bf16_legalize.cc => unsupported_dtype_legalize.cc} (69%)
 create mode 100644 tests/cpp/si_builder_test.cc
 create mode 100644 tests/python/relax/test_frontend_stablehlo.py
 create mode 100644 tests/python/unittest/test_datatype_nv_fp8.py
 create mode 100644 tests/python/unittest/test_tir_transform_annotate_device_regions.py
 create mode 100644 tests/python/unittest/test_tir_transform_convert_ssa.py
 create mode 100644 tests/python/unittest/test_tir_transform_device_kernel_launch.py
 create mode 100644 tests/python/unittest/test_tir_transform_fp8_legalize.py
 delete mode 100644 tests/python/unittest/test_tir_transform_ir_utils.py
 rename web/{rollup.config.js => rollup.config.mjs} (71%)
 create mode 100644 web/src/tvmjs_runtime_wasi.d.ts
