This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch unity-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git


    from d6af4af23e [MERGE] Merge main into unity 2023-04-23
     add 129fd8f84f [Unity] Reduce cast to fp32 for constant input in AMP (#14679)
     add e47bf761d4 [Unity] Smart parameter fetching (#14708)
     add d468419ea0 [Unity][Training] Trainer and SetupTrainer (#14706)
     add 24480ea573 [Unity][CUTLASS] Fixed memory leak in attention kernel offload (#14723)
     add 84e286be71 [Unity] Cache ndarray-cache.json (#14722)
     add 9bff70ab06 [Unity] Allow KVCache Access without Shape (#14726)
     add 94e3d512dd [Unity][CUTLASS] Fixed stacked attention offload when QKV reshape uses the same shape expression (#14728)
     add 9c36056a12 [Unity] Allow modifying function signature by AMP to accept fp16 inputs  (#14719)
     add 2232207afd [Unity] Fix ForceNarrowIndexToI32 so it ignores i16 (#14733)
     add b40391961a [Unity] Add system lib build option to relax (#14734)
     add 9914a77b0f [Unity] Process all Relax functions in CompositeFunctionAnnotator (#14736)
     add 126f8d082b [Unity] Add options to MS tuning pass to enable more fine-grained tuning (#14730)
     add dd02791e5a [Uniy][Op] Expand support of attention bias layout (#14737)
     add 65d99bd751 [Unity][CUTLASS] Add layer norm support (#14731)
     add 2bd506fa92 [Unity][BYOC] Support implicit attention patterns (#14744)
     add 107ef2b65e [Unity][Relax] Memory planning for call_tir_dyn (#14750)
     add 7a13d1e807 Fix super().visit_var_binding_ in PyExprVisitor and PyExprMutator (#14754)
     add f9b07ae9ed [Unity][VM] `kill_tensor` and `kill_storage` releasing NDArray in VM at runtime (#14753)
     add a0e0f275ab [Unity] Limit number of characters in logger names (#14752)
     add bda94c3544 [Unity] Fix FX translator no output issue (#14761)
     add e10ddf110d [Unity] Enhance CopyWithNewVars utility (#14764)
     add 56516c2bc5 [Unity][CI] Use the upgraded images (#14768)
     add 40fab65be3 [Unity] Update LM Sample builtins (#14793)
     add 98a25dfec9 [Unity][Pass] Lazy transform params (#14769)
     add bf11516632 [Unity][Training] More Relax operators gradient supported (#14777)
     add 1a171396fa [Torch] fix typo in new_full (#14699)
     add f5ab3f05eb [TIR] [Analysis] Calculate allocated memory at module level (#14711)
     add 6314b256cd [ci] Add PLATFORM env var to builds (#14713)
     add 486c49895d [TIR] [Hexagon] Add get_vtcm_allocation_sizes with lowering (#14720)
     add 1a70083711 [microNPU][ETHOSU] Softmax int8 legalization support (#14629)
     add d32dea800b [METAL] Update metal runtime to directly store kernel map (#14727)
     add f9ae487aee [Keras][Bugfix] fix a bug about alpha attribute in LeakyReLU which lead to passes conflict (#14707)
     add c254b75705 [AutoTVM] Added Droplet algorithm in TVM (#14683)
     add 30b34d2521 [TIR] More flexible buffer compaction (#14021)
     add e2e1696c74 [Relay] Handle pad value coming from Tensor instead of scalar (#14735)
     add a8d26f4452 [Relay][Bugfix] Fix stride in LpPool for default (#14740)
     add 9530a8de1e [PyTorch]  Add aten::new_zeros (#14747)
     add c1e88b365a [Relay] Simplify the square of a binomial (#14580)
     add d1e1b4c8b2 [bugfix][relay] Fix alpha attribute with None in ELU  (#14742)
     add cc8cce50fa [TOPI] Vectorize depthwise conv2d output operator (#14519)
     add 4d37a0a078 [CI] Upgrade CI (#14635)
     add f1033adf04 [TIR] Keep block annotations from tensorization (#14741)
     add 706a5b2dd5 [Target][Minor] Add A6000 Target Tag (#14762)
     add eca6edf7fa [CI] sccache support while using ci.py under multi user environments (#14666)
     add 133807fb77 [COMMUNITY] Add new key for release signing (#14772)
     add aa7d2bff6b [CI] Modify test cases to accommodate the CI upgrades (#14651)
     add 129492650e [Arith] Implement statistics counters for RewriteSimplifier (#14532)
     add ddd2e81e49 [TVMScript] Preserve LetStmt of constants (#14531)
     add a954742fba [TIR][Driver] Move ShouldAnnotateEntryFunc logic into transform (#14562)
     add 6c53d85f79 [CI] Fix doc deploy issue (#14780)
     add ca7c3d1066 [Runtime] Added __str__ for tvm._ffi.runtime_ctypes.TVMArray (#14775)
     add 298a0a4834 [Bugfix][MetaSchedule] RPC port needs to be an integer (#14781)
     add c224a12e47 [Bugfix][Codegen][CUDA] Wrong casting in ASM (#14782)
     add 20ded77c25 fix: use arm on demand instead of spot (#14779)
     add bf1be357ae [MetaSchedule] Clear screen only when specified (#14785)
     add 571eff9223 [BugFix][Relay] fix `scatter_nd` type relation (#14773)
     add c265cdae97 [TVMScript] Add `__name__` attr for parsed PrimFunc and IRModule (#14786)
     add f9890335da More clear ref of thirdparty license (#14790)
     add 01324ef97c [MetaSchedule] Add an API to dump a pruned database (#14783)
     add e08caef289 [MetaSchedule] Enhance `tune_tir` to tune IRModule of TIR Collections (#14784)
     new f7eed214d1 [MERGE] Merge main into unity 2023-05-07

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 KEYS                                               |   59 +
 LICENSE                                            |    1 +
 apps/bundle_deploy/demo.cc                         |    2 +-
 apps/bundle_deploy/test.cc                         |    2 +-
 apps/microtvm/reference-vm/base-box-tool.py        |    2 +-
 .../reference-vm/base-box/base_box_setup_common.sh |    6 +-
 .../src/host_driven/fvp/semihost.c                 |    4 +-
 ci/jenkins/docker-images.ini                       |   20 +-
 ci/jenkins/generated/arm_jenkinsfile.groovy        |   27 +-
 ci/jenkins/generated/cortexm_jenkinsfile.groovy    |   35 +-
 ci/jenkins/generated/cpu_jenkinsfile.groovy        |   23 +-
 ci/jenkins/generated/docker_jenkinsfile.groovy     |    4 +-
 ci/jenkins/generated/gpu_jenkinsfile.groovy        |   37 +-
 ci/jenkins/generated/hexagon_jenkinsfile.groovy    |   27 +-
 ci/jenkins/generated/i386_jenkinsfile.groovy       |   17 +-
 ci/jenkins/generated/lint_jenkinsfile.groovy       |    4 +-
 .../generated/minimal_cross_isa_jenkinsfile.groovy |   13 +-
 ci/jenkins/generated/minimal_jenkinsfile.groovy    |   13 +-
 ci/jenkins/generated/riscv_jenkinsfile.groovy      |   17 +-
 ci/jenkins/generated/wasm_jenkinsfile.groovy       |   11 +-
 ci/jenkins/templates/arm_jenkinsfile.groovy.j2     |    1 +
 ci/jenkins/templates/cortexm_jenkinsfile.groovy.j2 |    1 +
 ci/jenkins/templates/cpu_jenkinsfile.groovy.j2     |    1 +
 ci/jenkins/templates/gpu_jenkinsfile.groovy.j2     |    1 +
 ci/jenkins/templates/hexagon_jenkinsfile.groovy.j2 |    1 +
 ci/jenkins/templates/i386_jenkinsfile.groovy.j2    |    1 +
 .../minimal_cross_isa_jenkinsfile.groovy.j2        |    1 +
 ci/jenkins/templates/minimal_jenkinsfile.groovy.j2 |    1 +
 ci/jenkins/templates/riscv_jenkinsfile.groovy.j2   |    5 +-
 ci/jenkins/templates/utils/base.groovy.j2          |    2 +-
 ci/jenkins/templates/utils/macros.j2               |   11 +-
 ci/jenkins/templates/wasm_jenkinsfile.groovy.j2    |    1 +
 ci/jenkins/unity_jenkinsfile.groovy                |    6 +-
 ci/scripts/jenkins/s3.py                           |    5 +-
 cmake/modules/contrib/VitisAI.cmake                |    2 +-
 docker/Dockerfile.ci_arm                           |    6 +-
 docker/Dockerfile.ci_cortexm                       |   10 +-
 docker/Dockerfile.ci_cpu                           |   15 +-
 docker/Dockerfile.ci_gpu                           |   10 +-
 docker/Dockerfile.ci_i386                          |   20 +-
 docker/Dockerfile.ci_jekyll                        |    2 +-
 docker/Dockerfile.ci_lint                          |   15 +-
 docker/Dockerfile.ci_minimal                       |   18 +-
 docker/Dockerfile.ci_riscv                         |   10 +-
 docker/Dockerfile.ci_wasm                          |   10 +-
 docker/Dockerfile.conda_cpu                        |    2 +-
 docker/Dockerfile.demo_android                     |   12 +-
 docker/Dockerfile.demo_opencl                      |    2 +-
 docker/Dockerfile.demo_rocm                        |   10 +-
 docker/Dockerfile.demo_vitis_ai                    |    4 +-
 docker/Dockerfile.docs                             |    2 +-
 ..._install_core.sh => ubuntu2004_install_core.sh} |    5 +-
 docker/install/ubuntu2004_install_llvm.sh          |   35 -
 docker/install/ubuntu2004_install_python.sh        |    2 +-
 ...age.sh => ubuntu2004_install_python_package.sh} |    1 -
 ...nstall_redis.sh => ubuntu2004_install_redis.sh} |    3 +-
 ...=> ubuntu2204_install_aarch64_cross_compile.sh} |   14 +-
 ..._install_llvm.sh => ubuntu2204_install_llvm.sh} |   38 +-
 ...l_llvm.sh => ubuntu2204_manual_install_llvm.sh} |    2 +-
 ...buntu2204_manual_install_llvm_cross_aarch64.sh} |    2 +-
 docker/install/ubuntu_install_boost.sh             |    6 +-
 ...ng_format.sh => ubuntu_install_clang_format.sh} |    6 +-
 docker/install/ubuntu_install_cmake_source.sh      |    3 +-
 docker/install/ubuntu_install_dgl.sh               |    2 +-
 docker/install/ubuntu_install_emscripten.sh        |    4 +-
 .../install/ubuntu_install_ethosn_driver_stack.sh  |   12 +-
 .../install/ubuntu_install_ethosu_driver_stack.sh  |    4 +-
 docker/install/ubuntu_install_golang.sh            |    2 +-
 docker/install/ubuntu_install_jax.sh               |   10 +-
 docker/install/ubuntu_install_libtorch.sh          |    4 +-
 docker/install/ubuntu_install_llvm.sh              |   26 +-
 docker/install/ubuntu_install_llvm_from_source.sh  |    2 +-
 docker/install/ubuntu_install_onnx.sh              |    4 +-
 docker/install/ubuntu_install_python.sh            |    8 +-
 docker/install/ubuntu_install_rocm.sh              |    6 +-
 .../install/ubuntu_install_tensorflow_aarch64.sh   |    1 +
 docker/install/ubuntu_install_tflite.sh            |    2 +-
 docker/install/ubuntu_install_vela.sh              |    2 +-
 docker/install/ubuntu_install_verilator.sh         |    2 +-
 docker/install/ubuntu_install_zephyr.sh            |    6 +-
 docs/Doxyfile                                      |   30 +-
 docs/legacy_redirect.py                            |    4 -
 gallery/how_to/deploy_models/deploy_ssd_gluoncv.py |  132 --
 .../tune_conv2d_layer_cuda.py                      |    8 +-
 .../tune_with_autoscheduler/tune_sparse_x86.py     |   77 +-
 .../how_to/tune_with_autotvm/tune_conv2d_cuda.py   |   22 +-
 gallery/how_to/work_with_relay/build_gcn.py        |   66 +-
 include/tvm/arith/analyzer.h                       |   21 +
 include/tvm/auto_scheduler/transform_step.h        |    2 -
 include/tvm/ir/attrs.h                             |    1 -
 include/tvm/ir/transform.h                         |    2 -
 include/tvm/meta_schedule/database.h               |    8 +-
 include/tvm/node/reflection.h                      |    1 -
 include/tvm/node/structural_hash.h                 |    2 -
 include/tvm/relax/transform.h                      |    5 +-
 include/tvm/runtime/container/adt.h                |    3 -
 include/tvm/runtime/crt/microtvm_rpc_server.h      |    1 +
 include/tvm/runtime/relax_vm/bytecode.h            |    4 +-
 include/tvm/runtime/vm/vm.h                        |    1 -
 include/tvm/target/target.h                        |    1 -
 include/tvm/target/virtual_device.h                |    1 -
 include/tvm/te/schedule.h                          |    1 -
 include/tvm/tir/analysis.h                         |   28 +-
 include/tvm/tir/data_type_rewriter.h               |    3 +
 include/tvm/tir/op.h                               |   16 +-
 include/tvm/tir/transform.h                        |    5 +-
 include/tvm/topi/transform.h                       |    1 -
 licenses/{LICENSE.libcrc.txt => LICENSE.cnpy.txt}  |   12 +-
 python/tvm/_ffi/_ctypes/packed_func.py             |    3 +-
 python/tvm/_ffi/runtime_ctypes.py                  |   13 +
 python/tvm/arith/analyzer.py                       |   10 +
 python/tvm/autotvm/database.py                     |    2 +-
 python/tvm/autotvm/env.py                          |    1 +
 python/tvm/autotvm/graph_tuner/pbqp_tuner.py       |    2 +-
 python/tvm/autotvm/task/space.py                   |    2 +-
 python/tvm/autotvm/task/topi_integration.py        |    4 +
 python/tvm/autotvm/tophub.py                       |    2 +-
 python/tvm/autotvm/tuner/__init__.py               |    1 +
 python/tvm/autotvm/tuner/droplet_turner.py         |  127 ++
 python/tvm/contrib/cc.py                           |   30 +
 python/tvm/contrib/cutlass/attention_operation.py  |   38 +-
 python/tvm/contrib/cutlass/build.py                |   25 +-
 python/tvm/contrib/cutlass/gemm_operation.py       |    2 +-
 python/tvm/contrib/cutlass/gen_tensor_op.py        |   28 +-
 python/tvm/contrib/cutlass/layer_norm_operation.py |   44 +
 python/tvm/contrib/debugger/debug_executor.py      |    2 +-
 python/tvm/contrib/debugger/debug_result.py        |    1 +
 python/tvm/contrib/download.py                     |    1 +
 .../contrib/ethosu/cascader/cascader_options.py    |    1 +
 python/tvm/contrib/hexagon/build.py                |    1 +
 python/tvm/contrib/hexagon/hexagon_profiler.py     |    1 +
 python/tvm/contrib/hexagon/session.py              |    5 +-
 python/tvm/contrib/hexagon/tools.py                |    2 +-
 python/tvm/contrib/pipeline_executor_build.py      |    1 +
 python/tvm/contrib/relay_viz/interface.py          |    2 +-
 python/tvm/contrib/tedd.py                         |    2 +-
 python/tvm/contrib/torch/as_torch.py               |    3 +-
 python/tvm/driver/tvmc/main.py                     |    4 +-
 python/tvm/driver/tvmc/micro.py                    |    1 +
 python/tvm/driver/tvmc/model.py                    |    1 +
 python/tvm/driver/tvmc/runner.py                   |    3 +-
 python/tvm/ir/function.py                          |    2 +
 python/tvm/meta_schedule/database/database.py      |   12 +
 .../feature_extractor/per_store_feature.py         |    1 +
 python/tvm/meta_schedule/logging.py                |    3 +-
 python/tvm/meta_schedule/profiler.py               |    1 +
 python/tvm/meta_schedule/relax_integration.py      |   29 +-
 python/tvm/meta_schedule/runner/config.py          |   11 +-
 .../space_generator/space_generator.py             |    2 +
 .../meta_schedule/testing/dataset_extract_tasks.py |    2 +-
 .../testing/dataset_sample_candidates.py           |    2 +-
 python/tvm/meta_schedule/testing/torchbench/run.py |    1 +
 .../tvm/meta_schedule/testing/torchbench/utils.py  |    1 +
 python/tvm/meta_schedule/testing/tune_te.py        |    1 -
 python/tvm/meta_schedule/tir_integration.py        |   67 +-
 python/tvm/micro/debugger.py                       |    1 +
 python/tvm/micro/model_library_format.py           |    1 +
 python/tvm/micro/project_api/client.py             |    1 +
 python/tvm/micro/project_api/server.py             |    3 +-
 python/tvm/relax/backend/contrib/cutlass.py        |   56 +-
 python/tvm/relax/backend/patterns.py               |  182 +-
 python/tvm/relax/frontend/torch/fx_translator.py   |    3 +-
 python/tvm/relax/op/_op_gradient.py                |  125 +-
 python/tvm/relax/op/binary.py                      |    5 +-
 python/tvm/relax/op/nn/nn.py                       |    4 +-
 python/tvm/relax/op/vm/vm.py                       |   17 +
 python/tvm/relax/training/__init__.py              |    5 +-
 python/tvm/relax/training/setup_trainer.py         |  214 +++
 python/tvm/relax/training/trainer.py               |  393 +++++
 python/tvm/relax/transform/__init__.py             |    1 +
 .../tvm/relax/transform/lazy_transform_params.py   |  206 +++
 python/tvm/relax/transform/legalize_ops/nn.py      |    4 -
 python/tvm/relax/transform/transform.py            |   22 +-
 python/tvm/relax/vm_build.py                       |   24 +-
 .../tvm/relay/backend/contrib/ethosu/__init__.py   |    1 +
 .../tvm/relay/backend/contrib/ethosu/legalize.py   |    5 +-
 .../backend/contrib/ethosu/softmax_rewriter.py     |  516 ++++++
 .../relay/backend/contrib/ethosu/te/depthwise.py   |    2 +-
 .../backend/contrib/ethosu/te/unary_elementwise.py |    3 +-
 .../backend/contrib/ethosu/tir/convolution.py      |    2 +-
 .../tvm/relay/backend/contrib/ethosu/tir/passes.py |   10 +-
 .../backend/contrib/ethosu/tir_to_cs_translator.py |   17 +-
 python/tvm/relay/backend/contrib/ethosu/util.py    |    1 +
 python/tvm/relay/backend/contrib/uma/api/utils.py  |    1 +
 python/tvm/relay/backend/contrib/uma/backend.py    |    3 +-
 python/tvm/relay/backend/interpreter.py            |    2 +-
 python/tvm/relay/dataflow_pattern/__init__.py      |    2 +-
 python/tvm/relay/debug.py                          |    2 +-
 python/tvm/relay/frontend/caffe.py                 |    4 +-
 python/tvm/relay/frontend/keras.py                 |    4 +
 python/tvm/relay/frontend/mxnet.py                 |    1 +
 python/tvm/relay/frontend/nnvm_common.py           |    2 +-
 python/tvm/relay/frontend/oneflow.py               |    2 +-
 python/tvm/relay/frontend/onnx.py                  |    4 +-
 python/tvm/relay/frontend/paddlepaddle.py          |    4 +-
 python/tvm/relay/frontend/pytorch.py               |   26 +-
 python/tvm/relay/frontend/tensorflow2.py           |    3 +-
 python/tvm/relay/frontend/tflite.py                |    3 +-
 python/tvm/relay/frontend/tflite_flexbuffer.py     |    1 +
 python/tvm/relay/op/contrib/clml.py                |    2 +-
 python/tvm/relay/op/contrib/dnnl.py                |    2 +-
 python/tvm/relay/op/contrib/ethosu.py              |   64 +-
 python/tvm/relay/testing/inception_v3.py           |    2 +-
 python/tvm/relay/testing/tflite.py                 |    4 +-
 python/tvm/relay/transform/recast.py               |    1 +
 python/tvm/rpc/client.py                           |    1 +
 python/tvm/script/parser/ir/entry.py               |    4 +-
 python/tvm/script/parser/tir/entry.py              |    4 +-
 python/tvm/script/parser/tir/parser.py             |    4 +-
 python/tvm/te/hybrid/utils.py                      |    1 +
 python/tvm/testing/aot.py                          |    3 +-
 python/tvm/testing/plugin.py                       |    1 +
 python/tvm/testing/utils.py                        |    4 +-
 python/tvm/tir/analysis/analysis.py                |   34 +-
 python/tvm/tir/function.py                         |    1 +
 python/tvm/tir/schedule/_type_checker.py           |    1 +
 python/tvm/tir/schedule/state.py                   |    1 +
 python/tvm/tir/schedule/testing.py                 |    1 +
 python/tvm/tir/stmt.py                             |    1 +
 python/tvm/tir/transform/transform.py              |   12 +-
 python/tvm/topi/adreno/utils.py                    |    2 +-
 python/tvm/topi/arm_cpu/bitserial_dense.py         |    2 +-
 python/tvm/topi/arm_cpu/depthwise_conv2d.py        |   11 +-
 .../topi/arm_cpu/mprofile/dsp/micro_kernel/gemm.py |    2 +-
 python/tvm/topi/cuda/dense.py                      |    3 +-
 python/tvm/topi/cuda/dense_tensorcore.py           |    3 +-
 python/tvm/topi/hexagon/injective.py               |    4 +-
 python/tvm/topi/hexagon/pad.py                     |    3 +-
 python/tvm/topi/hexagon/slice_ops/avg_pool2d.py    |    2 +-
 python/tvm/topi/hexagon/slice_ops/max_pool2d.py    |    2 +-
 python/tvm/topi/hexagon/tensor_intrin.py           |    2 +-
 python/tvm/topi/hexagon/utils.py                   |   52 +-
 python/tvm/topi/nn/depthwise_conv2d.py             |    2 +-
 python/tvm/topi/nn/softmax.py                      |    2 +-
 python/tvm/topi/x86/bitserial_dense.py             |    2 +-
 python/tvm/topi/x86/concat.py                      |    2 +-
 rust/.rustfmt.toml                                 |    2 +-
 src/arith/analyzer.cc                              |    7 +
 src/arith/rewrite_simplify.cc                      |   35 +
 src/arith/rewrite_simplify.h                       |   62 +-
 src/driver/driver_api.cc                           |   11 +-
 src/meta_schedule/database/database.cc             |   25 +
 src/meta_schedule/database/json_database.cc        |    4 -
 src/meta_schedule/database/memory_database.cc      |    4 -
 src/meta_schedule/module_equality.cc               |    3 +
 src/meta_schedule/module_equality.h                |    1 +
 src/meta_schedule/postproc/verify_vtcm_limit.cc    |   44 +-
 src/meta_schedule/utils.h                          |   16 +-
 src/relax/backend/vm/codegen_vm.cc                 |   16 +-
 src/relax/backend/vm/codegen_vm_tir.cc             |   22 +
 src/relax/backend/vm/vm_builtin_lower.cc           |   17 +-
 src/relax/ir/py_expr_functor.cc                    |   16 +-
 src/relax/op/nn/attention.cc                       |   37 +-
 src/relax/op/op.cc                                 |   14 +
 src/relax/transform/fold_constant.cc               |    3 +-
 src/relax/transform/fuse_ops.cc                    |   19 +-
 src/relax/transform/meta_schedule.cc               |   22 +-
 src/relax/transform/static_plan_block_memory.cc    |    6 +-
 src/relax/transform/to_mixed_precision.cc          |  119 +-
 src/relax/transform/utils.h                        |   31 +
 src/relax/utils.cc                                 |   30 +-
 src/relay/op/nn/pad.cc                             |    2 +-
 src/relay/op/tensor/transform.cc                   |    6 +-
 src/relay/transforms/simplify_expr.cc              |   82 +
 src/runtime/crt/common/crt_runtime_api.c           |    6 +-
 src/runtime/metal/metal_module.h                   |   11 +-
 src/runtime/metal/metal_module.mm                  |  105 +-
 src/runtime/relax_vm/builtin.cc                    |    7 +-
 src/runtime/relax_vm/lm_support.cc                 |  111 +-
 src/runtime/relax_vm/ndarray_cache_support.cc      |    3 +-
 src/runtime/relax_vm/vm.cc                         |    6 +-
 src/target/opt/build_metal_off.cc                  |    7 +-
 src/target/source/codegen_cuda.cc                  |    4 +-
 src/target/source/codegen_metal.cc                 |   31 +-
 src/target/tag.cc                                  |    7 +
 src/tir/analysis/block_access_region_detector.cc   |   10 +-
 src/tir/analysis/calculate_allocated_memory.cc     |   69 +-
 src/tir/analysis/control_flow_graph.cc             |    7 +-
 src/tir/analysis/control_flow_graph.h              |    6 +-
 src/tir/ir/data_type_rewriter.cc                   |    9 +-
 src/tir/schedule/primitive.h                       |    5 +-
 src/tir/schedule/primitive/block_annotate.cc       |    1 +
 src/tir/schedule/primitive/blockize_tensorize.cc   |    9 +
 src/tir/transforms/compact_buffer_region.cc        |  487 ++++--
 src/tir/transforms/force_narrow_index_to_i32.cc    |    4 +-
 src/tir/transforms/ir_utils.cc                     |  101 +-
 src/tir/transforms/ir_utils.h                      |   44 +-
 src/tir/transforms/lower_opaque_block.cc           |   34 +-
 src/tir/transforms/primfunc_utils.cc               |   47 +-
 src/tir/transforms/remove_no_op.cc                 |   18 +-
 tests/cpp/llvm_codegen_test.cc                     |    9 +-
 tests/crt/framing_test.cc                          |    4 +-
 tests/lint/git-clang-format.sh                     |   12 +-
 tests/lint/pylintrc                                |   74 +-
 tests/python/contrib/test_cudnn.py                 |    2 +-
 tests/python/contrib/test_ethosu/test_codegen.py   |   18 +
 tests/python/contrib/test_ethosu/test_legalize.py  |  120 ++
 .../test_hexagon/test_benchmark_maxpool2d.py       |    4 +-
 .../test_hexagon/test_fixed_point_conversion.py    |    2 +-
 .../contrib/test_hexagon/test_vtcm_bandwidth.py    |    2 +-
 .../contrib/test_hexagon/topi/test_resize2d.py     |    1 +
 tests/python/frontend/caffe2/test_graph.py         |    2 +-
 tests/python/frontend/coreml/test_forward.py       |   11 +-
 tests/python/frontend/keras/test_forward.py        |   34 +-
 tests/python/frontend/oneflow/test_forward.py      |    3 +-
 .../python/frontend/oneflow/test_vision_models.py  |    9 +-
 tests/python/frontend/onnx/test_forward.py         |   14 +-
 tests/python/frontend/pytorch/test_forward.py      |   48 +-
 tests/python/frontend/pytorch/test_fx_quant.py     |    8 +-
 tests/python/frontend/tensorflow/test_forward.py   |   23 +-
 tests/python/frontend/tflite/test_forward.py       |   14 +-
 tests/python/relax/test_codegen_cutlass.py         |  314 +++-
 tests/python/relax/test_expr_functor.py            |   23 +
 tests/python/relax/test_op_gradient_numeric.py     |   27 +
 tests/python/relax/test_op_misc.py                 |    8 +
 tests/python/relax/test_runtime_builtin.py         |    2 +-
 tests/python/relax/test_training_setup_trainer.py  |  229 +++
 .../python/relax/test_training_trainer_numeric.py  |  170 ++
 .../relax/test_transform_fuse_ops_by_pattern.py    |   86 +-
 .../relax/test_transform_lazy_transform_params.py  |  105 ++
 .../relax/test_transform_meta_schedule_tuning.py   |   20 +
 .../test_transform_static_plan_block_memory.py     |  124 ++
 .../relax/test_transform_to_mixed_precision.py     |  142 +-
 tests/python/relax/test_vm_codegen_only.py         |   65 +
 .../relay/aot/test_crt_forward_declarations.py     |    2 +-
 tests/python/relay/test_op_level2.py               |   19 +
 tests/python/relay/test_op_level3.py               |   20 +
 tests/python/relay/test_pass_simplify_expr.py      |   99 +-
 tests/python/topi/python/test_topi_math.py         |    4 +-
 tests/python/unittest/test_arith_domain_touched.py |    2 +-
 .../python/unittest/test_autotvm_droplet_tuner.py  |   57 +
 tests/python/unittest/test_crt.py                  |    2 +-
 .../test_meta_schedule_relay_integration.py        |   17 +
 tests/python/unittest/test_runtime_rpc.py          |    1 +
 ...test_tir_analysis_calculate_allocated_memory.py |   69 +-
 tests/python/unittest/test_tir_buffer.py           |   23 -
 .../python/unittest/test_tir_schedule_tensorize.py |   40 +-
 .../test_tir_transform_compact_buffer_region.py    | 1730 +++++++++++---------
 ...test_tir_transform_force_narrow_index_to_i32.py |   22 +
 .../unittest/test_tir_transform_remove_no_op.py    |   16 +
 tests/python/unittest/test_tvmscript_parser_ir.py  |    1 +
 tests/python/unittest/test_tvmscript_parser_tir.py |   16 +-
 .../python/unittest/test_tvmscript_syntax_sugar.py |   19 +-
 tests/scripts/ci.py                                |    1 +
 tests/scripts/task_build.py                        |   16 +-
 tests/scripts/task_config_build_arm.sh             |    2 +-
 tests/scripts/task_config_build_cortexm.sh         |    2 +-
 tests/scripts/task_config_build_cpu.sh             |    4 +-
 tests/scripts/task_config_build_gpu.sh             |    2 +-
 tests/scripts/task_config_build_i386.sh            |    2 +-
 .../scripts/task_config_build_minimal_cross_isa.sh |    2 +-
 tests/scripts/task_config_build_riscv.sh           |    2 +-
 tests/scripts/task_config_build_wasm.sh            |    2 +-
 tests/scripts/task_python_docs.sh                  |    1 +
 vta/python/vta/testing/simulator.py                |    1 +
 vta/python/vta/top/bitpack.py                      |    2 +-
 vta/python/vta/top/graphpack.py                    |    2 +-
 web/src/runtime.ts                                 |   23 +-
 358 files changed, 7278 insertions(+), 2295 deletions(-)
 copy docker/install/{ubuntu_install_core.sh => ubuntu2004_install_core.sh} (95%)
 mode change 100755 => 100644
 delete mode 100755 docker/install/ubuntu2004_install_llvm.sh
 copy docker/install/{ubuntu_install_python_package.sh => ubuntu2004_install_python_package.sh} (99%)
 mode change 100755 => 100644
 copy docker/install/{ubuntu_install_redis.sh => ubuntu2004_install_redis.sh} (93%)
 mode change 100755 => 100644
 rename docker/install/{ubuntu1804_install_aarch64_cross_compile.sh => ubuntu2204_install_aarch64_cross_compile.sh} (65%)
 rename docker/install/{ubuntu1804_install_llvm.sh => ubuntu2204_install_llvm.sh} (55%)
 mode change 100755 => 100644
 rename docker/install/{ubuntu1804_manual_install_llvm.sh => ubuntu2204_manual_install_llvm.sh} (94%)
 rename docker/install/{ubuntu1804_manual_install_llvm_cross_aarch64.sh => ubuntu2204_manual_install_llvm_cross_aarch64.sh} (97%)
 rename docker/install/{ubuntu1804_install_clang_format.sh => ubuntu_install_clang_format.sh} (88%)
 mode change 100755 => 100644
 delete mode 100644 gallery/how_to/deploy_models/deploy_ssd_gluoncv.py
 copy licenses/{LICENSE.libcrc.txt => LICENSE.cnpy.txt} (87%)
 create mode 100644 python/tvm/autotvm/tuner/droplet_turner.py
 create mode 100644 python/tvm/contrib/cutlass/layer_norm_operation.py
 create mode 100644 python/tvm/relax/training/setup_trainer.py
 create mode 100644 python/tvm/relax/training/trainer.py
 create mode 100644 python/tvm/relax/transform/lazy_transform_params.py
 create mode 100644 python/tvm/relay/backend/contrib/ethosu/softmax_rewriter.py
 create mode 100644 tests/python/relax/test_training_setup_trainer.py
 create mode 100644 tests/python/relax/test_training_trainer_numeric.py
 create mode 100644 tests/python/relax/test_transform_lazy_transform_params.py
 create mode 100644 tests/python/unittest/test_autotvm_droplet_tuner.py
