This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a change to branch unity-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git


    from d837fd0738 [MERGE] Fix lint after lint image upgrade
     add 96aca9d85f [Unity] Fix Unary Op Legalization (#14789)
     add ff333acaa9 [Unity] Introduce FewShotTuning Pass (#14624)
     add 7c56517545 [Unity][BYOC] Fix incorrect bias stride in matmul cutlass offload (#14807)
     add 5ad8941252 [Unity] Add pass to allocate big workspace and pass it to all functions that need temp storage (#14802)
     add 0c9aa58907 [Unity][BYOC] Add shape validation for bias arg in cuBLAS (#14809)
     add aab076e2bf [Unity] NDArray Cache Efficient Load in OpenCL (#14816)
     add b5928e775c [Unity] Fix CUDA graph rewrite var used before def (#14800)
     add 3b33caf757 [Unity] Cover all Relax functions in implicit attention rewrite (#14818)
     add 28567bdc24 [Unity] Improve Error Message loading NDArray Shards (#14823)
     add 25a491d52b [Unity][Training] Categorical cross entropy loss (#14757)
     add 773620f267 [Unity][CI] Update CPU image to install PyTorch (#14848)
     add d6e0f1d001 [NDArray] Allow arbitrary stride when the corresponding shape is 1 (#14797)
     add e01cb479b8 [bugfix][relay] fix wrong calculate logic about celu (#14796)
     add 6c689eed9c [CUTLASS] Add NDEBUG option to CUTLASS compile to speed up attention kernel (#14798)
     add b299ece341 [LINT] Fix clang-format script for newest clang-format (#14799)
     add 182de86b2c [OpenCL] Refactor OpenCL runtime to support SPIRV binary ingestion (#14792)
     add 440aae25fb [Testing] Use TVMScript's "name" argument for error messages (#14808)
     add afa1f63ea5 [TIR] Improved parameter name in DLTensor unpacking error messages (#14776)
     add 3829ebb68c [TVMScript] Handle AllocatedPoolInfo, ConstantPoolInfo, ConstantInfo (#14812)
     add cca7d78334 [OpenCLML] Transposed convolution support and other fixes (#14767)
     add a1c1ccafa1 [SUPPORT] Fix RingBuffer ReadWithCallback (#14743)
     add 48200fc3d7 [TOPI] Use f-strings for string formatting, NFC (#14822)
     add fd2a510ccf [Relay][BugFix] fix a bug about ReLu in the threshold attribute which causes a different results with keras (#14824)
     add 2cafa87b10 [Bugfix][Relay] Fix threshold calculation logic in PyTorch frontend (#14820)
     add 42aaae143c [BUILD] Removed the duplicated MACROs in config.cmake (#14813)
     add a3991b32ca [Relay/Op] Use f-strings for string formatting, NFC (#14831)
     add ae9209bf55 [Frontend] Use f-strings for string formatting, NFC (#14830)
     add 483b87de62 [Bugfix][Relay] Fix softplus about the wrong calculation formula in Relay PyTorch frontend (#14821)
     add 3756b716d5 [Relay] Use f-strings for string formatting, NFC (#14838)
     add e54bbc73a7 [Docker] Install PyTorch on cpu image (#14842)
     add 05001bed64 [LLVM][RUNTIME] Enable multi systemlib with device code (#14843)
     add 71d3262e90 [TOPI] Use f-strings for string formatting, NFC (#14839)
     add 5eff44aea8 [Bugfix][Relay] Fix AdaptiveAvgPool2d about wrong dtype prasing (#14837)
     add 318f894f04 [TVMScript] Print Executor, Runtime, and FunctionInfo as metadata (#14840)
     add 9f0c642273 [Bugfix][TVMScript] Capture fails if var appears only in annotation (#14849)
     add 28c85f0dc2 [METAL] Fix vectorized select (#14846)
     add 9999114e70 [Codegen][OpenCL] fix amibiguous selection operator call (#14833)
     new fd20ca2f6c [MERGE] Merge main into unity 2023-05-14

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 apps/cpp_rtvm/tvm_runner.h                         |    3 +-
 ci/jenkins/docker-images.ini                       |    2 +-
 cmake/config.cmake                                 |   13 -
 cmake/modules/LibInfo.cmake                        |    1 +
 cmake/modules/OpenCL.cmake                         |    1 +
 cmake/modules/Vulkan.cmake                         |    1 +
 docker/Dockerfile.ci_cpu                           |    4 +
 include/tvm/ir/module.h                            |   29 +
 include/tvm/relax/expr.h                           |    2 +
 include/tvm/relax/transform.h                      |   12 +
 include/tvm/runtime/module.h                       |    2 -
 include/tvm/target/codegen.h                       |    9 +-
 include/tvm/tir/usmp/utils.h                       |    2 +-
 jvm/pom.xml                                        |    4 +-
 python/tvm/contrib/cutlass/attention_operation.py  |    9 +-
 python/tvm/contrib/cutlass/build.py                |   20 +-
 python/tvm/contrib/cutlass/gemm_operation.py       |    6 +-
 python/tvm/contrib/cutlass/gen_tensor_op.py        |    9 +-
 python/tvm/ir/memory_pools.py                      |    2 +-
 python/tvm/meta_schedule/builder/local_builder.py  |   12 +
 python/tvm/meta_schedule/runner/local_runner.py    |   12 +
 python/tvm/relax/backend/contrib/cublas.py         |    8 +
 python/tvm/relax/backend/contrib/cutlass.py        |   68 +-
 python/tvm/relax/testing/matmul.py                 |   13 +-
 python/tvm/relax/training/loss.py                  |   94 +-
 python/tvm/relax/transform/legalize_ops/unary.py   |   13 +-
 python/tvm/relax/transform/transform.py            |   39 +
 python/tvm/relay/backend/interpreter.py            |   11 +-
 python/tvm/relay/backend/te_compiler.py            |    4 +-
 python/tvm/relay/build_module.py                   |    2 +-
 python/tvm/relay/expr.py                           |   69 +-
 python/tvm/relay/expr_functor.py                   |    8 +-
 python/tvm/relay/frontend/caffe.py                 |   16 +-
 python/tvm/relay/frontend/caffe2.py                |   24 +-
 python/tvm/relay/frontend/common.py                |   28 +-
 python/tvm/relay/frontend/coreml.py                |   22 +-
 python/tvm/relay/frontend/darknet.py               |    9 +-
 python/tvm/relay/frontend/keras.py                 |  119 +-
 python/tvm/relay/frontend/mxnet.py                 |   62 +-
 python/tvm/relay/frontend/mxnet_qnn_op_utils.py    |    4 +-
 python/tvm/relay/frontend/nnvm_common.py           |    2 +-
 python/tvm/relay/frontend/oneflow.py               |   72 +-
 python/tvm/relay/frontend/onnx.py                  |  408 ++-----
 python/tvm/relay/frontend/paddlepaddle.py          |   48 +-
 python/tvm/relay/frontend/pytorch.py               |  183 ++-
 python/tvm/relay/frontend/qnn_torch.py             |   25 +-
 python/tvm/relay/frontend/tensorflow.py            |   38 +-
 python/tvm/relay/frontend/tensorflow2.py           |   31 +-
 python/tvm/relay/frontend/tensorflow2_ops.py       |    2 +-
 python/tvm/relay/frontend/tensorflow_ops.py        |  136 ++-
 python/tvm/relay/frontend/tflite.py                |   73 +-
 python/tvm/relay/loops.py                          |    2 +-
 python/tvm/relay/op/annotation/annotation.py       |    2 +-
 python/tvm/relay/op/contrib/clml.py                |   45 +-
 python/tvm/relay/op/contrib/cutlass.py             |    2 +-
 python/tvm/relay/op/contrib/dnnl.py                |   19 +-
 python/tvm/relay/op/nn/_nn.py                      |   15 +-
 python/tvm/relay/op/nn/utils.py                    |    6 +-
 python/tvm/relay/op/op.py                          |   10 +-
 python/tvm/relay/op/strategy/adreno.py             |   10 +-
 python/tvm/relay/op/strategy/arm_cpu.py            |   32 +-
 python/tvm/relay/op/strategy/bifrost.py            |   10 +-
 python/tvm/relay/op/strategy/cuda.py               |   37 +-
 python/tvm/relay/op/strategy/generic.py            |   83 +-
 python/tvm/relay/op/strategy/hexagon.py            |    4 +-
 python/tvm/relay/op/strategy/hls.py                |    6 +-
 python/tvm/relay/op/strategy/intel_graphics.py     |    4 +-
 python/tvm/relay/op/strategy/mali.py               |   12 +-
 python/tvm/relay/op/strategy/x86.py                |   28 +-
 python/tvm/relay/op/tensor.py                      |    2 +-
 python/tvm/relay/op/transform.py                   |    4 +-
 python/tvm/relay/op/vision/_rcnn.py                |    4 +-
 python/tvm/relay/prelude.py                        |   76 +-
 python/tvm/relay/qnn/op/layout_conversions.py      |    4 +-
 python/tvm/relay/qnn/op/qnn.py                     |   81 +-
 python/tvm/relay/quantize/_calibrate.py            |    4 +-
 python/tvm/relay/quantize/quantize.py              |    4 +-
 python/tvm/relay/testing/dcgan.py                  |    4 +-
 python/tvm/relay/testing/densenet.py               |   12 +-
 python/tvm/relay/testing/inception_v3.py           |  100 +-
 python/tvm/relay/testing/init.py                   |   11 +-
 python/tvm/relay/testing/layers.py                 |   12 +-
 python/tvm/relay/testing/lstm.py                   |   20 +-
 python/tvm/relay/testing/mobilenet.py              |    2 +-
 python/tvm/relay/testing/py_converter.py           |   21 +-
 python/tvm/relay/testing/resnet.py                 |    8 +-
 python/tvm/relay/testing/resnet_3d.py              |    8 +-
 python/tvm/relay/testing/squeezenet.py             |   17 +-
 python/tvm/relay/testing/tf.py                     |    8 +-
 python/tvm/relay/testing/tflite.py                 |    2 +-
 python/tvm/relay/testing/vgg.py                    |    8 +-
 .../transform/fake_quantization_to_integer.py      |   30 +-
 python/tvm/relay/type_functor.py                   |    2 +-
 python/tvm/runtime/module.py                       |   27 +-
 python/tvm/script/parser/core/entry.py             |   18 +-
 .../tvm/script/parser/core/error.py                |   16 +-
 python/tvm/script/parser/core/evaluator.py         |    7 +-
 python/tvm/script/parser/core/parser.py            |   23 +-
 python/tvm/script/parser/tir/parser.py             |   52 +-
 python/tvm/testing/utils.py                        |   17 +-
 python/tvm/topi/adreno/conv2d_alter_op.py          |   77 +-
 python/tvm/topi/adreno/pooling.py                  |    4 +-
 python/tvm/topi/arm_cpu/bitserial_dense.py         |    2 +-
 python/tvm/topi/arm_cpu/conv2d.py                  |   15 +-
 python/tvm/topi/arm_cpu/conv2d_alter_op.py         |   26 +-
 python/tvm/topi/arm_cpu/tensor_intrin.py           |   69 +-
 python/tvm/topi/bifrost/conv2d.py                  |    2 +-
 python/tvm/topi/cuda/batch_matmul.py               |    2 +-
 python/tvm/topi/cuda/batch_matmul_tensorcore.py    |    2 +-
 python/tvm/topi/cuda/conv2d.py                     |    2 +-
 python/tvm/topi/cuda/conv2d_hwcn.py                |    2 +-
 python/tvm/topi/cuda/conv2d_int8.py                |    4 +-
 python/tvm/topi/cuda/conv3d.py                     |    2 +-
 python/tvm/topi/cuda/conv3d_direct.py              |    2 +-
 python/tvm/topi/cuda/dense.py                      |   18 +-
 python/tvm/topi/cuda/dense_tensorcore.py           |    2 +-
 python/tvm/topi/cuda/group_conv2d_nchw.py          |   12 +-
 python/tvm/topi/cuda/pooling.py                    |    4 +-
 python/tvm/topi/cuda/reduction.py                  |    4 +-
 python/tvm/topi/cuda/scan.py                       |    2 +-
 python/tvm/topi/cuda/softmax.py                    |    5 +-
 python/tvm/topi/cuda/sparse.py                     |    7 +-
 python/tvm/topi/generic/conv2d.py                  |   34 +-
 python/tvm/topi/generic/default.py                 |    2 +-
 python/tvm/topi/generic/injective.py               |    2 +-
 python/tvm/topi/hexagon/conv2d_alter_op.py         |    6 +-
 python/tvm/topi/hexagon/qnn/conv2d_alter_op.py     |    6 +-
 python/tvm/topi/hls/nn.py                          |   15 +-
 python/tvm/topi/image/resize.py                    |   55 +-
 python/tvm/topi/intel_graphics/conv2d_alter_op.py  |   10 +-
 python/tvm/topi/mali/conv2d.py                     |   18 +-
 python/tvm/topi/nn/conv2d.py                       |   70 +-
 python/tvm/topi/nn/depthwise_conv2d.py             |   32 +-
 python/tvm/topi/nn/dilate.py                       |    2 +-
 python/tvm/topi/nn/fifo_buffer.py                  |   10 +-
 python/tvm/topi/nn/pad.py                          |   12 +-
 python/tvm/topi/nn/sparse.py                       |    2 +-
 python/tvm/topi/nn/upsampling.py                   |    4 +-
 python/tvm/topi/nn/utils.py                        |   17 +-
 python/tvm/topi/nn/winograd_util.py                |    4 +-
 python/tvm/topi/random/kernel.py                   |   13 +-
 python/tvm/topi/reduction.py                       |    2 +-
 python/tvm/topi/sparse/csrmm.py                    |    7 +-
 python/tvm/topi/sparse/csrmv.py                    |    7 +-
 python/tvm/topi/sparse/dense.py                    |   10 +-
 python/tvm/topi/testing/dilate_python.py           |    5 +-
 python/tvm/topi/testing/poolnd_python.py           |   12 +-
 python/tvm/topi/testing/resize_python.py           |    2 +-
 python/tvm/topi/transform.py                       |   11 +-
 python/tvm/topi/utils.py                           |   21 +-
 python/tvm/topi/x86/binarize_pack.py               |    2 +-
 python/tvm/topi/x86/binary_dense.py                |    2 +-
 python/tvm/topi/x86/bitserial_dense.py             |    2 +-
 python/tvm/topi/x86/conv2d.py                      |   13 +-
 python/tvm/topi/x86/conv2d_alter_op.py             |   18 +-
 python/tvm/topi/x86/conv2d_avx_1x1.py              |    4 +-
 python/tvm/topi/x86/conv2d_avx_common.py           |    2 +-
 python/tvm/topi/x86/conv2d_int8.py                 |    6 +-
 python/tvm/topi/x86/conv3d.py                      |   31 +-
 python/tvm/topi/x86/dense_alter_op.py              |   15 +-
 python/tvm/topi/x86/depthwise_conv2d.py            |    6 +-
 python/tvm/topi/x86/nn.py                          |    5 +-
 python/tvm/topi/x86/pooling.py                     |    4 +-
 python/tvm/topi/x86/reduction.py                   |    4 +-
 src/arith/const_int_bound.cc                       |    2 +
 src/relax/backend/vm/codegen_vm.cc                 |    6 +-
 src/relax/ir/block_builder.cc                      |    2 +-
 src/relax/op/op_common.h                           |    3 +
 src/relax/transform/allocate_workspace.cc          |  199 ++++
 src/relax/transform/few_shot_tuning.cc             |  185 +++
 src/relax/transform/rewrite_cuda_graph.cc          |   27 +-
 src/relay/backend/contrib/clml/codegen.cc          |   40 +-
 src/runtime/contrib/clml/clml_runtime.cc           |  176 +--
 src/runtime/contrib/cublas/cublas_json_runtime.cc  |    2 +
 src/runtime/library_module.h                       |    2 +
 src/runtime/opencl/opencl_common.h                 |   60 +-
 src/runtime/opencl/opencl_module.cc                |   36 +-
 src/runtime/opencl/opencl_module.h                 |   14 +-
 src/runtime/opencl/opencl_module_spirv.cc          |  137 +++
 src/runtime/relax_vm/cuda/cuda_graph_builtin.cc    |    1 -
 src/runtime/relax_vm/ndarray_cache_support.cc      |   36 +-
 .../vulkan_shader.h => spirv/spirv_shader.h}       |   16 +-
 src/runtime/system_library.cc                      |   58 +-
 src/runtime/vulkan/vulkan_module.cc                |    6 +-
 src/runtime/vulkan/vulkan_module.h                 |    4 +-
 src/runtime/vulkan/vulkan_wrapped_func.h           |    6 +-
 src/script/printer/ir/relay.cc                     |   48 +
 src/script/printer/tir/usmp.cc                     |   58 +
 src/support/libinfo.cc                             |    1 +
 src/support/ring_buffer.h                          |   12 +-
 src/target/codegen.cc                              |   31 +-
 src/target/llvm/codegen_amdgpu.cc                  |    2 +-
 src/target/llvm/codegen_blob.cc                    |   22 +-
 src/target/llvm/codegen_blob.h                     |    4 +-
 src/target/llvm/codegen_cpu.cc                     |   22 +-
 src/target/llvm/codegen_cpu.h                      |    9 +-
 src/target/llvm/codegen_hexagon.cc                 |   14 +-
 src/target/llvm/codegen_llvm.cc                    |    5 +-
 src/target/llvm/codegen_llvm.h                     |    7 +-
 src/target/llvm/codegen_nvptx.cc                   |    2 +-
 src/target/llvm/llvm_module.cc                     |   53 +-
 src/target/opt/build_opencl_off.cc                 |    7 +
 src/target/source/codegen_metal.cc                 |    5 +
 src/target/source/codegen_metal.h                  |    1 +
 src/target/source/codegen_opencl.cc                |   15 +-
 src/target/spirv/build_vulkan.cc                   |  134 +--
 src/target/spirv/codegen_spirv.cc                  |    5 +-
 src/target/spirv/codegen_spirv.h                   |    6 +-
 src/target/spirv/spirv_support.cc                  |    5 +-
 .../spirv/{build_vulkan.cc => spirv_utils.cc}      |   64 +-
 .../{source/codegen_vhls.h => spirv/spirv_utils.h} |   40 +-
 src/tir/transforms/arg_binder.cc                   |    2 +-
 src/tir/transforms/flatten_buffer.cc               |    2 +
 src/tir/transforms/lower_intrin.cc                 |   10 +-
 src/tir/transforms/make_packed_api.cc              |   22 +-
 tests/cpp/support/ring_buffer_test.cc              |   68 ++
 tests/python/contrib/test_clml/infrastructure.py   |    2 +
 tests/python/contrib/test_clml/test_ops.py         |  107 +-
 tests/python/frontend/keras/test_forward.py        |    1 +
 tests/python/frontend/pytorch/test_forward.py      |   10 +
 tests/python/relax/test_codegen_cublas.py          |   11 +-
 tests/python/relax/test_codegen_cutlass.py         |   91 +-
 tests/python/relax/test_training_loss.py           |   76 ++
 .../relax/test_transform_allocate_workspace.py     |  132 +++
 .../python/relax/test_transform_few_shot_tuning.py |  389 ++++++
 .../relax/test_transform_legalize_ops_unary.py     | 1251 +-------------------
 .../relax/test_transform_rewrite_cuda_graph.py     |   53 +-
 .../relay/aot/test_crt_forward_declarations.py     |    4 +-
 .../python/unittest/test_arith_const_int_bound.py  |   18 +
 ...orm_remove_assume.py => test_runtime_dlpack.py} |   45 +-
 tests/python/unittest/test_runtime_module_load.py  |   58 +-
 tests/python/unittest/test_runtime_rpc.py          |   18 +-
 tests/python/unittest/test_target_codegen_blob.py  |   85 +-
 tests/python/unittest/test_target_codegen_metal.py |   46 +-
 .../python/unittest/test_target_codegen_opencl.py  |   32 +-
 .../unittest/test_tvmscript_meta_programming.py    |   67 +-
 236 files changed, 3883 insertions(+), 3794 deletions(-)
 copy apps/sgx/Cargo.toml => python/tvm/script/parser/core/error.py (77%)
 create mode 100644 src/relax/transform/allocate_workspace.cc
 create mode 100644 src/relax/transform/few_shot_tuning.cc
 create mode 100644 src/runtime/opencl/opencl_module_spirv.cc
 rename src/runtime/{vulkan/vulkan_shader.h => spirv/spirv_shader.h} (82%)
 create mode 100644 src/script/printer/ir/relay.cc
 create mode 100644 src/script/printer/tir/usmp.cc
 copy src/target/spirv/{build_vulkan.cc => spirv_utils.cc} (79%)
 copy src/target/{source/codegen_vhls.h => spirv/spirv_utils.h} (59%)
 create mode 100644 tests/cpp/support/ring_buffer_test.cc
 create mode 100644 tests/python/relax/test_transform_allocate_workspace.py
 create mode 100644 tests/python/relax/test_transform_few_shot_tuning.py
 copy tests/python/unittest/{test_tir_transform_remove_assume.py => test_runtime_dlpack.py} (51%)
