This is an automated email from the ASF dual-hosted git repository.

mousius pushed a change to branch ci-docker-staging
in repository https://gitbox.apache.org/repos/asf/tvm.git.


 discard 134edbd  Test new docker images
     add d97274c  [MetaSchedule] Post Processor: Rewrite Reduction Block (#10013)
     add 25c8f4c  [frontend][keras] Add support for TimeDistributed (#7006)
     add 751f83b  Auto-discover C/C++ compiler instead of hardcoding g++ (#10007)
     add 81b66e6  [Docker] Relax name check (#10011)
     add 1ac01b4  [MetaSchedule] Schedule Rule: Cross Thread Reduction (#9994)
     add e9ee73f  [TOPI,CUDA] Don't enable cudnn conv2d kernel if is not supported (#10021)
     add 89fa241  [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (#10016)
     add 9a6423c  Add user-configurable backtrace limit (#10025)
     add 64f2939  [MetaSchedule] disallow_dynamic_loop (#9997)
     add d35b858  [CUDNN] Support gradient kernels (#9986)
     add 7bfb11b  [Flaky] Skip test_qlinear_average_pool (#10030)
     add fc1814e  [MetaSchedule] Mutator: Mutate compute location (#10028)
     add cc67040  [MetaSchedule] Post Processor: Rewrite Unbound Block (#10027)
     add de01c3e  [MetaSchedule] Schedule Rule: Parallelize-Vectorize-Unroll (#10033)
     add 74a2fa8  [microNPU] Add support for requantize (#9910)
     add d066441  [PTX-MMA] Add full PTX MMA code generation support (#9909)
     add 65b4b09  [CMSIS-NN] Update microNPU demo to include offloading to CMSIS-NN (#9979)
     add 6f2b35f  [QNN] Add qnn.rsqrt op (#9982)
     add 73bbfbb  [Hexagon] Do not auto-build apps when building TVM (#9970)
     add 6720d35  [Runtime][PipelineExecutor] Add Pipeline Executor Interface (#10010)
     add 2830c96  [skip ci][Docker, CI] Update DGL installation, temp disable DGL tutorial (#10067)
     add 1b9b05e  [CUTLASS] Profile only the largest-possible alignment by default (#10036)
     add 94c4e0e  [Meta Schedule] Add `ApplyHisotryBest` Meta Schedule Context (#10049)
     add 88cbf1b  [MetaSchedule] Mutator Rule: Mutate Unroll (#10045)
     add 5e7438f  [TIR][Schedule] Blockize and Tensorize (#9871)
     add 884fee4  [microTVM][tutorial] Add ENV variable to enable testing on physical hardware (#9993)
     add b972877  [microNPU] Refactor base address determination to codegen (#9929)
     add ffff8dd  Add FP requantize flow. Set float32 flow by default for llvm x86 targets with (#9637)
     add abdccf9  [Relay][DefuseOps pass] bug fix: To support function body types other than call node (#10069)
     add 92cd754  [Fix Bug]fix the bug of tensorflow frontend when parsing Range layer (#9999)
     add ffbe491  [MetaSchedule][M4a] Schedule Rule: Multi-Level-Tiling (#10043)
     add 095b639  Revert "[Frontend] Add Span filling for frontends to Relay (#9723)" (#10072)
     add cc8a7a2  Improve the tensorflow frontend _test_spop_resource_variables to support tensoflow 2.6 (#9978)
     add 1935341  [MetaSchedule] postproc: rewrite_parallel_vectorize_unroll (#10071)
     add d29c801  Clear warnings when building with MSVC. (#10059)
     add 622a03c  [Makefile] Fixed error in "make clean" (#10048)
     add 90e454a  [Relay] QLinearMatMul allows 1D weight_scale, weight_zero_point inputs (#10047)
     add 09daa88  Don't explicitly link libgcc.a into libtvm_runtime.so on Android (#10052)
     add 248ad45  Change function constructors to WithFields (#9690)
     add a40de47  Document missing qnn operators (#10077)
     add e42b9a3  Add temp git dir to test_cc_reviewers test case (#10058)
     add f161bc2  [CI] Fix Rust permissions for wasmtime and sccache (#10015)
     add 4b0558c  [EZ][Typo] Correct gather, scatter type rel error message (#10023)
     add f93f2a6  [microTVM][tvmc] Add TVMC Micro tutorial for Zephyr (#10024)
     add fa317ed  [CI][Fix] Remove additional qnn.op.transpose_conv2d from docs (#10083)
     add e6af874  [PyTorch] Fix rsub type (#10090)
     new 96ef1d0  Test new Docker images

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (134edbd)
            \
             N -- N -- N   refs/heads/ci-docker-staging (96ef1d0)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
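
As an illustration of the diagram above, here is a minimal sketch in
Python that models the commit graph as a child-to-parent map and works
out the common base, the revisions dropped from the branch (O), and the
revisions newly reachable from it (N).  The commit labels and the
helper names `ancestors` and `classify` are hypothetical; they are not
part of git or of the tool that generated this email.

```python
# Toy model of the force-push diagram: each commit maps to its single parent.
# Labels are illustrative placeholders, not real commit hashes.
parents = {
    "A": None,
    "B": "A",
    "O1": "B", "O2": "O1", "O3": "O2",  # old branch; old tip is O3
    "N1": "B", "N2": "N1", "N3": "N2",  # new branch; new tip is N3
}


def ancestors(tip):
    """Return the tip followed by all of its ancestors, newest first."""
    chain = []
    while tip is not None:
        chain.append(tip)
        tip = parents[tip]
    return chain


def classify(old_tip, new_tip):
    """Split history into common base, dropped (O) and added (N) commits."""
    old_chain, new_chain = ancestors(old_tip), ancestors(new_tip)
    old_set, new_set = set(old_chain), set(new_chain)
    # The common base is the newest commit reachable from both tips.
    base = next(c for c in old_chain if c in new_set)
    dropped = [c for c in old_chain if c not in new_set]
    added = [c for c in new_chain if c not in old_set]
    return base, dropped, added


base, dropped, added = classify("O3", "N3")
print("base:", base)        # base: B
print("dropped:", dropped)  # dropped: ['O3', 'O2', 'O1']
print("added:", added)      # added: ['N3', 'N2', 'N1']
```

In a real clone the same three questions are usually answered with
`git merge-base <old> <new>`, `git log <new>..<old>` and
`git log <old>..<new>`, provided the old tip is still present locally.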

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CMakeLists.txt                                     |   28 +-
 Jenkinsfile                                        |   10 +-
 Makefile                                           |    4 +-
 apps/hexagon_api/CMakeLists.txt                    |   90 ++
 apps/hexagon_api/README.md                         |   58 +
 apps/hexagon_launcher/README.md                    |   37 -
 apps/microtvm/ethosu/Makefile                      |   13 +-
 apps/microtvm/ethosu/README.md                     |   19 +-
 apps/microtvm/ethosu/convert_image.py              |   11 +-
 apps/microtvm/ethosu/run_demo.sh                   |   15 +-
 apps/microtvm/ethosu/src/demo.c                    |    4 +-
 cmake/modules/Hexagon.cmake                        |  312 +----
 docker/install/ubuntu_install_dgl.sh               |    2 +-
 docker/install/ubuntu_install_rust.sh              |    2 +-
 docker/install/ubuntu_install_sccache.sh           |    3 +
 docker/install/ubuntu_install_wasmtime.sh          |    3 +
 docker/with_the_same_user                          |    4 +-
 docs/conf.py                                       |    1 +
 docs/reference/langref/relay_op.rst                |   14 +-
 .../how_to/work_with_microtvm/micro_autotune.py    |    7 +-
 gallery/how_to/work_with_microtvm/micro_tflite.py  |   34 +-
 gallery/how_to/work_with_microtvm/micro_tvmc.py    |  198 +++
 gallery/how_to/work_with_relay/build_gcn.py        |    5 +
 include/tvm/arith/iter_affine_map.h                |    7 +
 include/tvm/ir/expr.h                              |    6 +
 include/tvm/meta_schedule/schedule_rule.h          |   26 +-
 include/tvm/relay/qnn/attrs.h                      |    8 +-
 include/tvm/runtime/logging.h                      |    7 +
 include/tvm/tir/builtin.h                          |   11 +
 include/tvm/tir/function.h                         |   52 +
 include/tvm/tir/schedule/schedule.h                |   19 +
 include/tvm/tir/stmt.h                             |   26 +
 python/tvm/contrib/cc.py                           |   57 +-
 python/tvm/contrib/cudnn.py                        |  460 ++++++-
 python/tvm/contrib/cutlass/build.py                |   46 +-
 python/tvm/contrib/cutlass/gen_conv2d.py           |   36 +-
 python/tvm/contrib/cutlass/gen_gemm.py             |   50 +-
 python/tvm/contrib/cutlass/gen_tensor_op.py        |   44 +-
 python/tvm/contrib/pipeline_executor.py            |   44 +-
 python/tvm/driver/tvmc/target.py                   |    2 +-
 python/tvm/ir/json_compact.py                      |    1 +
 python/tvm/meta_schedule/integration.py            |    9 +-
 python/tvm/meta_schedule/mutator/__init__.py       |    2 +
 .../mutator/mutate_compute_location.py             |   21 +-
 .../tvm/meta_schedule/mutator/mutate_unroll.py     |   23 +-
 python/tvm/meta_schedule/postproc/__init__.py      |    4 +
 .../postproc/disallow_dynamic_loop.py              |   21 +-
 .../postproc/rewrite_parallel_vectorize_unroll.py  |   23 +-
 .../postproc/rewrite_reduction_block.py            |   21 +-
 .../postproc/rewrite_unbound_block.py              |   21 +-
 python/tvm/meta_schedule/schedule_rule/__init__.py |    5 +-
 .../schedule_rule/cross_thread_reduction.py        |   41 +
 .../schedule_rule/multi_level_tiling.py            |   84 ++
 .../schedule_rule/parallel_vectorize_unroll.py     |   64 +
 python/tvm/meta_schedule/testing/schedule_rule.py  |   73 ++
 python/tvm/relay/backend/contrib/ethosu/codegen.py |   12 +-
 .../tvm/relay/backend/contrib/ethosu/legalize.py   |   44 +
 .../backend/contrib/ethosu/tir_to_cs_translator.py |   62 +-
 python/tvm/relay/backend/contrib/ethosu/util.py    |   39 +-
 python/tvm/relay/expr.py                           |    7 +-
 python/tvm/relay/frontend/common.py                |   53 -
 python/tvm/relay/frontend/keras.py                 |  347 +++--
 python/tvm/relay/frontend/onnx.py                  |   28 +-
 python/tvm/relay/frontend/pytorch.py               |   24 +-
 python/tvm/relay/frontend/tensorflow.py            |   17 +-
 python/tvm/relay/frontend/tensorflow2.py           |   17 +-
 python/tvm/relay/frontend/tensorflow_ops.py        |    4 +-
 python/tvm/relay/frontend/tflite.py                |   16 +-
 python/tvm/relay/op/contrib/ethosu.py              |   59 +
 python/tvm/relay/op/nn/_nn.py                      |    4 +
 python/tvm/relay/op/strategy/cuda.py               |   36 +-
 python/tvm/relay/op/strategy/generic.py            |   40 +-
 .../tvm/relay/qnn/op/_requantize.py                |   10 +-
 python/tvm/relay/qnn/op/qnn.py                     |  141 +-
 .../transform/fake_quantization_to_integer.py      |   17 +
 python/tvm/rpc/server.py                           |    8 +-
 python/tvm/runtime/module.py                       |   13 +-
 python/tvm/tir/__init__.py                         |    2 +-
 python/tvm/tir/function.py                         |   48 +
 python/tvm/tir/schedule/schedule.py                |  229 ++++
 python/tvm/topi/cuda/conv2d.py                     |   19 +
 python/tvm/topi/cuda/conv2d_transpose_nchw.py      |    8 +
 python/tvm/topi/nn/conv2d_transpose.py             |    1 -
 python/tvm/topi/testing/__init__.py                |    2 +-
 .../topi/testing/conv2d_backcward_weight_python.py |   44 +-
 python/tvm/topi/testing/conv2d_transpose_python.py |    4 +-
 python/tvm/topi/x86/utils.py                       |   22 +
 src/arith/int_set.cc                               |    2 +-
 src/ir/error.cc                                    |    4 +-
 src/meta_schedule/integration.cc                   |   22 +-
 .../mutator/mutate_compute_location.cc             |  131 ++
 src/meta_schedule/mutator/mutate_unroll.cc         |  141 ++
 .../postproc/disallow_dynamic_loop.cc              |   85 ++
 .../postproc/rewrite_parallel_vectorize_unroll.cc  |  399 ++++++
 .../postproc/rewrite_reduction_block.cc            |  157 +++
 .../postproc/rewrite_unbound_block.cc              |  218 ++++
 .../schedule_rule/cross_thread_reduction.cc        |  285 ++++
 .../schedule_rule/multi_level_tiling.cc            |  416 ++++++
 .../schedule_rule/parallel_vectorize_unroll.cc     |  129 ++
 src/meta_schedule/utils.h                          |   29 +
 src/printer/relay_text_printer.cc                  |   23 +-
 src/printer/text_printer.h                         |    2 +-
 .../backend/contrib/cmsisnn/extract_constants.cc   |    7 +-
 src/relay/backend/contrib/cmsisnn/relay_to_tir.cc  |    9 +-
 src/relay/backend/contrib/ethosu/codegen.cc        |    8 +-
 src/relay/backend/contrib/ethosu/source_module.cc  |   93 +-
 src/relay/backend/contrib/ethosu/utils.cc          |   49 +-
 src/relay/backend/contrib/ethosu/utils.h           |   86 +-
 .../contrib/example_target_hooks/relay_to_tir.cc   |    9 +-
 src/relay/backend/te_compiler.cc                   |   12 +-
 src/relay/backend/vm/lambda_lift.cc                |    6 +-
 src/relay/ir/expr.cc                               |    5 +-
 src/relay/op/nn/convolution.cc                     |    1 -
 src/relay/op/tensor/transform.cc                   |    9 +-
 src/relay/qnn/op/op_common.h                       |   53 +
 src/relay/qnn/op/requantize.cc                     |  223 +++-
 src/relay/qnn/op/requantize_config.cc              |   93 ++
 src/relay/qnn/op/requantize_config.h               |  126 ++
 src/relay/qnn/op/rsqrt.cc                          |  126 ++
 src/relay/qnn/utils.cc                             |   16 +
 src/relay/qnn/utils.h                              |   41 +-
 src/relay/quantize/annotate.cc                     |    2 +-
 src/relay/quantize/calibrate.cc                    |    9 +-
 src/relay/transforms/annotate_target.cc            |    2 +-
 src/relay/transforms/convert_sparse_conv2d.cc      |    4 +-
 src/relay/transforms/convert_sparse_dense.cc       |    4 +-
 src/relay/transforms/de_duplicate.cc               |    9 +-
 src/relay/transforms/defunctionalization.cc        |    7 +-
 src/relay/transforms/defuse_ops.cc                 |   18 +-
 src/relay/transforms/eta_expand.cc                 |    3 +-
 src/relay/transforms/first_order_gradient.cc       |    5 +-
 src/relay/transforms/higher_order_gradient.cc      |   19 +-
 src/relay/transforms/inline.cc                     |    5 +-
 src/relay/transforms/partial_eval.cc               |   24 +-
 src/relay/transforms/partition_graph.cc            |   19 +-
 src/relay/transforms/pass_utils.h                  |    2 +-
 src/relay/transforms/pattern_utils.h               |   30 +
 src/relay/transforms/simplify_fc_transpose.cc      |    4 +-
 src/relay/transforms/to_a_normal_form.cc           |    4 +-
 src/relay/transforms/to_cps.cc                     |   13 +-
 src/runtime/contrib/cudnn/conv_backward.cc         |  265 ++++
 src/runtime/contrib/cudnn/conv_forward.cc          |    4 +-
 src/runtime/contrib/cudnn/cudnn_utils.h            |    4 +-
 src/runtime/hexagon/proxy_rpc/device_api.cc        |  155 ---
 src/runtime/logging.cc                             |   18 +-
 src/runtime/pipeline/pipeline_executor.cc          |   75 +-
 src/runtime/pipeline/pipeline_executor.h           |   31 +
 src/runtime/pipeline/pipeline_scheduler.cc         |   12 +-
 src/runtime/pipeline/pipeline_scheduler.h          |    4 +-
 src/runtime/pipeline/pipeline_struct.h             |  105 ++
 src/support/array.h                                |   23 +
 src/target/source/codegen_cuda.cc                  |   33 +
 src/target/source/ptx_mma.cc                       | 1374 ++++++++++++++++++++
 .../source/ptx_mma.h}                              |   38 +-
 src/tir/ir/function.cc                             |   53 +
 src/tir/op/builtin.cc                              |    3 +
 src/tir/schedule/analysis.h                        |   14 +
 src/tir/schedule/analysis/analysis.cc              |   32 +-
 src/tir/schedule/concrete_schedule.cc              |   23 +
 src/tir/schedule/concrete_schedule.h               |    3 +
 src/tir/schedule/ir_comparator.cc                  |  363 ++++++
 src/tir/schedule/ir_comparator.h                   |  116 ++
 src/tir/schedule/primitive.h                       |   18 +
 src/tir/schedule/primitive/blockize_tensorize.cc   |  698 ++++++++++
 src/tir/schedule/schedule.cc                       |   14 +
 src/tir/schedule/state.cc                          |    4 +-
 src/tir/schedule/trace.cc                          |    6 +-
 src/tir/schedule/traced_schedule.cc                |   31 +
 src/tir/schedule/traced_schedule.h                 |    3 +
 src/tir/schedule/utils.h                           |   59 +-
 src/tir/transforms/lower_cross_thread_reduction.cc |    8 +-
 src/tir/transforms/make_unpacked_api.cc            |    5 +
 tests/micro/common/test_tvmc.py                    |    2 +-
 tests/python/contrib/test_cudnn.py                 |  138 ++
 tests/python/contrib/test_cutlass.py               |    8 +-
 tests/python/contrib/test_ethosu/test_codegen.py   |   35 +-
 tests/python/contrib/test_ethosu/test_legalize.py  |  100 ++
 .../test_ethosu/test_tir_to_cs_translator.py       |   32 +-
 tests/python/driver/tvmc/test_target.py            |   13 +
 tests/python/frontend/keras/test_forward.py        |   17 +
 tests/python/frontend/onnx/test_forward.py         |    1 +
 tests/python/frontend/pytorch/test_forward.py      |   54 +-
 tests/python/frontend/tensorflow/test_forward.py   |   61 +-
 .../frontend/tensorflow2/test_sequential_models.py |   24 +-
 tests/python/frontend/tflite/test_forward.py       |   54 -
 tests/python/relay/test_op_grad_level2.py          |   33 +-
 tests/python/relay/test_op_level2.py               |   17 +-
 tests/python/relay/test_op_qnn_requantize.py       |  750 ++++++-----
 tests/python/relay/test_op_qnn_rsqrt.py            |   93 ++
 tests/python/relay/test_pass_defuse_ops.py         |  151 ++-
 .../test_pass_fake_quantization_to_integer.py      |   13 +
 tests/python/relay/test_pipeline_executor.py       |   11 +-
 tests/python/unittest/test_ci.py                   |   61 +-
 .../unittest/test_meta_schedule_integration.py     |   58 +
 ...eta_schedule_mutator_mutate_compute_location.py |   85 ++
 .../test_meta_schedule_mutator_mutate_unroll.py    |  114 ++
 ...meta_schedule_postproc_disallow_dynamic_loop.py |  100 ++
 ...e_postproc_rewrite_parallel_vectorize_unroll.py |   87 ++
 ...ta_schedule_postproc_rewrite_reduction_block.py |  222 ++++
 ...meta_schedule_postproc_rewrite_unbound_block.py |  140 ++
 ...chedule_schedule_rule_cross_thread_reduction.py |  241 ++++
 ...ta_schedule_schedule_rule_multi_level_tiling.py |  280 ++++
 ...dule_schedule_rule_parallel_vectorize_unroll.py |  105 ++
 tests/python/unittest/test_tir_ptx_mma.py          | 1356 +++++++++++++++++++
 .../python/unittest/test_tir_schedule_blockize.py  |  210 +++
 .../python/unittest/test_tir_schedule_tensorize.py |  431 ++++++
 ...t_tir_transform_lower_cross_thread_reduction.py |  163 +++
 207 files changed, 13818 insertions(+), 1799 deletions(-)
 create mode 100644 apps/hexagon_api/CMakeLists.txt
 create mode 100644 apps/hexagon_api/README.md
 create mode 100644 gallery/how_to/work_with_microtvm/micro_tvmc.py
 copy docker/install/ubuntu_install_sccache.sh => python/tvm/meta_schedule/mutator/mutate_compute_location.py (58%)
 copy docker/install/ubuntu_install_rust.sh => python/tvm/meta_schedule/mutator/mutate_unroll.py (64%)
 mode change 100755 => 100644
 copy docker/install/ubuntu_install_sccache.sh => python/tvm/meta_schedule/postproc/disallow_dynamic_loop.py (58%)
 copy docker/install/ubuntu_install_sccache.sh => python/tvm/meta_schedule/postproc/rewrite_parallel_vectorize_unroll.py (54%)
 copy docker/install/ubuntu_install_sccache.sh => python/tvm/meta_schedule/postproc/rewrite_reduction_block.py (58%)
 copy docker/install/ubuntu_install_sccache.sh => python/tvm/meta_schedule/postproc/rewrite_unbound_block.py (60%)
 create mode 100644 python/tvm/meta_schedule/schedule_rule/cross_thread_reduction.py
 create mode 100644 python/tvm/meta_schedule/schedule_rule/multi_level_tiling.py
 create mode 100644 python/tvm/meta_schedule/schedule_rule/parallel_vectorize_unroll.py
 copy docker/install/ubuntu_install_dgl.sh => python/tvm/relay/qnn/op/_requantize.py (84%)
 mode change 100755 => 100644
 create mode 100644 src/meta_schedule/mutator/mutate_compute_location.cc
 create mode 100644 src/meta_schedule/mutator/mutate_unroll.cc
 create mode 100644 src/meta_schedule/postproc/disallow_dynamic_loop.cc
 create mode 100644 src/meta_schedule/postproc/rewrite_parallel_vectorize_unroll.cc
 create mode 100644 src/meta_schedule/postproc/rewrite_reduction_block.cc
 create mode 100644 src/meta_schedule/postproc/rewrite_unbound_block.cc
 create mode 100644 src/meta_schedule/schedule_rule/cross_thread_reduction.cc
 create mode 100644 src/meta_schedule/schedule_rule/multi_level_tiling.cc
 create mode 100644 src/meta_schedule/schedule_rule/parallel_vectorize_unroll.cc
 create mode 100644 src/relay/qnn/op/requantize_config.cc
 create mode 100644 src/relay/qnn/op/requantize_config.h
 create mode 100644 src/relay/qnn/op/rsqrt.cc
 create mode 100644 src/runtime/contrib/cudnn/conv_backward.cc
 delete mode 100644 src/runtime/hexagon/proxy_rpc/device_api.cc
 create mode 100644 src/target/source/ptx_mma.cc
 copy src/{runtime/pipeline/pipeline_scheduler.cc => target/source/ptx_mma.h} (50%)
 create mode 100644 src/tir/schedule/ir_comparator.cc
 create mode 100644 src/tir/schedule/ir_comparator.h
 create mode 100644 src/tir/schedule/primitive/blockize_tensorize.cc
 create mode 100644 tests/python/relay/test_op_qnn_rsqrt.py
 create mode 100644 tests/python/unittest/test_meta_schedule_mutator_mutate_compute_location.py
 create mode 100644 tests/python/unittest/test_meta_schedule_mutator_mutate_unroll.py
 create mode 100644 tests/python/unittest/test_meta_schedule_postproc_disallow_dynamic_loop.py
 create mode 100644 tests/python/unittest/test_meta_schedule_postproc_rewrite_parallel_vectorize_unroll.py
 create mode 100644 tests/python/unittest/test_meta_schedule_postproc_rewrite_reduction_block.py
 create mode 100644 tests/python/unittest/test_meta_schedule_postproc_rewrite_unbound_block.py
 create mode 100644 tests/python/unittest/test_meta_schedule_schedule_rule_cross_thread_reduction.py
 create mode 100644 tests/python/unittest/test_meta_schedule_schedule_rule_multi_level_tiling.py
 create mode 100644 tests/python/unittest/test_meta_schedule_schedule_rule_parallel_vectorize_unroll.py
 create mode 100644 tests/python/unittest/test_tir_ptx_mma.py
 create mode 100644 tests/python/unittest/test_tir_schedule_blockize.py
 create mode 100644 tests/python/unittest/test_tir_schedule_tensorize.py
