ysh329 created an issue (apache/tvm#18701)

# Introduction

The TVM community has worked since the last release to deliver the following exciting new improvements! The main tags are below (**bold text indicates areas with lots of progress**): **Relax** (especially the PyTorch frontend), FFI, etc.

Please visit the full listing of commits for a complete view: [v0.23.dev0...v0.23.0.rc0](https://github.com/apache/tvm/compare/v0.23.dev0...v0.23.0.rc0).

### Community

None.

### RFCs

None.

### Adreno

* [#18523](https://github.com/apache/tvm/pull/18523) - [TEXTURE] Texture based lowering

### Arith

* [#18542](https://github.com/apache/tvm/pull/18542) - Revert "Fix InternalError: Check failed: (eval_vec_) is false"
* [#18536](https://github.com/apache/tvm/pull/18536) - Fix InternalError: Check failed: (eval_vec_) is false

### BugFix

* [#18628](https://github.com/apache/tvm/pull/18628) - [Fix] Fix typo in file header comment
* [#18589](https://github.com/apache/tvm/pull/18589) - [OpenCL] Guard QCOM perf hint behind USE_OPENCL_EXTN_QCOM to avoid undefined symbol on non-QCOM runtimes
* [#18534](https://github.com/apache/tvm/pull/18534) - Prevent segfault when instantiating abstract SearchStrategy

### CI

* [#18549](https://github.com/apache/tvm/pull/18549) - Remove hardcoded user and repo values
* [#18484](https://github.com/apache/tvm/pull/18484) - Update file patterns for specific linting hooks
* [#18470](https://github.com/apache/tvm/pull/18470) - Enhance python linting scripts to support revision-based checks
* [#18498](https://github.com/apache/tvm/pull/18498) - Use glob for `conda/build-environment.yaml` in cache key
* [#18495](https://github.com/apache/tvm/pull/18495) - Update `actions/cache` to v4 in setup action
* [#18457](https://github.com/apache/tvm/pull/18457) - Fix crash when grep finds no matches
* [#18448](https://github.com/apache/tvm/pull/18448) - Update pre-commit configuration
* [#18432](https://github.com/apache/tvm/pull/18432) - Enable username checks in PR title and body
* [#18430](https://github.com/apache/tvm/pull/18430) - [TEST][CODEGEN] Fix test scripts that pass NumPy a dtype name it cannot recognize
* [#18419](https://github.com/apache/tvm/pull/18419) - [TEST] Refactor: remove the deprecated warning message check from test cases

### Docs

* [#18545](https://github.com/apache/tvm/pull/18545) - Improve static shape tuning parameter configuration (follow-up to commit c71aefc)
* [#18539](https://github.com/apache/tvm/pull/18539) - Fix e2e_opt_model tutorial for GPU deployment
* [#18451](https://github.com/apache/tvm/pull/18451) - Update the merge setting
* [#18436](https://github.com/apache/tvm/pull/18436) - Remove prebuilt package references and disable Colab button in tutorials
* [#18413](https://github.com/apache/tvm/pull/18413) - Update cross-compilation and RPC tutorial with modern PyTorch deployment workflow
* [#18412](https://github.com/apache/tvm/pull/18412) - Update tutorial for exporting and loading back Relax executables
* [#18404](https://github.com/apache/tvm/pull/18404) - Add tutorial for exporting and loading back Relax executables

### Frontend

* [#18435](https://github.com/apache/tvm/pull/18435) - [ONNX] Fix operator Transpose: TVMError: PermuteDims expects the number of input axes to equal the ndim of the input tensor

### LLVM

* [#18586](https://github.com/apache/tvm/pull/18586) - [Codegen] Avoid segfault when `arith::GetVScaleValues` returns an empty vector

### MetaSchedule

* [#18547](https://github.com/apache/tvm/pull/18547) - Fix tune_tir crash with ScheduleError in RewriteParallelVectorizeUnroll

### Relax

* [#18676](https://github.com/apache/tvm/pull/18676) - Implement dynamic output trimming for NMS
* [#18664](https://github.com/apache/tvm/pull/18664) - Add FDataDependent operator attribute for LegalizeOps
* [#18668](https://github.com/apache/tvm/pull/18668) - [Onnx] Support Local Response Normalization (LRN)
* [#18667](https://github.com/apache/tvm/pull/18667) - Add native size operator
* [#18675](https://github.com/apache/tvm/pull/18675) - [LAYOUT] Support for dynamic layout specification
* [#18652](https://github.com/apache/tvm/pull/18652) - [ONNX] Add support for unique optional outputs
* [#18665](https://github.com/apache/tvm/pull/18665) - Replace topi.take with relax.op.take
* [#18663](https://github.com/apache/tvm/pull/18663) - Fix wrong memory planning when only a lower bound was provided
* [#18666](https://github.com/apache/tvm/pull/18666) - [Onnx][Resize] Handle non-4D input tensors
* [#18658](https://github.com/apache/tvm/pull/18658) - [Onnx][PReLU] Handle slope and axis argument with different slope shapes
* [#18649](https://github.com/apache/tvm/pull/18649) - Remove obsolete TODO comments
* [#18642](https://github.com/apache/tvm/pull/18642) - Add FRelaxInferLayout for gather_elements operator
* [#18643](https://github.com/apache/tvm/pull/18643) - Add FRelaxInferLayout for scatter_nd operator
* [#18641](https://github.com/apache/tvm/pull/18641) - [Op] Fix incorrect output shape of Pool op when ceil_mode = true
* [#18638](https://github.com/apache/tvm/pull/18638) - Add FRelaxInferLayout for scatter_elements operator
* [#18637](https://github.com/apache/tvm/pull/18637) - Add FRelaxInferLayout for flip operator
* [#18633](https://github.com/apache/tvm/pull/18633) - Add FRelaxInferLayout and TMixedPrecisionPolicy for dynamic_strided_slice
* [#18635](https://github.com/apache/tvm/pull/18635) - [Onnx] Pass output_padding param in ConvTranspose
* [#18632](https://github.com/apache/tvm/pull/18632) - Move GetUsedVars to analysis module
* [#18629](https://github.com/apache/tvm/pull/18629) - Add FInferMixedPrecision and FRelaxInferLayout for conv transpose ops
* [#18626](https://github.com/apache/tvm/pull/18626) - [Op][PyTorch] Support Median operator
* [#18576](https://github.com/apache/tvm/pull/18576) - Correct YaRN RoPE frequency scaling formula to align with the original paper
* [#18615](https://github.com/apache/tvm/pull/18615) - Add gpu-generic fallback for unrecognized GPU targets
* [#18621](https://github.com/apache/tvm/pull/18621) - Use weight shape instead of dim in Embedding.forward
* [#18613](https://github.com/apache/tvm/pull/18613) - Remove duplicated test case: test_if_branch_var_scope
* [#18616](https://github.com/apache/tvm/pull/18616) - Replace call_pure_packed with tensor_to_shape operator
* [#18593](https://github.com/apache/tvm/pull/18593) - feat: Implement FRelaxInferLayout for tile operator
* [#18618](https://github.com/apache/tvm/pull/18618) - Add test case for op attributes in AST printer
* [#18619](https://github.com/apache/tvm/pull/18619) - [PyTorch] Fix PyTorch Dynamo frontend for Darwin compatibility
* [#18575](https://github.com/apache/tvm/pull/18575) - [ONNX] Add edge padding mode
* [#18620](https://github.com/apache/tvm/pull/18620) - Fix flaky test_conv2d gradient numeric test
* [#18609](https://github.com/apache/tvm/pull/18609) - Fix batch normalization computation logic
* [#18574](https://github.com/apache/tvm/pull/18574) - [Torch] AssertionError: Unsupported function types ['mean.default']
* [#18591](https://github.com/apache/tvm/pull/18591) - Chore: Fix the DeprecationWarning: invalid escape sequence \
* [#18577](https://github.com/apache/tvm/pull/18577) - Clean up scatter_elements unknown dtype handling
* [#18579](https://github.com/apache/tvm/pull/18579) - Add layout inference support for repeat operator
* [#18583](https://github.com/apache/tvm/pull/18583) - [Torch] Fix issues with the sum op when dim and keepdim are not provided
* [#18554](https://github.com/apache/tvm/pull/18554) - Enhance unique block name generation with numeric suffixes
* [#18558](https://github.com/apache/tvm/pull/18558) - Add edge padding mode
* [#18559](https://github.com/apache/tvm/pull/18559) - Add mod operator support
* [#18544](https://github.com/apache/tvm/pull/18544) - [PyTorch] Add support for Custom Ops for ExportedProgram frontend
* [#18535](https://github.com/apache/tvm/pull/18535) - [PyTorch] Add support for masked_select
* [#18551](https://github.com/apache/tvm/pull/18551) - [Frontend] Introduce ModuleDict
* [#18550](https://github.com/apache/tvm/pull/18550) - [PyTorch] Enhance scale_factor handling in interpolation
* [#18553](https://github.com/apache/tvm/pull/18553) - [PyTorch] Unify dtype used in conv2d tests
* [#18548](https://github.com/apache/tvm/pull/18548) - [PyTorch] Add NHWC layout support
* [#18533](https://github.com/apache/tvm/pull/18533) - [PyTorch] Fix index_put with broadcast indices
* [#18521](https://github.com/apache/tvm/pull/18521) - [PyTorch] Handle unknown output shapes for _sym_size_int
* [#18532](https://github.com/apache/tvm/pull/18532) - [PyTorch] Add support for bidirectional GRU
* [#18530](https://github.com/apache/tvm/pull/18530) - [PyTorch] Add boolean tensor support for max operation and corresponding test case
* [#18524](https://github.com/apache/tvm/pull/18524) - [PyTorch] Fix InternalError when converting scaled_dot_product_attention with 2D inputs
* [#18527](https://github.com/apache/tvm/pull/18527) - [PyTorch] Add support for non-persistent buffers in ExportedProgram frontend
* [#18529](https://github.com/apache/tvm/pull/18529) - [PyTorch] Add support for binary scalar operations in ExportedProgram frontend and corresponding tests
* [#18522](https://github.com/apache/tvm/pull/18522) - [PyTorch] Unify tests using shared tvm.testing.assert_allclose
* [#18516](https://github.com/apache/tvm/pull/18516) - [PyTorch] Add support for bidirectional LSTM
* [#18499](https://github.com/apache/tvm/pull/18499) - [PyTorch] Add support for sparse matrix multiplication
* [#18518](https://github.com/apache/tvm/pull/18518) - [PyTorch] Fix batch normalization training mode correctness
* [#18517](https://github.com/apache/tvm/pull/18517) - [PyTorch] Unify tests using shared verify_model
* [#18506](https://github.com/apache/tvm/pull/18506) - [PyTorch] Enhance data type handling in FX graph translator
* [#18507](https://github.com/apache/tvm/pull/18507) - [PyTorch] Support specifying decimals for _round
* [#18500](https://github.com/apache/tvm/pull/18500) - [PyTorch] Add support for antialiased bilinear upsampling
* [#18489](https://github.com/apache/tvm/pull/18489) - [PyTorch] Enhance handling of unbounded upper bound constraints
* [#17599](https://github.com/apache/tvm/pull/17599) - [PASS] Annotate Custom Scope layout pass for Adreno GPU
* [#18497](https://github.com/apache/tvm/pull/18497) - [PyTorch] Add binary operation dtype promotion following PyTorch rules in ExportedProgram frontend
* [#18478](https://github.com/apache/tvm/pull/18478) - Fix the squeeze operator to behave consistently with torch
* [#18496](https://github.com/apache/tvm/pull/18496) - [PyTorch] Add `mul` operator in ExportedProgram frontend
* [#18494](https://github.com/apache/tvm/pull/18494) - [PyTorch] Add negative slicing support in `slice_scatter` operation
* [#18493](https://github.com/apache/tvm/pull/18493) - [PyTorch] Add broadcast support for `copy` operation
* [#18490](https://github.com/apache/tvm/pull/18490) - [PyTorch] Add `as_strided` operator in ExportedProgram frontend
* [#18487](https://github.com/apache/tvm/pull/18487) - [PyTorch] Add `count_include_pad` support to `avg_pool2d` in PyTorch frontend
* [#18488](https://github.com/apache/tvm/pull/18488) - [PyTorch] Enhance index_put support for multi-dimensional indices
* [#18486](https://github.com/apache/tvm/pull/18486) - [PyTorch] Fix `batch_norm.default` args handling in ExportedProgram frontend
* [#18483](https://github.com/apache/tvm/pull/18483) - [PyTorch] Add support for grid_sample operator
* [#18482](https://github.com/apache/tvm/pull/18482) - [PyTorch] Add support for gumbel_softmax
* [#18485](https://github.com/apache/tvm/pull/18485) - [PyTorch] Add dynamic shape support to `torch.ops.aten.sym_size.int` in ExportedProgram frontend
* [#18473](https://github.com/apache/tvm/pull/18473) - [PyTorch] Add support for `torch.ops.aten.sym_size.int` in ExportedProgram frontend
* [#18471](https://github.com/apache/tvm/pull/18471) - [PyTorch] Enable run_ep_decomposition by default
* [#18462](https://github.com/apache/tvm/pull/18462) - [PyTorch] Add decomposed operator support for interpolate
* [#18455](https://github.com/apache/tvm/pull/18455) - Fix flaky test_conv2d_offload by increasing float32 tolerance
* [#18463](https://github.com/apache/tvm/pull/18463) - [PyTorch] Support advanced range constraints (multiplication)
* [#18464](https://github.com/apache/tvm/pull/18464) - [PyTorch] Enable decomposition in all tests
* [#18461](https://github.com/apache/tvm/pull/18461) - [PyTorch] Fix KeyError: dtype when converting PyTorch model with gradient checkpointing using torch.export
* [#18452](https://github.com/apache/tvm/pull/18452) - [PyTorch] Support advanced range constraints (addition)
* [#18454](https://github.com/apache/tvm/pull/18454) - [PyTorch] Fix the sqrt operation requiring float dtype but receiving int64 in attention scaling
* [#18459](https://github.com/apache/tvm/pull/18459) - [PyTorch] Fix MultiheadAttention compile
* [#18460](https://github.com/apache/tvm/pull/18460) - [PyTorch] Add decomposed operator support for normalization
* [#18458](https://github.com/apache/tvm/pull/18458) - [PyTorch] Add decomposed operator support for Binary
* [#18449](https://github.com/apache/tvm/pull/18449) - [PyTorch] Add decomposed operator support for Pad
* [#18447](https://github.com/apache/tvm/pull/18447) - [PyTorch] Add lower bound support for range constraints
* [#18446](https://github.com/apache/tvm/pull/18446) - [PyTorch] Add decomposed operator support for MaxPool
* [#18437](https://github.com/apache/tvm/pull/18437) - [PyTorch] Add decomposed operator support for AdaptiveAvgPool
* [#18433](https://github.com/apache/tvm/pull/18433) - [PyTorch] Add decomposed operator support for Conv
* [#18429](https://github.com/apache/tvm/pull/18429) - [PyTorch] Support basic range constraints
* [#18428](https://github.com/apache/tvm/pull/18428) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (8)
* [#18427](https://github.com/apache/tvm/pull/18427) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (7)
* [#18420](https://github.com/apache/tvm/pull/18420) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (6)
* [#18417](https://github.com/apache/tvm/pull/18417) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (5)
* [#18416](https://github.com/apache/tvm/pull/18416) - [ONNX] Fix bug: Unsupported numpy or ml_dtypes dtype('O') when importing ONNX model using Relax frontend
* [#18414](https://github.com/apache/tvm/pull/18414) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (4)
* [#18410](https://github.com/apache/tvm/pull/18410) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (3)
* [#18403](https://github.com/apache/tvm/pull/18403) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (2)
* [#18402](https://github.com/apache/tvm/pull/18402) - [PyTorch] Add support for decomposed operators and fix IR of ops tests (1)
* [#18401](https://github.com/apache/tvm/pull/18401) - [PyTorch] Enable decomposition for unary ops and refactor tests
* [#18400](https://github.com/apache/tvm/pull/18400) - [PyTorch] Add support for decomposed operators in extended unary ops tests
* [#18399](https://github.com/apache/tvm/pull/18399) - [PyTorch] Add run_ep_decomposition flag to control PyTorch decomposition

### Runtime

* [#18546](https://github.com/apache/tvm/pull/18546) - [MatchShape] Type error: Cannot convert from type 'DLTensor*' to 'ffi.Shape'

### TIR

* [#18639](https://github.com/apache/tvm/pull/18639) - [Schedule] Fix type checker to support subscripted generics in Python 3.14+
* [#18515](https://github.com/apache/tvm/pull/18515) - [Schedule] FuseReductionEpilogue: Add Clipping pattern support
* [#18556](https://github.com/apache/tvm/pull/18556) - [Schedule] Fix bug on bfloat16 conversion
* [#18528](https://github.com/apache/tvm/pull/18528) - [Schedule] Fix mma tensorize error
* [#18514](https://github.com/apache/tvm/pull/18514) - Fix tir.LowerIntrin check failed: additional_info.size() == new_size
* [#18505](https://github.com/apache/tvm/pull/18505) - Update function signatures for decompose_reduction
* [#18479](https://github.com/apache/tvm/pull/18479) - Fix VerifyStream::Verify dereferencing an invalid pointer
* [#18421](https://github.com/apache/tvm/pull/18421) - Add step attribute to ForNode (initial code)
* [#18418](https://github.com/apache/tvm/pull/18418) - [Schedule] Add FuseReductionEpilogue primitive to fuse epilogue …
* [#18466](https://github.com/apache/tvm/pull/18466) - Fix data type mismatch (int64 vs int32) in T.match_buffer when working with scalar buffers in TIR

### TVMScript

* [#18504](https://github.com/apache/tvm/pull/18504) - Add test for TIR macro block name suffix handling
* [#18465](https://github.com/apache/tvm/pull/18465) - Add block name suffix management for TIR macros

### cuda & cutlass & tensorrt

* [#18624](https://github.com/apache/tvm/pull/18624) - [CUDA] Fix cuModuleUnload crash during interpreter shutdown
* [#18604](https://github.com/apache/tvm/pull/18604) - [CUDA][FFI] Extend kernel launch config to support Programmatic Dependent Launch and cuLaunchCooperativeKernel

### web

* [#18683](https://github.com/apache/tvm/pull/18683) - Fix RPC argument parsing for new FFI string/bytes types
* [#18686](https://github.com/apache/tvm/pull/18686) - Fix incorrect FFI export name in runtime.ts
* [#18480](https://github.com/apache/tvm/pull/18480) - Bump web runtime version to 0.23.0-dev1
* [#18467](https://github.com/apache/tvm/pull/18467) - Replace string with TVMFFIByteArray* to avoid memory issues
* [#18450](https://github.com/apache/tvm/pull/18450) - Fix progress reporting when loading from cache
* [#18415](https://github.com/apache/tvm/pull/18415) - Fix arrayDecodeStorage scope issue for q0f32 models
* [#18385](https://github.com/apache/tvm/pull/18385) - Upgrade web runtime to new FFI

### Misc

* [#18681](https://github.com/apache/tvm/pull/18681) - [NVRTC] Add NVSHMEM support to NVRTC compilation path
* [#18674](https://github.com/apache/tvm/pull/18674) - fix: MSVC pragma
* [#18654](https://github.com/apache/tvm/pull/18654) - [FFI] Bump to latest version
* [#18656](https://github.com/apache/tvm/pull/18656) - Put options before objects when compiling
* [#18519](https://github.com/apache/tvm/pull/18519) - [Compile] Accelerate compilation speed using NVRTC
* [#18582](https://github.com/apache/tvm/pull/18582) - Fix ACOS precision issue for boundary values (x=±1.0)
* [#18557](https://github.com/apache/tvm/pull/18557) - [Attn] Fix calling FlashInfer attention plan function
* [#18555](https://github.com/apache/tvm/pull/18555) - Fix duplicate `PresburgerSetNode` registration when `USE_MLIR=ON` and MLIR >= 15.0
* [#18525](https://github.com/apache/tvm/pull/18525) - [Schedule] Fix LocalBuilder Check failed: (index_map_func.has_value()) is false
* [#18511](https://github.com/apache/tvm/pull/18511) - [Pass] Add DumpIR pass instrument to save IR snapshots
* [#18512](https://github.com/apache/tvm/pull/18512) - Remove unused TVMC configs
* [#18509](https://github.com/apache/tvm/pull/18509) - Fix compilation warnings
* [#18492](https://github.com/apache/tvm/pull/18492) - Fix BufferError when converting PyTorch models with sparse tensors
* [#18469](https://github.com/apache/tvm/pull/18469) - [Contrib] Update RandomFill to use StreamSync for CUDA synchronization
* [#18453](https://github.com/apache/tvm/pull/18453) - [DataType] Update to use explicit Bool type, aligning with DLPack
* [#18422](https://github.com/apache/tvm/pull/18422) - Adjust Longrope embedding function to match the Hugging Face implementation
* [#18426](https://github.com/apache/tvm/pull/18426) - Support integer type input for log and log2
* [#18411](https://github.com/apache/tvm/pull/18411) - [FFI] Bump tvm-ffi to latest
* [#18409](https://github.com/apache/tvm/pull/18409) - Fix database bug
* [#18390](https://github.com/apache/tvm/pull/18390) - Support integer types in TIR expression operators
* [#18398](https://github.com/apache/tvm/pull/18398) - Fix the 8-bit vector loads/stores problem, which resolves the issue raised in the CUDA codegen test
* [#18389](https://github.com/apache/tvm/pull/18389) - Add VisitStmt_ method for AssertStmtNode and StringImmNode
* [#18361](https://github.com/apache/tvm/pull/18361) - [WebLLM] Replace int64s with int32s in WebGPU kernels
* [#18384](https://github.com/apache/tvm/pull/18384) - Fix crash when multiple PrimFunc objects are present in IRModule
* [#18378](https://github.com/apache/tvm/pull/18378) - [release][Dont Squash] Update version to 0.22.0 and 0.23.0.dev on main branch
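As background for the Pool output-shape fix in #18641 ("incorrect output shape of Pool op when ceil_mode = true"): the issue hinges on how the standard pooling output-size formula rounds. A minimal sketch (plain Python, not TVM code) of the conventional formula, using the clamping rule common frameworks apply so the last window still starts inside the padded input:

```python
import math

def pool_output_size(in_size, kernel, stride, pad, ceil_mode=False):
    """Conventional pooling output-size formula (illustrative sketch only)."""
    numer = in_size + 2 * pad - kernel
    if ceil_mode:
        out = math.ceil(numer / stride) + 1
        # Drop the last window if it would start beyond the input + left pad
        # (the convention documented by common frameworks for ceil_mode).
        if (out - 1) * stride >= in_size + pad:
            out -= 1
    else:
        out = math.floor(numer / stride) + 1
    return out

# A width-7 input with kernel 2, stride 2, no padding:
print(pool_output_size(7, 2, 2, 0, ceil_mode=False))  # 3
print(pool_output_size(7, 2, 2, 0, ceil_mode=True))   # 4
```

With `ceil_mode=True` a partially covered trailing window is counted, so the output gains one element here; an implementation that always floors would report 3 in both cases, which is the class of mismatch the fix addresses.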
