# Introduction

Since the last release, the TVM community has worked to deliver the following exciting improvements!

The main tags are below (**bold text marks areas with substantial progress**): Relax (especially the PyTorch frontend), TIR, etc.

For a complete view, please see the full listing of commits:
[v0.23.dev0...v0.23.0.rc0](https://github.com/apache/tvm/compare/v0.23.dev0...v0.23.0.rc0).

### Community

None.

### RFCs

None.

### Adreno
 * [#18523](https://github.com/apache/tvm/pull/18523) - [TEXTURE] Texture-based lowering

### Arith
 * [#18542](https://github.com/apache/tvm/pull/18542) - Revert "Fix 
InternalError: Check failed: (eval_vec_) is false"
 * [#18536](https://github.com/apache/tvm/pull/18536) - Fix InternalError: 
Check failed: (eval_vec_) is false

### BugFix
 * [#18628](https://github.com/apache/tvm/pull/18628) - [Fix] Fix typo in file 
header comment
 * [#18589](https://github.com/apache/tvm/pull/18589) - [OpenCL] Guard QCOM 
perf hint behind USE_OPENCL_EXTN_QCOM to avoid undefined symbol on non-QCOM 
runtimes
 * [#18534](https://github.com/apache/tvm/pull/18534) - Prevent segfault when 
instantiating abstract SearchStrategy

### CI
 * [#18549](https://github.com/apache/tvm/pull/18549) - Remove hardcoded user 
and repo values
 * [#18484](https://github.com/apache/tvm/pull/18484) - Update file patterns 
for specific linting hooks
 * [#18470](https://github.com/apache/tvm/pull/18470) - Enhance python linting 
scripts to support revision-based checks
 * [#18498](https://github.com/apache/tvm/pull/18498) - Use glob for 
`conda/build-environment.yaml` in cache key
 * [#18495](https://github.com/apache/tvm/pull/18495) - Update `actions/cache` 
to v4 in setup action
 * [#18457](https://github.com/apache/tvm/pull/18457) - Fix crash when grep 
finds no matches
 * [#18448](https://github.com/apache/tvm/pull/18448) - Update pre-commit 
configuration
 * [#18432](https://github.com/apache/tvm/pull/18432) - Enable username checks 
in PR title and body
 * [#18430](https://github.com/apache/tvm/pull/18430) - [TEST][CODEGEN] Fix test scripts that pass NumPy a dtype name it cannot recognise
 * [#18419](https://github.com/apache/tvm/pull/18419) - [TEST] Refactor: remove the deprecation-warning message check from test cases

### Docs
 * [#18545](https://github.com/apache/tvm/pull/18545) - Improve static shape 
tuning parameter configuration (follow-up to commit c71aefc)
 * [#18539](https://github.com/apache/tvm/pull/18539) - Fix e2e_opt_model 
tutorial for GPU deployment
 * [#18451](https://github.com/apache/tvm/pull/18451) - Update the merge setting
 * [#18436](https://github.com/apache/tvm/pull/18436) - Remove prebuilt package 
references and disable Colab button at tutorials
 * [#18413](https://github.com/apache/tvm/pull/18413) - Update 
cross-compilation and RPC tutorial with modern PyTorch deployment workflow
 * [#18412](https://github.com/apache/tvm/pull/18412) - Update tutorial for 
exporting and loading back Relax executables
 * [#18404](https://github.com/apache/tvm/pull/18404) - Add tutorial for 
exporting and loading back Relax executables

### Frontend
 * [#18435](https://github.com/apache/tvm/pull/18435) - [ONNX] Fix operator 
Transpose: TVMError: PermuteDims expects the number of input axes to equal the 
ndim of the input tensor

### LLVM
 * [#18586](https://github.com/apache/tvm/pull/18586) - [Codegen] Avoid 
segfault when `arith::GetVScaleValues` returns empty vector

### MetaSchedule
 * [#18547](https://github.com/apache/tvm/pull/18547) - Fix tune_tir crash with 
ScheduleError in RewriteParallelVectorizeUnroll

### Relax
 * [#18676](https://github.com/apache/tvm/pull/18676) - Implement dynamic 
output trimming for NMS
 * [#18664](https://github.com/apache/tvm/pull/18664) - Add FDataDependent 
operator attribute for LegalizeOps
 * [#18668](https://github.com/apache/tvm/pull/18668) - [Onnx] Support Local 
Response Normalization (LRN)
 * [#18667](https://github.com/apache/tvm/pull/18667) - Add native size operator
 * [#18675](https://github.com/apache/tvm/pull/18675) - [LAYOUT] Support for 
dynamic layout specification
 * [#18652](https://github.com/apache/tvm/pull/18652) - [ONNX] Add support for unique optional outputs
 * [#18665](https://github.com/apache/tvm/pull/18665) - Replace topi.take with 
relax.op.take
 * [#18663](https://github.com/apache/tvm/pull/18663) - Fix wrong memory 
planning when only lower bound was provided
 * [#18666](https://github.com/apache/tvm/pull/18666) - [Onnx][Resize] Handle 
non-4D input tensors
 * [#18658](https://github.com/apache/tvm/pull/18658) - [Onnx][PReLU] Handle 
slope and axis argument with different slope shapes
 * [#18649](https://github.com/apache/tvm/pull/18649) - Remove obsolete TODO 
comments
 * [#18642](https://github.com/apache/tvm/pull/18642) - Add FRelaxInferLayout 
for gather_elements operator
 * [#18643](https://github.com/apache/tvm/pull/18643) - Add FRelaxInferLayout 
for scatter_nd operator
 * [#18641](https://github.com/apache/tvm/pull/18641) - [Op] Fix incorrect output shape of the Pool op when ceil_mode = true
 * [#18638](https://github.com/apache/tvm/pull/18638) - Add FRelaxInferLayout 
for scatter_elements operator
 * [#18637](https://github.com/apache/tvm/pull/18637) - Add FRelaxInferLayout 
for flip operator
 * [#18633](https://github.com/apache/tvm/pull/18633) - Add FRelaxInferLayout 
and TMixedPrecisionPolicy for dynamic_strided_slice
 * [#18635](https://github.com/apache/tvm/pull/18635) - [Onnx] Pass 
output_padding param in ConvTranspose
 * [#18632](https://github.com/apache/tvm/pull/18632) - Move GetUsedVars to 
analysis module
 * [#18629](https://github.com/apache/tvm/pull/18629) - Add 
FInferMixedPrecision and FRelaxInferLayout for conv transpose ops
 * [#18626](https://github.com/apache/tvm/pull/18626) - [Op][PyTorch] Support the median operator
 * [#18576](https://github.com/apache/tvm/pull/18576) - Correct YaRN RoPE 
frequency scaling formula to align with the original paper
 * [#18615](https://github.com/apache/tvm/pull/18615) - Add gpu-generic 
fallback for unrecognized GPU targets
 * [#18621](https://github.com/apache/tvm/pull/18621) - Use weight shape 
instead of dim in Embedding.forward
 * [#18613](https://github.com/apache/tvm/pull/18613) - Remove duplicated test 
case: test_if_branch_var_scope
 * [#18616](https://github.com/apache/tvm/pull/18616) - Replace call_pure_packed with the tensor_to_shape operator
 * [#18593](https://github.com/apache/tvm/pull/18593) - Implement FRelaxInferLayout for the tile operator
 * [#18618](https://github.com/apache/tvm/pull/18618) - Add test case for op 
attributes in AST printer
 * [#18619](https://github.com/apache/tvm/pull/18619) - [PyTorch] Fix PyTorch 
Dynamo frontend for Darwin compatibility
 * [#18575](https://github.com/apache/tvm/pull/18575) - [ONNX] Add edge padding 
mode
 * [#18620](https://github.com/apache/tvm/pull/18620) - Fix flaky test_conv2d 
gradient numeric test
 * [#18609](https://github.com/apache/tvm/pull/18609) - Fix batch normalization 
computation logic
 * [#18574](https://github.com/apache/tvm/pull/18574) - [Torch] Fix AssertionError: Unsupported function types ['mean.default']
 * [#18591](https://github.com/apache/tvm/pull/18591) - Fix DeprecationWarning: invalid escape sequence `\`
 * [#18577](https://github.com/apache/tvm/pull/18577) - Clean up 
scatter_elements unknown dtype handling
 * [#18579](https://github.com/apache/tvm/pull/18579) - Add layout inference 
support for repeat operator
 * [#18583](https://github.com/apache/tvm/pull/18583) - [Torch] Fix sum op issues when dim and keepdim are not specified
 * [#18554](https://github.com/apache/tvm/pull/18554) - Enhance unique block 
name generation with numeric suffixes
 * [#18558](https://github.com/apache/tvm/pull/18558) - Add edge padding mode
 * [#18559](https://github.com/apache/tvm/pull/18559) - Add mod operator support
 * [#18544](https://github.com/apache/tvm/pull/18544) - [PyTorch] Add support 
for Custom Ops for ExportedProgram frontend
 * [#18535](https://github.com/apache/tvm/pull/18535) - [PyTorch] Add support 
for masked_select
 * [#18551](https://github.com/apache/tvm/pull/18551) - [Frontend] Introduce 
ModuleDict
 * [#18550](https://github.com/apache/tvm/pull/18550) - [PyTorch] Enhance 
scale_factor handling in interpolation
 * [#18553](https://github.com/apache/tvm/pull/18553) - [PyTorch] Unify dtype 
used in conv2d tests
 * [#18548](https://github.com/apache/tvm/pull/18548) - [PyTorch] Add NHWC layout support
 * [#18533](https://github.com/apache/tvm/pull/18533) - [PyTorch] Fix index_put 
with broadcast indices
 * [#18521](https://github.com/apache/tvm/pull/18521) - [PyTorch] Handle 
unknown output shapes for _sym_size_int
 * [#18532](https://github.com/apache/tvm/pull/18532) - [PyTorch] Add support 
for bidirectional GRU
 * [#18530](https://github.com/apache/tvm/pull/18530) - [PyTorch] Add boolean 
tensor support for max operation and corresponding test case
 * [#18524](https://github.com/apache/tvm/pull/18524) - [PyTorch] Fix 
InternalError when converting scaled_dot_product_attention with 2D inputs
 * [#18527](https://github.com/apache/tvm/pull/18527) - [PyTorch] Add support 
for non-persistent buffers in ExportedProgram frontend
 * [#18529](https://github.com/apache/tvm/pull/18529) - [PyTorch] Add support 
for binary scalar operations in ExportedProgram frontend and corresponding tests
 * [#18522](https://github.com/apache/tvm/pull/18522) - [PyTorch] Unify tests 
using shared tvm.testing.assert_allclose
 * [#18516](https://github.com/apache/tvm/pull/18516) - [PyTorch] Add support 
for bidirectional LSTM
 * [#18499](https://github.com/apache/tvm/pull/18499) - [PyTorch] Add support 
for sparse matrix multiplication
 * [#18518](https://github.com/apache/tvm/pull/18518) - [PyTorch] Fix batch 
normalization training mode correctness
 * [#18517](https://github.com/apache/tvm/pull/18517) - [PyTorch] Unify tests 
using shared verify_model
 * [#18506](https://github.com/apache/tvm/pull/18506) - [PyTorch] Enhance data 
type handling in FX graph translator
 * [#18507](https://github.com/apache/tvm/pull/18507) - [PyTorch] Support 
specifying decimals for _round
 * [#18500](https://github.com/apache/tvm/pull/18500) - [PyTorch] Add support 
for antialiased bilinear upsampling
 * [#18489](https://github.com/apache/tvm/pull/18489) - [PyTorch] Enhance 
handling of unbounded upper bound constraints
 * [#17599](https://github.com/apache/tvm/pull/17599) - [PASS] Annotate Custom 
Scope layout pass for Adreno GPU
 * [#18497](https://github.com/apache/tvm/pull/18497) - [PyTorch] Add binary 
operation dtype promotion following PyTorch rules in ExportedProgram frontend
 * [#18478](https://github.com/apache/tvm/pull/18478) - Fix the squeeze 
operator to behave consistently with torch
 * [#18496](https://github.com/apache/tvm/pull/18496) - [PyTorch] Add `mul` 
operator in ExportedProgram frontend
 * [#18494](https://github.com/apache/tvm/pull/18494) - [PyTorch] Add negative 
slicing support in `slice_scatter` operation
 * [#18493](https://github.com/apache/tvm/pull/18493) - [PyTorch] Add broadcast 
support for `copy` operation
 * [#18490](https://github.com/apache/tvm/pull/18490) - [PyTorch] Add 
`as_strided` operator in ExportedProgram frontend
 * [#18487](https://github.com/apache/tvm/pull/18487) - [PyTorch] Add 
`count_include_pad` support to `avg_pool2d` in PyTorch frontend
 * [#18488](https://github.com/apache/tvm/pull/18488) - [PyTorch] Enhance 
index_put support for multi-dimensional indices
 * [#18486](https://github.com/apache/tvm/pull/18486) - [PyTorch] Fix 
`batch_norm.default` args handling in ExportedProgram frontend
 * [#18483](https://github.com/apache/tvm/pull/18483) - [PyTorch] Add support 
for grid_sample operator
 * [#18482](https://github.com/apache/tvm/pull/18482) - [PyTorch] Add support 
for gumbel_softmax
 * [#18485](https://github.com/apache/tvm/pull/18485) - [PyTorch] Add dynamic 
shape support to `torch.ops.aten.sym_size.int` in ExportedProgram frontend
 * [#18473](https://github.com/apache/tvm/pull/18473) - [PyTorch] Add support 
for `torch.ops.aten.sym_size.int` in ExportedProgram frontend
 * [#18471](https://github.com/apache/tvm/pull/18471) - [PyTorch] Enable 
run_ep_decomposition by default
 * [#18462](https://github.com/apache/tvm/pull/18462) - [PyTorch] Add 
decomposed operator support for interpolate
 * [#18455](https://github.com/apache/tvm/pull/18455) - Fix flaky 
test_conv2d_offload by increasing float32 tolerance
 * [#18463](https://github.com/apache/tvm/pull/18463) - [PyTorch] Support 
advanced range constraints (multiplication)
 * [#18464](https://github.com/apache/tvm/pull/18464) - [PyTorch] Enable 
decomposition in all tests
 * [#18461](https://github.com/apache/tvm/pull/18461) - [PyTorch] Fix KeyError: 
dtype when converting PyTorch model with gradient checkpointing using 
torch.export
 * [#18452](https://github.com/apache/tvm/pull/18452) - [PyTorch] Support 
advanced range constraints (addition)
 * [#18454](https://github.com/apache/tvm/pull/18454) - [PyTorch] Fix the sqrt operation receiving int64 where a float dtype is required in attention scaling
 * [#18459](https://github.com/apache/tvm/pull/18459) - [PyTorch] Fix MultiheadAttention compilation
 * [#18460](https://github.com/apache/tvm/pull/18460) - [PyTorch] Add 
decomposed operator support for normalization
 * [#18458](https://github.com/apache/tvm/pull/18458) - [PyTorch] Add 
decomposed operator support for Binary
 * [#18449](https://github.com/apache/tvm/pull/18449) - [PyTorch] Add 
decomposed operator support for Pad
 * [#18447](https://github.com/apache/tvm/pull/18447) - [PyTorch] Add lower 
bound support for range constraints
 * [#18446](https://github.com/apache/tvm/pull/18446) - [PyTorch] Add 
decomposed operator support for MaxPool
 * [#18437](https://github.com/apache/tvm/pull/18437) - [PyTorch] Add 
decomposed operator support for AdaptiveAvgPool
 * [#18433](https://github.com/apache/tvm/pull/18433) - [PyTorch] Add 
decomposed operator support for Conv
 * [#18429](https://github.com/apache/tvm/pull/18429) - [PyTorch] Support basic 
range constraints
 * [#18428](https://github.com/apache/tvm/pull/18428) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(8)
 * [#18427](https://github.com/apache/tvm/pull/18427) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(7)
 * [#18420](https://github.com/apache/tvm/pull/18420) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(6)
 * [#18417](https://github.com/apache/tvm/pull/18417) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(5)
 * [#18416](https://github.com/apache/tvm/pull/18416) - [ONNX] Fix bug: 
Unsupported numpy or ml_dtypes dtype('O') when importing ONNX model using Relax 
frontend
 * [#18414](https://github.com/apache/tvm/pull/18414) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(4)
 * [#18410](https://github.com/apache/tvm/pull/18410) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(3)
 * [#18403](https://github.com/apache/tvm/pull/18403) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(2)
 * [#18402](https://github.com/apache/tvm/pull/18402) - [PyTorch] Add support 
for decomposed operators and fix IR of ops tests(1)
 * [#18401](https://github.com/apache/tvm/pull/18401) - [PyTorch] Enable 
decomposition for unary ops and refactor tests
 * [#18400](https://github.com/apache/tvm/pull/18400) - [PyTorch] Add support 
for decomposed operators in extended unary ops tests
 * [#18399](https://github.com/apache/tvm/pull/18399) - [PyTorch] Add 
run_ep_decomposition flag to control PyTorch decomposition
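
One of the fixes above ([#18641](https://github.com/apache/tvm/pull/18641)) corrected the Pool op's output shape when `ceil_mode = true`. For background, here is a minimal sketch of the conventional pooling output-size arithmetic. It assumes PyTorch's convention that the last window must start inside the (left-)padded input; `pool_out_size` is an illustrative helper, not a TVM API, and the exact rule applied by the fix may differ:

```python
import math

def pool_out_size(in_size, kernel, stride, pad, ceil_mode):
    """Illustrative pooling output-size formula (not a TVM API)."""
    if ceil_mode:
        # ceil_mode rounds the division up, admitting a final partial window.
        out = math.ceil((in_size + 2 * pad - kernel) / stride) + 1
        # Convention (e.g. PyTorch): the last window must start inside the
        # input or its left padding, otherwise it is dropped.
        if (out - 1) * stride >= in_size + pad:
            out -= 1
    else:
        out = (in_size + 2 * pad - kernel) // stride + 1
    return out

# A 6-wide input with kernel 3, stride 2, no padding:
# floor mode yields 2 windows; ceil mode yields 3 (the last one partial).
```

Getting the second branch's rounding wrong (floor where ceil is needed, or missing the dropped-window adjustment) is the classic source of off-by-one output shapes of the kind the fix addresses.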

### Runtime
 * [#18546](https://github.com/apache/tvm/pull/18546) - [MatchShape] Fix type error: cannot convert from type `DLTensor*` to `ffi.Shape`

### TIR
 * [#18639](https://github.com/apache/tvm/pull/18639) - [Schedule] Fix type 
checker to support subscripted generics in Python 3.14+
 * [#18515](https://github.com/apache/tvm/pull/18515) - [Schedule] 
FuseReductionEpilogue: Add Clipping pattern support
 * [#18556](https://github.com/apache/tvm/pull/18556) - [Schedule] Fix bug on 
bfloat16 conversion
 * [#18528](https://github.com/apache/tvm/pull/18528) - [Schedule] Fix mma 
tensorize error
 * [#18514](https://github.com/apache/tvm/pull/18514) - Fix tir.LowerIntrin check failure: additional_info.size() == new_size
 * [#18505](https://github.com/apache/tvm/pull/18505) - Update function 
signatures for decompose_reduction
 * [#18479](https://github.com/apache/tvm/pull/18479) - Fix VerifyStream::Verify dereferencing an invalid pointer
 * [#18421](https://github.com/apache/tvm/pull/18421) - Add step attribute to ForNode (initial implementation)
 * [#18418](https://github.com/apache/tvm/pull/18418) - [Schedule] Add 
FuseReductionEpilogue primitive to fuse epilogue …
 * [#18466](https://github.com/apache/tvm/pull/18466) - Fix Data Type Mismatch 
(int64 vs int32) in T.match_buffer when Working with Scalar Buffers in TIR

### TVMScript
 * [#18504](https://github.com/apache/tvm/pull/18504) - Add test for TIR macro 
block name suffix handling
 * [#18465](https://github.com/apache/tvm/pull/18465) - Add block name suffix 
management for TIR macros

### cuda & cutlass & tensorrt
 * [#18624](https://github.com/apache/tvm/pull/18624) - [CUDA] Fix 
cuModuleUnload crash during interpreter shutdown
 * [#18604](https://github.com/apache/tvm/pull/18604) - [CUDA][FFI] Extend 
kernel launch config to support Programmatic Dependent Launch and 
cuLaunchCooperativeKernel

### web
 * [#18683](https://github.com/apache/tvm/pull/18683) - Fix RPC argument 
parsing for new FFI string/bytes types
 * [#18686](https://github.com/apache/tvm/pull/18686) - Fix incorrect FFI 
export name in runtime.ts
 * [#18480](https://github.com/apache/tvm/pull/18480) - Bump web runtime 
version 0.23.0-dev1
 * [#18467](https://github.com/apache/tvm/pull/18467) - Replace string with 
TVMFFIByteArray* to avoid memory issues
 * [#18450](https://github.com/apache/tvm/pull/18450) - Fix progress reporting 
when loading from cache
 * [#18415](https://github.com/apache/tvm/pull/18415) - Fix arrayDecodeStorage 
scope issue for q0f32 models
 * [#18385](https://github.com/apache/tvm/pull/18385) - Upgrade web runtime to 
new FFI

### Misc
 * [#18681](https://github.com/apache/tvm/pull/18681) - [NVRTC] Add NVSHMEM 
support to NVRTC compilation path
 * [#18674](https://github.com/apache/tvm/pull/18674) - Fix MSVC pragma
 * [#18654](https://github.com/apache/tvm/pull/18654) - [FFI] Bump to latest version
 * [#18656](https://github.com/apache/tvm/pull/18656) - Put options before 
objects when compiling
 * [#18519](https://github.com/apache/tvm/pull/18519) - [Compile] Accelerate compilation using NVRTC
 * [#18582](https://github.com/apache/tvm/pull/18582) - Fix ACOS precision 
issue for boundary values (x=±1.0)
 * [#18557](https://github.com/apache/tvm/pull/18557) - [Attn] Fix calling 
FlashInfer attention plan function
 * [#18555](https://github.com/apache/tvm/pull/18555) - Fix duplicate 
`PresburgerSetNode` registration when `USE_MLIR=ON` and MLIR >= 15.0
 * [#18525](https://github.com/apache/tvm/pull/18525) - [Schedule] Fix 
LocalBuilder Check failed: (index_map_func.has_value()) is false
 * [#18511](https://github.com/apache/tvm/pull/18511) - [Pass] Add DumpIR pass 
instrument to save IR snapshots
 * [#18512](https://github.com/apache/tvm/pull/18512) - Remove unused TVMC 
configs
 * [#18509](https://github.com/apache/tvm/pull/18509) - Fix compilation warnings
 * [#18492](https://github.com/apache/tvm/pull/18492) - Fix BufferError when 
converting PyTorch models with sparse tensors
 * [#18469](https://github.com/apache/tvm/pull/18469) - [Contrib] Update 
RandomFill to use StreamSync for CUDA synchronization
 * [#18453](https://github.com/apache/tvm/pull/18453) - [DataType] Update to 
use explicit Bool Type Aligning with DLPack
 * [#18422](https://github.com/apache/tvm/pull/18422) - Adjust the LongRoPE embedding function to match the Hugging Face implementation
 * [#18426](https://github.com/apache/tvm/pull/18426) - Support integer type 
input for log and log2
 * [#18411](https://github.com/apache/tvm/pull/18411) - [FFI] Bump tvm-ffi to 
latest
 * [#18409](https://github.com/apache/tvm/pull/18409) - Fix database bug
 * [#18390](https://github.com/apache/tvm/pull/18390) - Support integer types 
in TIR expression operators
 * [#18398](https://github.com/apache/tvm/pull/18398) - Fix 8-bit vector loads/stores, resolving the failure raised in the CUDA codegen test
 * [#18389](https://github.com/apache/tvm/pull/18389) - Add VisitStmt_ method 
for AssertStmtNode and StringImmNode
 * [#18361](https://github.com/apache/tvm/pull/18361) - [WebLLM] Replace int64s 
with int32s in WebGPU kernels
 * [#18384](https://github.com/apache/tvm/pull/18384) - Fix crash when multiple 
PrimFunc objects are present in IRModule
 * [#18378](https://github.com/apache/tvm/pull/18378) - [release][Dont Squash] 
Update version to 0.22.0 and 0.23.0.dev on main branch

-- 
View it on GitHub:
https://github.com/apache/tvm/releases/tag/v0.23.0.rc0