ysh329 opened a new issue, #18701:
URL: https://github.com/apache/tvm/issues/18701

   # Introduction
   
   The TVM community has worked since the last release to deliver the following exciting new improvements!
   
   The main tags are below (**bold text indicates areas with major progress**): Relax (especially the PyTorch frontend), FFI, etc.
   
   Please visit the full listing of commits for a complete view: 
[v0.23.dev0...v0.23.0.rc0](https://github.com/apache/tvm/compare/v0.23.dev0...v0.23.0.rc0).
   
   ### Community
   
   None.
   
   ### RFCs
   
   None.
   
   ### Adreno
    * [#18523](https://github.com/apache/tvm/pull/18523) - [TEXTURE] Texture 
based lowering
   
   ### Arith
    * [#18542](https://github.com/apache/tvm/pull/18542) - Revert "Fix 
InternalError: Check failed: (eval_vec_) is false"
    * [#18536](https://github.com/apache/tvm/pull/18536) - Fix InternalError: 
Check failed: (eval_vec_) is false
   
   ### BugFix
    * [#18628](https://github.com/apache/tvm/pull/18628) - [Fix] Fix typo in 
file header comment
    * [#18589](https://github.com/apache/tvm/pull/18589) - [OpenCL] Guard QCOM 
perf hint behind USE_OPENCL_EXTN_QCOM to avoid undefined symbol on non-QCOM 
runtimes
    * [#18534](https://github.com/apache/tvm/pull/18534) - Prevent segfault 
when instantiating abstract SearchStrategy
   
   ### CI
    * [#18549](https://github.com/apache/tvm/pull/18549) - Remove hardcoded 
user and repo values
    * [#18484](https://github.com/apache/tvm/pull/18484) - Update file patterns 
for specific linting hooks
    * [#18470](https://github.com/apache/tvm/pull/18470) - Enhance python 
linting scripts to support revision-based checks
    * [#18498](https://github.com/apache/tvm/pull/18498) - Use glob for 
`conda/build-environment.yaml` in cache key
    * [#18495](https://github.com/apache/tvm/pull/18495) - Update 
`actions/cache` to v4 in setup action
    * [#18457](https://github.com/apache/tvm/pull/18457) - Fix crash when grep 
finds no matches
    * [#18448](https://github.com/apache/tvm/pull/18448) - Update pre-commit 
configuration
    * [#18432](https://github.com/apache/tvm/pull/18432) - Enable username 
checks in PR title and body
    * [#18430](https://github.com/apache/tvm/pull/18430) - [TEST][CODEGEN] Fix test scripts that pass numpy a dtype name it cannot recognise
    * [#18419](https://github.com/apache/tvm/pull/18419) - [TEST] Refactor: 
remove the deprecated warning message check from test cases
   
   ### Docs
    * [#18545](https://github.com/apache/tvm/pull/18545) - Improve static shape 
tuning parameter configuration (follow-up to commit c71aefc)
    * [#18539](https://github.com/apache/tvm/pull/18539) - Fix e2e_opt_model 
tutorial for GPU deployment
    * [#18451](https://github.com/apache/tvm/pull/18451) - Update the merge 
setting
    * [#18436](https://github.com/apache/tvm/pull/18436) - Remove prebuilt 
package references and disable Colab button at tutorials
    * [#18413](https://github.com/apache/tvm/pull/18413) - Update 
cross-compilation and RPC tutorial with modern PyTorch deployment workflow
    * [#18412](https://github.com/apache/tvm/pull/18412) - Update tutorial for 
exporting and loading back Relax executables
    * [#18404](https://github.com/apache/tvm/pull/18404) - Add tutorial for 
exporting and loading back Relax executables
   
   ### Frontend
    * [#18435](https://github.com/apache/tvm/pull/18435) - [ONNX] Fix operator 
Transpose: TVMError: PermuteDims expects the number of input axes to equal the 
ndim of the input tensor
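
   The Transpose fix above concerns the PermuteDims requirement that a permutation name exactly one axis per input dimension. A minimal sketch of that kind of validation in plain Python (the function name is hypothetical and not taken from the PR):

   ```python
   def check_permute_axes(axes, ndim):
       """PermuteDims-style validation: `axes` must be a permutation
       of range(ndim), i.e. exactly one entry per input dimension."""
       if len(axes) != ndim:
           raise ValueError(
               f"PermuteDims expects {ndim} axes, got {len(axes)}")
       if sorted(axes) != list(range(ndim)):
           raise ValueError(f"{axes} is not a valid permutation")
       return True
   ```

   The ONNX importer hits this when a Transpose node's `perm` attribute has fewer entries than the input tensor's rank.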
   
   ### LLVM
    * [#18586](https://github.com/apache/tvm/pull/18586) - [Codegen] Avoid 
segfault when `arith::GetVScaleValues` returns empty vector
   
   ### MetaSchedule
    * [#18547](https://github.com/apache/tvm/pull/18547) - Fix tune_tir crash 
with ScheduleError in RewriteParallelVectorizeUnroll
   
   ### Relax
    * [#18676](https://github.com/apache/tvm/pull/18676) - Implement dynamic 
output trimming for NMS
    * [#18664](https://github.com/apache/tvm/pull/18664) - Add FDataDependent 
operator attribute for LegalizeOps
    * [#18668](https://github.com/apache/tvm/pull/18668) - [Onnx] Support Local 
Response Normalization (LRN)
    * [#18667](https://github.com/apache/tvm/pull/18667) - Add native size 
operator
    * [#18675](https://github.com/apache/tvm/pull/18675) - [LAYOUT] Support for 
dynamic layout specification
    * [#18652](https://github.com/apache/tvm/pull/18652) - [ONNX] Add support for unique optional outputs
    * [#18665](https://github.com/apache/tvm/pull/18665) - Replace topi.take 
with relax.op.take
    * [#18663](https://github.com/apache/tvm/pull/18663) - Fix wrong memory 
planning when only lower bound was provided
    * [#18666](https://github.com/apache/tvm/pull/18666) - [Onnx][Resize] 
Handle non-4D input tensors
    * [#18658](https://github.com/apache/tvm/pull/18658) - [Onnx][PReLU] Handle 
slope and axis argument with different slope shapes
    * [#18649](https://github.com/apache/tvm/pull/18649) - Remove obsolete TODO 
comments
    * [#18642](https://github.com/apache/tvm/pull/18642) - Add 
FRelaxInferLayout for gather_elements operator
    * [#18643](https://github.com/apache/tvm/pull/18643) - Add 
FRelaxInferLayout for scatter_nd operator
    * [#18641](https://github.com/apache/tvm/pull/18641) - [Op] Fixed incorrect 
output shape of Pool op when ceil_mode = true
    * [#18638](https://github.com/apache/tvm/pull/18638) - Add 
FRelaxInferLayout for scatter_elements operator
    * [#18637](https://github.com/apache/tvm/pull/18637) - Add 
FRelaxInferLayout for flip operator
    * [#18633](https://github.com/apache/tvm/pull/18633) - Add 
FRelaxInferLayout and TMixedPrecisionPolicy for dynamic_strided_slice
    * [#18635](https://github.com/apache/tvm/pull/18635) - [Onnx] Pass 
output_padding param in ConvTranspose
    * [#18632](https://github.com/apache/tvm/pull/18632) - Move GetUsedVars to 
analysis module
    * [#18629](https://github.com/apache/tvm/pull/18629) - Add 
FInferMixedPrecision and FRelaxInferLayout for conv transpose ops
    * [#18626](https://github.com/apache/tvm/pull/18626) - [Op][PyTorch] 
Supported Median operator
    * [#18576](https://github.com/apache/tvm/pull/18576) - Correct YaRN RoPE 
frequency scaling formula to align with the original paper
    * [#18615](https://github.com/apache/tvm/pull/18615) - Add gpu-generic 
fallback for unrecognized GPU targets
    * [#18621](https://github.com/apache/tvm/pull/18621) - Use weight shape 
instead of dim in Embedding.forward
    * [#18613](https://github.com/apache/tvm/pull/18613) - Remove duplicated 
test case: test_if_branch_var_scope
    * [#18616](https://github.com/apache/tvm/pull/18616) - Replaced 
call_pure_packed with tensor_to_shape operator
    * [#18593](https://github.com/apache/tvm/pull/18593) - feat: Implement 
FRelaxInferLayout for tile operator
    * [#18618](https://github.com/apache/tvm/pull/18618) - Add test case for op 
attributes in AST printer
    * [#18619](https://github.com/apache/tvm/pull/18619) - [PyTorch] Fix 
PyTorch Dynamo frontend for Darwin compatibility
    * [#18575](https://github.com/apache/tvm/pull/18575) - [ONNX] Add edge 
padding mode
    * [#18620](https://github.com/apache/tvm/pull/18620) - Fix flaky 
test_conv2d gradient numeric test
    * [#18609](https://github.com/apache/tvm/pull/18609) - Fix batch 
normalization computation logic
    * [#18574](https://github.com/apache/tvm/pull/18574) - [Torch] Fix AssertionError: Unsupported function types ['mean.default']
    * [#18591](https://github.com/apache/tvm/pull/18591) - Chore: Fix the 
DeprecationWarning: invalid escape sequence \
    * [#18577](https://github.com/apache/tvm/pull/18577) - Clean up 
scatter_elements unknown dtype handling
    * [#18579](https://github.com/apache/tvm/pull/18579) - Add layout inference 
support for repeat operator
    * [#18583](https://github.com/apache/tvm/pull/18583) - [Torch] Fixed issues with the sum op when dim and keepdim are not provided
    * [#18554](https://github.com/apache/tvm/pull/18554) - Enhance unique block 
name generation with numeric suffixes
    * [#18558](https://github.com/apache/tvm/pull/18558) - Add edge padding mode
    * [#18559](https://github.com/apache/tvm/pull/18559) - Add mod operator 
support
    * [#18544](https://github.com/apache/tvm/pull/18544) - [PyTorch] Add 
support for Custom Ops for ExportedProgram frontend
    * [#18535](https://github.com/apache/tvm/pull/18535) - [PyTorch] Add 
support for masked_select
    * [#18551](https://github.com/apache/tvm/pull/18551) - [Frontend] Introduce 
ModuleDict
    * [#18550](https://github.com/apache/tvm/pull/18550) - [PyTorch] Enhance 
scale_factor handling in interpolation
    * [#18553](https://github.com/apache/tvm/pull/18553) - [PyTorch] Unify 
dtype used in conv2d tests
    * [#18548](https://github.com/apache/tvm/pull/18548) - [PyTorch] Add NHWC layout support
    * [#18533](https://github.com/apache/tvm/pull/18533) - [PyTorch] Fix 
index_put with broadcast indices
    * [#18521](https://github.com/apache/tvm/pull/18521) - [PyTorch] Handle 
unknown output shapes for _sym_size_int
    * [#18532](https://github.com/apache/tvm/pull/18532) - [PyTorch] Add 
support for bidirectional GRU
    * [#18530](https://github.com/apache/tvm/pull/18530) - [PyTorch] Add 
boolean tensor support for max operation and corresponding test case
    * [#18524](https://github.com/apache/tvm/pull/18524) - [PyTorch] Fix 
InternalError when converting scaled_dot_product_attention with 2D inputs
    * [#18527](https://github.com/apache/tvm/pull/18527) - [PyTorch] Add 
support for non-persistent buffers in ExportedProgram frontend
    * [#18529](https://github.com/apache/tvm/pull/18529) - [PyTorch] Add 
support for binary scalar operations in ExportedProgram frontend and 
corresponding tests
    * [#18522](https://github.com/apache/tvm/pull/18522) - [PyTorch] Unify 
tests using shared tvm.testing.assert_allclose
    * [#18516](https://github.com/apache/tvm/pull/18516) - [PyTorch] Add 
support for bidirectional LSTM
    * [#18499](https://github.com/apache/tvm/pull/18499) - [PyTorch] Add 
support for sparse matrix multiplication
    * [#18518](https://github.com/apache/tvm/pull/18518) - [PyTorch] Fix batch 
normalization training mode correctness
    * [#18517](https://github.com/apache/tvm/pull/18517) - [PyTorch] Unify 
tests using shared verify_model
    * [#18506](https://github.com/apache/tvm/pull/18506) - [PyTorch] Enhance 
data type handling in FX graph translator
    * [#18507](https://github.com/apache/tvm/pull/18507) - [PyTorch] Support 
specifying decimals for _round
    * [#18500](https://github.com/apache/tvm/pull/18500) - [PyTorch] Add 
support for antialiased bilinear upsampling
    * [#18489](https://github.com/apache/tvm/pull/18489) - [PyTorch] Enhance 
handling of unbounded upper bound constraints
    * [#17599](https://github.com/apache/tvm/pull/17599) - [PASS] Annotate 
Custom Scope layout pass for Adreno GPU
    * [#18497](https://github.com/apache/tvm/pull/18497) - [PyTorch] Add binary 
operation dtype promotion following PyTorch rules in ExportedProgram frontend
    * [#18478](https://github.com/apache/tvm/pull/18478) - Fix the squeeze 
operator to behave consistently with torch
    * [#18496](https://github.com/apache/tvm/pull/18496) - [PyTorch] Add `mul` 
operator in ExportedProgram frontend
    * [#18494](https://github.com/apache/tvm/pull/18494) - [PyTorch] Add 
negative slicing support in `slice_scatter` operation
    * [#18493](https://github.com/apache/tvm/pull/18493) - [PyTorch] Add 
broadcast support for `copy` operation
    * [#18490](https://github.com/apache/tvm/pull/18490) - [PyTorch] Add 
`as_strided` operator in ExportedProgram frontend
    * [#18487](https://github.com/apache/tvm/pull/18487) - [PyTorch] Add 
`count_include_pad` support to `avg_pool2d` in PyTorch frontend
    * [#18488](https://github.com/apache/tvm/pull/18488) - [PyTorch] Enhance 
index_put support for multi-dimensional indices
    * [#18486](https://github.com/apache/tvm/pull/18486) - [PyTorch] Fix 
`batch_norm.default` args handling in ExportedProgram frontend
    * [#18483](https://github.com/apache/tvm/pull/18483) - [PyTorch] Add 
support for grid_sample operator
    * [#18482](https://github.com/apache/tvm/pull/18482) - [PyTorch] Add 
support for gumbel_softmax
    * [#18485](https://github.com/apache/tvm/pull/18485) - [PyTorch] Add 
dynamic shape support to `torch.ops.aten.sym_size.int` in ExportedProgram 
frontend
    * [#18473](https://github.com/apache/tvm/pull/18473) - [PyTorch] Add 
support for `torch.ops.aten.sym_size.int` in ExportedProgram frontend
    * [#18471](https://github.com/apache/tvm/pull/18471) - [PyTorch] Enable 
run_ep_decomposition by default
    * [#18462](https://github.com/apache/tvm/pull/18462) - [PyTorch] Add 
decomposed operator support for interpolate
    * [#18455](https://github.com/apache/tvm/pull/18455) - Fix flaky 
test_conv2d_offload by increasing float32 tolerance
    * [#18463](https://github.com/apache/tvm/pull/18463) - [PyTorch] Support 
advanced range constraints (multiplication)
    * [#18464](https://github.com/apache/tvm/pull/18464) - [PyTorch] Enable 
decomposition in all tests
    * [#18461](https://github.com/apache/tvm/pull/18461) - [PyTorch] Fix 
KeyError: dtype when converting PyTorch model with gradient checkpointing using 
torch.export
    * [#18452](https://github.com/apache/tvm/pull/18452) - [PyTorch] Support 
advanced range constraints (addition)
    * [#18454](https://github.com/apache/tvm/pull/18454) - [PyTorch] Fix sqrt operation requiring float dtype but receiving int64 in attention scaling
    * [#18459](https://github.com/apache/tvm/pull/18459) - [PyTorch] Fix MultiheadAttention compile
    * [#18460](https://github.com/apache/tvm/pull/18460) - [PyTorch] Add 
decomposed operator support for normalization
    * [#18458](https://github.com/apache/tvm/pull/18458) - [PyTorch] Add 
decomposed operator support for Binary
    * [#18449](https://github.com/apache/tvm/pull/18449) - [PyTorch] Add 
decomposed operator support for Pad
    * [#18447](https://github.com/apache/tvm/pull/18447) - [PyTorch] Add lower 
bound support for range constraints
    * [#18446](https://github.com/apache/tvm/pull/18446) - [PyTorch] Add 
decomposed operator support for MaxPool
    * [#18437](https://github.com/apache/tvm/pull/18437) - [PyTorch] Add 
decomposed operator support for AdaptiveAvgPool
    * [#18433](https://github.com/apache/tvm/pull/18433) - [PyTorch] Add 
decomposed operator support for Conv
    * [#18429](https://github.com/apache/tvm/pull/18429) - [PyTorch] Support 
basic range constraints
    * [#18428](https://github.com/apache/tvm/pull/18428) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(8)
    * [#18427](https://github.com/apache/tvm/pull/18427) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(7)
    * [#18420](https://github.com/apache/tvm/pull/18420) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(6)
    * [#18417](https://github.com/apache/tvm/pull/18417) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(5)
    * [#18416](https://github.com/apache/tvm/pull/18416) - [ONNX] Fix bug: 
Unsupported numpy or ml_dtypes dtype('O') when importing ONNX model using Relax 
frontend
    * [#18414](https://github.com/apache/tvm/pull/18414) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(4)
    * [#18410](https://github.com/apache/tvm/pull/18410) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(3)
    * [#18403](https://github.com/apache/tvm/pull/18403) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(2)
    * [#18402](https://github.com/apache/tvm/pull/18402) - [PyTorch] Add 
support for decomposed operators and fix IR of ops tests(1)
    * [#18401](https://github.com/apache/tvm/pull/18401) - [PyTorch] Enable 
decomposition for unary ops and refactor tests
    * [#18400](https://github.com/apache/tvm/pull/18400) - [PyTorch] Add 
support for decomposed operators in extended unary ops tests
    * [#18399](https://github.com/apache/tvm/pull/18399) - [PyTorch] Add 
run_ep_decomposition flag to control PyTorch decomposition
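
   Among the changes above, #18576 corrects the YaRN RoPE frequency scaling formula to match the original paper. A minimal sketch of YaRN's "NTK-by-parts" blending in plain Python (all parameter names and default values here are illustrative, not taken from the PR): each base inverse frequency is blended between its interpolated value (divided by the scale factor, for long wavelengths) and its original value (for short wavelengths) via a linear ramp on the wavelength-to-context ratio.

   ```python
   import math

   def yarn_inv_freq(dim, base=10000.0, scale=4.0,
                     orig_ctx=4096, alpha=1.0, beta=32.0):
       """Sketch of YaRN 'NTK-by-parts' frequency scaling.

       For rotary pair d, the base inverse frequency is
       theta_d = base**(-2d/dim). Its wavelength is 2*pi/theta_d,
       and r_d = orig_ctx / wavelength_d measures how many full
       rotations fit in the original context window.
       """
       half = dim // 2
       out = []
       for d in range(half):
           theta = base ** (-2.0 * d / dim)
           wavelength = 2.0 * math.pi / theta
           r = orig_ctx / wavelength
           # ramp gamma: 0 -> pure interpolation (theta / scale),
           #             1 -> pure extrapolation (theta unchanged)
           gamma = min(1.0, max(0.0, (r - alpha) / (beta - alpha)))
           out.append((1.0 - gamma) * theta / scale + gamma * theta)
       return out
   ```

   High-frequency dimensions (many rotations per context) keep their original frequencies, while low-frequency dimensions are interpolated, which is the behavior the blending formula must preserve term-by-term.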
   
   ### Runtime
    * [#18546](https://github.com/apache/tvm/pull/18546) - [MatchShape] Type 
error: Cannot convert from type ' DLTensor* ' to ' ffi.Shape '
   
   ### TIR
    * [#18639](https://github.com/apache/tvm/pull/18639) - [Schedule] Fix type 
checker to support subscripted generics in Python 3.14+
    * [#18515](https://github.com/apache/tvm/pull/18515) - [Schedule] 
FuseReductionEpilogue: Add Clipping pattern support
    * [#18556](https://github.com/apache/tvm/pull/18556) - [Schedule] Fix bug 
on bfloat16 conversion
    * [#18528](https://github.com/apache/tvm/pull/18528) - [Schedule] Fix mma 
tensorize error
    * [#18514](https://github.com/apache/tvm/pull/18514) - Fix tir.LowerIntrin 
check failed additional_info.size() == new_size
    * [#18505](https://github.com/apache/tvm/pull/18505) - Update function 
signatures for decompose_reduction
    * [#18479](https://github.com/apache/tvm/pull/18479) - Fix VerifyStream::Verify dereferencing an invalid pointer
    * [#18421](https://github.com/apache/tvm/pull/18421) - Add step attribute 
to ForNode (Initial codes)
    * [#18418](https://github.com/apache/tvm/pull/18418) - [Schedule] Add 
FuseReductionEpilogue primitive to fuse epilogue …
    * [#18466](https://github.com/apache/tvm/pull/18466) - Fix Data Type 
Mismatch (int64 vs int32) in T.match_buffer when Working with Scalar Buffers in 
TIR
   
   ### TVMScript
    * [#18504](https://github.com/apache/tvm/pull/18504) - Add test for TIR 
macro block name suffix handling
    * [#18465](https://github.com/apache/tvm/pull/18465) - Add block name 
suffix management for TIR macros
   
   ### cuda & cutlass & tensorrt
    * [#18624](https://github.com/apache/tvm/pull/18624) - [CUDA] Fix 
cuModuleUnload crash during interpreter shutdown
    * [#18604](https://github.com/apache/tvm/pull/18604) - [CUDA][FFI] Extend 
kernel launch config to support Programmatic Dependent Launch and 
cuLaunchCooperativeKernel
   
   ### web
    * [#18683](https://github.com/apache/tvm/pull/18683) - Fix RPC argument 
parsing for new FFI string/bytes types
    * [#18686](https://github.com/apache/tvm/pull/18686) - Fix incorrect FFI 
export name in runtime.ts
    * [#18480](https://github.com/apache/tvm/pull/18480) - Bump web runtime 
version 0.23.0-dev1
    * [#18467](https://github.com/apache/tvm/pull/18467) - Replace string with 
TVMFFIByteArray* to avoid memory issues
    * [#18450](https://github.com/apache/tvm/pull/18450) - Fix progress 
reporting when loading from cache
    * [#18415](https://github.com/apache/tvm/pull/18415) - Fix 
arrayDecodeStorage scope issue for q0f32 models
    * [#18385](https://github.com/apache/tvm/pull/18385) - Upgrade web runtime 
to new FFI
   
   ### Misc
    * [#18681](https://github.com/apache/tvm/pull/18681) - [NVRTC] Add NVSHMEM 
support to NVRTC compilation path
    * [#18674](https://github.com/apache/tvm/pull/18674) - fix: MSVC pragma
    * [#18654](https://github.com/apache/tvm/pull/18654) - [FFI] bump to latest 
version
    * [#18656](https://github.com/apache/tvm/pull/18656) - Put options before 
objects when compiling
    * [#18519](https://github.com/apache/tvm/pull/18519) - [Compile] Accelerate compilation using NVRTC
    * [#18582](https://github.com/apache/tvm/pull/18582) - Fix ACOS precision 
issue for boundary values (x=±1.0)
    * [#18557](https://github.com/apache/tvm/pull/18557) - [Attn] Fix calling 
FlashInfer attention plan function
    * [#18555](https://github.com/apache/tvm/pull/18555) - Fix duplicate 
`PresburgerSetNode` registration when `USE_MLIR=ON` and MLIR >= 15.0
    * [#18525](https://github.com/apache/tvm/pull/18525) - [Schedule] Fix 
LocalBuilder Check failed: (index_map_func.has_value()) is false
    * [#18511](https://github.com/apache/tvm/pull/18511) - [Pass] Add DumpIR 
pass instrument to save IR snapshots
    * [#18512](https://github.com/apache/tvm/pull/18512) - Remove unused TVMC 
configs
    * [#18509](https://github.com/apache/tvm/pull/18509) - Fix compilation 
warnings
    * [#18492](https://github.com/apache/tvm/pull/18492) - Fix BufferError when 
converting PyTorch models with sparse tensors
    * [#18469](https://github.com/apache/tvm/pull/18469) - [Contrib] Update 
RandomFill to use StreamSync for CUDA synchronization
    * [#18453](https://github.com/apache/tvm/pull/18453) - [DataType] Update to 
use explicit Bool Type Aligning with DLPack
    * [#18422](https://github.com/apache/tvm/pull/18422) - Adjusted Longrope 
embedding function to match Huggingface Implementation
    * [#18426](https://github.com/apache/tvm/pull/18426) - Support integer type 
input for log and log2
    * [#18411](https://github.com/apache/tvm/pull/18411) - [FFI] Bump tvm-ffi 
to latest
    * [#18409](https://github.com/apache/tvm/pull/18409) - Fix database bug
    * [#18390](https://github.com/apache/tvm/pull/18390) - Support integer 
types in TIR expression operators
    * [#18398](https://github.com/apache/tvm/pull/18398) - Fix the 8-bit vector loads/stores problem, resolving the failure raised in the CUDA codegen test
    * [#18389](https://github.com/apache/tvm/pull/18389) - Add VisitStmt_ 
method for AssertStmtNode and StringImmNode
    * [#18361](https://github.com/apache/tvm/pull/18361) - [WebLLM] Replace 
int64s with int32s in WebGPU kernels
    * [#18384](https://github.com/apache/tvm/pull/18384) - Fix crash when 
multiple PrimFunc objects are present in IRModule
    * [#18378](https://github.com/apache/tvm/pull/18378) - [release][Dont 
Squash] Update version to 0.22.0 and 0.23.0.dev on main branch
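
   On the ACOS precision fix above (#18582): near x = ±1.0, a naive acos implementation loses accuracy because 1 - x² suffers catastrophic cancellation. One common remedy, shown here only to illustrate the failure mode (not necessarily the approach taken in the PR), rewrites acos through atan2:

   ```python
   import math

   def acos_stable(x):
       # acos(x) = 2 * atan2(sqrt(1 - x), sqrt(1 + x))
       # sqrt(1 - x) stays well conditioned as x -> 1.0, unlike
       # sqrt(1 - x*x), so the boundary values come out exact
       return 2.0 * math.atan2(math.sqrt(1.0 - x), math.sqrt(1.0 + x))
   ```

   At x = 1.0 the atan2 form returns exactly 0, and at x = -1.0 it returns pi to full precision, where a polynomial approximation built around 1 - x² can drift at the boundaries.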


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

