ysh329 opened a new issue, #18391:
URL: https://github.com/apache/tvm/issues/18391

   # Introduction
   
   The TVM community has worked since the last release to deliver the following exciting new improvements!
   
   The main tags are below (**bold text indicates areas with significant progress**): Relax (especially the PyTorch frontend), FFI, etc.
   
   Please visit the full listing of commits for a complete view: [v0.22.dev0...v0.22.0.rc0](https://github.com/apache/tvm/compare/v0.22.dev0...v0.22.0.rc0).
   
   ### Community
   
   None.
   
   ### RFCs
   
   None.
   
   ### BugFix
    * [#18352](https://github.com/apache/tvm/pull/18352) - [Fix] Update 
ShapeView use in nccl.cc
    * [#18324](https://github.com/apache/tvm/pull/18324) - Fix binding for BERT
    * [#18296](https://github.com/apache/tvm/pull/18296) - [Fix] Add libxml2 
dependency to fix Windows CI build failure
    * [#18294](https://github.com/apache/tvm/pull/18294) - [Fix] Set DRefObj 
and CUDAIPCMemoryObj as mutable
    * [#18285](https://github.com/apache/tvm/pull/18285) - [FFI] Enable `load_inline` on macOS
    * [#18287](https://github.com/apache/tvm/pull/18287) - [Hotfix] Fix conflicts from FFI-related name updates
    * [#18281](https://github.com/apache/tvm/pull/18281) - [FFI] Fix bug in `ffi.cpp.load_inline` on Windows
    * [#18262](https://github.com/apache/tvm/pull/18262) - [NNAPI] Use kind() 
instead of type_key() after FFI refactor
    * [#18244](https://github.com/apache/tvm/pull/18244) - [Fix] Update 
FlashInfer JIT header lookup
    * [#18237](https://github.com/apache/tvm/pull/18237) - [FFI] Fix type_traits on DataType after SmallStr update
    * [#18232](https://github.com/apache/tvm/pull/18232) - [LLVM][Fix] Do not 
emit debuginfo on vscale or other unknown types
    * [#18219](https://github.com/apache/tvm/pull/18219) - [Fix] Resolve 
deadlock in PopenPoolExecutor and LocalBuilder
    * [#18207](https://github.com/apache/tvm/pull/18207) - [Fix][ONNX] No 
precision widening for numpy binary operations
    * [#18209](https://github.com/apache/tvm/pull/18209) - 
[ONNX][FRONTEND][Fix] Update Resize to accept ShapeExpr
    * [#18210](https://github.com/apache/tvm/pull/18210) - [Bug] Fix core dump 
in InferLayoutRMSNorm and fix typo
    * [#18208](https://github.com/apache/tvm/pull/18208) - [FFI][Fix] Update 
datatype registry calls to the new paths
    * [#18190](https://github.com/apache/tvm/pull/18190) - [Fix] Codegen fix 
for relax cutlass
    * [#18170](https://github.com/apache/tvm/pull/18170) - [Fix] Fix the wrong 
check for tuple node in #18163
    * [#18174](https://github.com/apache/tvm/pull/18174) - [Misc] Fix missing PadAttrs registration in op_attrs.py
    * [#18158](https://github.com/apache/tvm/pull/18158) - Fix NCCL build with 
GlobalDef registration
    * [#18140](https://github.com/apache/tvm/pull/18140) - [NNAPI] Fix type 
mismatch and test_mean annotation
    * [#18138](https://github.com/apache/tvm/pull/18138) - [Fix][ONNX] Fix constant ROI handling in resize2d when loading ONNX models
    * [#18137](https://github.com/apache/tvm/pull/18137) - [Fix][ONNX] Fix 
CumSum conversion when loading ONNX model
   
   ### CI
    * [#18245](https://github.com/apache/tvm/pull/18245) - [LLVM][MSWIN] Fix LLVM module build with latest CI update
    * [#18227](https://github.com/apache/tvm/pull/18227) - Exit the build for 
AbortException
    * [#18145](https://github.com/apache/tvm/pull/18145) - [Test] Use roi_list 
variable instead of hardcoded values in ROI tensor creation
   
   ### Docs
    * [#18279](https://github.com/apache/tvm/pull/18279) - [FFI] Initial bring-up of C++ docs
    * [#18264](https://github.com/apache/tvm/pull/18264) - Misc docs fix
    * [#18263](https://github.com/apache/tvm/pull/18263) - [FFI] Initial docs scaffolding
    * [#18261](https://github.com/apache/tvm/pull/18261) - [FFI] Add missing files in packaging example
    * [#18256](https://github.com/apache/tvm/pull/18256) - [FFI] Wheel packaging
    * [#18128](https://github.com/apache/tvm/pull/18128) - [Doc] Visualize the 
architecture using a UML sequence diagram
   
   ### Frontend
    * [#18143](https://github.com/apache/tvm/pull/18143) - [ONNX] Extend axes 
for layer_norm when gamma/beta are multi-dimensional
   
   ### LLVM
    * [#18204](https://github.com/apache/tvm/pull/18204) - Fixes up to the latest LLVM 21
    * [#18202](https://github.com/apache/tvm/pull/18202) - [CPPTEST] Small 
fixes for LLVM >= 20
   
   ### MetaSchedule
    * [#18243](https://github.com/apache/tvm/pull/18243) - [LLVM] Add RISC-V V-extension v1.0 kernels to MetaSchedule
   
   ### Metal
    * [#18290](https://github.com/apache/tvm/pull/18290) - Fix MetalModuleCreate
    * [#18283](https://github.com/apache/tvm/pull/18283) - [Fix] Fix type for device array in Metal API
   
   ### ROCm
    * [#18225](https://github.com/apache/tvm/pull/18225) - Minor fixes for 
latest refactor
   
   ### Relax
    * [#18374](https://github.com/apache/tvm/pull/18374) - [PyTorch] Improve the check for the no-bias case
    * [#18358](https://github.com/apache/tvm/pull/18358) - [Frontend][ONNX] Fix `FastGelu` when bias is not set
    * [#18360](https://github.com/apache/tvm/pull/18360) - [PyTorch] Support GRU op for ExportedProgram importer
    * [#18359](https://github.com/apache/tvm/pull/18359) - [PyTorch] Fix the 
segfault in from_exported_program when model returns (Tensor, None) tuple
    * [#18321](https://github.com/apache/tvm/pull/18321) - [ONNX] Support 
AllClassNMS Operator for ONNX Frontend
    * [#18346](https://github.com/apache/tvm/pull/18346) - [PyTorch] Support LSTM op for ExportedProgram importer
    * [#18351](https://github.com/apache/tvm/pull/18351) - [Frontend][Torch] 
Fix parsing error when input dimension of unbind is 1
    * [#18331](https://github.com/apache/tvm/pull/18331) - Update BasePyModule 
with faster DLPack converter for tensor conversion
    * [#18343](https://github.com/apache/tvm/pull/18343) - [PyTorch] Support 
MatrixMultiply op for ExportedProgram importer
    * [#18336](https://github.com/apache/tvm/pull/18336) - Operator and RoPE 
support for Llama4
    * [#18329](https://github.com/apache/tvm/pull/18329) - [Frontend][ONNX] Fix `Expand` conversion error (TVMError: broadcast_to expects the input tensor shape to be broadcastable to the target shape)
    * [#18326](https://github.com/apache/tvm/pull/18326) - [Backend] Implement R.call_py_func operator for calling Python functions from compiled TVM
    * [#18313](https://github.com/apache/tvm/pull/18313) - Introduce R.call_py_func operator for calling Python functions from Relax IR (a hypothetical usage sketch follows this list)
    * [#18301](https://github.com/apache/tvm/pull/18301) - Fix 
RelaxToPyFuncConverter compatibility and improve fallback handling
    * [#18288](https://github.com/apache/tvm/pull/18288) - Add symbolic shape 
support to BasePyModule for dynamic tensor operations
    * [#18269](https://github.com/apache/tvm/pull/18269) - Add Relax to Python 
Function Converter
    * [#18253](https://github.com/apache/tvm/pull/18253) - Build TVMScript printer support for IRModules with Python functions
    * [#18229](https://github.com/apache/tvm/pull/18229) - Add Python function 
support and BasePyModule for PyTorch integration
    * [#18242](https://github.com/apache/tvm/pull/18242) - Use the Relax softplus operator in the ONNX frontend
    * [#18180](https://github.com/apache/tvm/pull/18180) - [ONNX] Parse ONNX 
Upsample to Relax resize2d
    * [#18179](https://github.com/apache/tvm/pull/18179) - Support Relax 
Operator PReLU
    * [#18163](https://github.com/apache/tvm/pull/18163) - Fix issue in fusing concat ops by pattern
    * [#18120](https://github.com/apache/tvm/pull/18120) - [Fix] Fix potential out-of-bounds access in `TupleRewriterNode`
    * [#18061](https://github.com/apache/tvm/pull/18061) - [ONNX][Transform] 
Add mode choice, new mode, and warning for take()
    * [#18122](https://github.com/apache/tvm/pull/18122) - [KVCache] Fix kernel 
dispatch based on attention kinds
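   
   The two `R.call_py_func` entries above introduce a way to call back into Python from Relax IR. The sketch below is purely illustrative: the operator name comes from the PR titles, but the exact signature, the `out_sinfo`-style annotation, and how the Python callee is registered are assumptions, not confirmed API.
   
   ```python
   from tvm.script import ir as I, relax as R
   
   @I.ir_module
   class Mod:
       @R.function
       def main(x: R.Tensor((4,), "float32")) -> R.Tensor((4,), "float32"):
           # Hypothetical: invoke a Python function by name from Relax IR;
           # the struct-info annotation tells the compiler the return type.
           y = R.call_py_func("my_py_op", (x,), out_sinfo=R.Tensor((4,), "float32"))
           return y
   ```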
   
   ### TIR
    * [#18319](https://github.com/apache/tvm/pull/18319) - Refactor division 
simplification in RewriteSimplifier
    * [#18341](https://github.com/apache/tvm/pull/18341) - Support sequence 
comparisons in TVMScript
    * [#18323](https://github.com/apache/tvm/pull/18323) - Add support for 
conditional expressions in TVMScript
    * [#18199](https://github.com/apache/tvm/pull/18199) - Fix host/device 
function check for build
    * [#18154](https://github.com/apache/tvm/pull/18154) - Fix trivial index 
map [] -> [0]
    * [#18151](https://github.com/apache/tvm/pull/18151) - Decouple DeepEqual 
from StructuralEqual
    * [#18134](https://github.com/apache/tvm/pull/18134) - Add `T.thread_return()` for early thread exit in CUDA kernels (see the sketch after this list)
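   
   The TVMScript-facing items above compose naturally. Below is a minimal illustrative sketch (not taken from any PR; the kernel, buffer shapes, and names are assumptions) combining a sequence comparison (#18341), a conditional expression (#18323), and `T.thread_return()` (#18134):
   
   ```python
   from tvm.script import tir as T
   
   @T.prim_func
   def clipped_copy(A: T.Buffer((100,), "float32"), B: T.Buffer((100,), "float32")):
       for tx in T.thread_binding(128, thread="threadIdx.x"):
           if tx >= 100:
               T.thread_return()  # early exit for out-of-range threads (#18134)
           # conditional expression (#18323) with a sequence comparison (#18341)
           B[tx] = A[tx] if 0.0 < A[tx] < 1.0 else T.float32(0)
   ```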
   
   ### TVMScript
    * [#17804](https://github.com/apache/tvm/pull/17804) - Support `continue` and `break` in TVMScript (a sketch follows below)
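   
   A minimal sketch of the new control-flow support; the loop body here is an illustrative assumption, not code from the PR:
   
   ```python
   from tvm.script import tir as T
   
   @T.prim_func
   def first_negative(A: T.Buffer((16,), "float32"), Out: T.Buffer((1,), "int32")):
       Out[0] = -1
       for i in range(16):
           if A[i] >= 0.0:
               continue  # skip non-negative entries
           Out[0] = i
           break  # stop at the first negative entry
   ```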
   
   ### cuda & cutlass & tensorrt
    * [#18353](https://github.com/apache/tvm/pull/18353) - [CUDA] Update 
FlashInfer JIT integration
    * [#18320](https://github.com/apache/tvm/pull/18320) - [TIR][CUDA] Preserve 
float precision in codegen with hexfloat output
    * [#18300](https://github.com/apache/tvm/pull/18300) - [CUDA] Support NVTX 
in CUDA 13
    * [#18238](https://github.com/apache/tvm/pull/18238) - [CUTLASS] Fix 
CUTLASS kernel compilation
    * [#18144](https://github.com/apache/tvm/pull/18144) - [CodeGen][CUDA] Add 
sinhf CUDA Math API for CodeGen
   
   ### web
    * [#18327](https://github.com/apache/tvm/pull/18327) - [CMake] Install `web/` directory in CMake for the Python package
    * [#18168](https://github.com/apache/tvm/pull/18168) - Fix incompatibilities after FFI updates
   
   ### Misc
    * [#18376](https://github.com/apache/tvm/pull/18376) - [FFI] Bump tvm-ffi 
to 0.1.0rc2
    * [#18330](https://github.com/apache/tvm/pull/18330) - [Analyzer] Enhance 
ConstIntBoundAnalyzer and IntervalSet with modular set analysis
    * [#18372](https://github.com/apache/tvm/pull/18372) - Upgrade to CUTLASS 
4.2.1
    * [#18375](https://github.com/apache/tvm/pull/18375) - [TE][FFI] Fix broken axis/reduce_axis properties in BaseComputeOp and ScanOp after FFI refactoring
    * [#18370](https://github.com/apache/tvm/pull/18370) - [FFI] Bump tvm-ffi 
dependency
    * [#18354](https://github.com/apache/tvm/pull/18354) - [FFI][ABI] Bump 
tvm-ffi to latest
    * [#18349](https://github.com/apache/tvm/pull/18349) - [FFI][ABI] Bump 
tvm-ffi to latest
    * [#18348](https://github.com/apache/tvm/pull/18348) - [Python] Add library lookup path for tvm installed as a package
    * [#18345](https://github.com/apache/tvm/pull/18345) - [FFI][ABI] Bump 
tvm-ffi version to reflect RC ABI Update
    * [#18332](https://github.com/apache/tvm/pull/18332) - [FFI][ABI] Bump ffi version to latest
    * [#18334](https://github.com/apache/tvm/pull/18334) - Fix conflicting parameter name `promote_dtye` in FP8ComputeLegalize
    * [#18325](https://github.com/apache/tvm/pull/18325) - [FlashInfer] Support directing JIT to FlashInfer GroupedGemm kernels
    * [#18328](https://github.com/apache/tvm/pull/18328) - Fix datatype error for GPT-2
    * [#18318](https://github.com/apache/tvm/pull/18318) - [3rdparty] Remove 
dlpack/libbacktrace from 3rdparty
    * [#18317](https://github.com/apache/tvm/pull/18317) - [FlashInfer] Update 
include path and interface
    * [#18314](https://github.com/apache/tvm/pull/18314) - [REFACTOR][FFI] 
Split tvm-ffi into a separate repo
    * [#18312](https://github.com/apache/tvm/pull/18312) - [FFI][REFACTOR] 
Update TVM_FFI_STATIC_INIT_BLOCK to fn style
    * [#18311](https://github.com/apache/tvm/pull/18311) - [FFI][ABI] Better 
String and Nested Container handling
    * [#18308](https://github.com/apache/tvm/pull/18308) - [FFI][ABI] Refactor 
the naming of DLPack speed converter
    * [#18307](https://github.com/apache/tvm/pull/18307) - [FFI] Update 
`load_inline` interface
    * [#18306](https://github.com/apache/tvm/pull/18306) - [FFI][ABI][REFACTOR] 
Enhance DLPack Exchange Speed and Behavior
    * [#18304](https://github.com/apache/tvm/pull/18304) - Clear 
ext_lib_dll_names for macOS platform
    * [#18302](https://github.com/apache/tvm/pull/18302) - [FFI][REFACTOR] 
Refactor python ffi call mechanism for perf
    * [#18299](https://github.com/apache/tvm/pull/18299) - [Python] Fix runtime 
tensor import
    * [#18298](https://github.com/apache/tvm/pull/18298) - [FFI] Fix system 
library symbol lookup
    * [#18297](https://github.com/apache/tvm/pull/18297) - [FFI] Temporarily skip Windows tests
    * [#18295](https://github.com/apache/tvm/pull/18295) - [FFI][ABI] Introduce 
generic stream exchange protocol
    * [#18289](https://github.com/apache/tvm/pull/18289) - [FFI][REFACTOR] 
Streamline Object Declare Macros
    * [#18291](https://github.com/apache/tvm/pull/18291) - [3rdparty] Bump 
cutlass_fpA_intB_gemm to fix SM90 build
    * [#18284](https://github.com/apache/tvm/pull/18284) - [FFI][REFACTOR] 
Introduce UnsafeInit and enhance ObjectRef null safety
    * [#18282](https://github.com/apache/tvm/pull/18282) - [FFI] Relax default alignment and contiguous requirement
    * [#18280](https://github.com/apache/tvm/pull/18280) - [FFI][REFACTOR] 
Cleanup namespace
    * [#18278](https://github.com/apache/tvm/pull/18278) - [FFI] Temporarily skip load_inline tests on non-Linux platforms
    * [#18277](https://github.com/apache/tvm/pull/18277) - [FFI][REFACTOR] 
Cleanup tvm_ffi python API and types
    * [#18276](https://github.com/apache/tvm/pull/18276) - [FFI] Add 
ffi::Tensor.strides()
    * [#18275](https://github.com/apache/tvm/pull/18275) - [FFI][REFACTOR][ABI] 
Rename NDArray to Tensor
    * [#18274](https://github.com/apache/tvm/pull/18274) - [FFI] Update the interface of `ffi.load_inline` to match torch (see the sketch at the end of this section)
    * [#18273](https://github.com/apache/tvm/pull/18273) - [FFI][ABI] Append 
symbol prefix for ffi exported functions
    * [#18272](https://github.com/apache/tvm/pull/18272) - [FFI] Construct 
NDArray.strides by default
    * [#18271](https://github.com/apache/tvm/pull/18271) - [FFI] Support inline 
module
    * [#18270](https://github.com/apache/tvm/pull/18270) - [FFI] Support Opaque 
PyObject
    * [#18266](https://github.com/apache/tvm/pull/18266) - [FFI] Update torch stream getter to use the native torch C API
    * [#18252](https://github.com/apache/tvm/pull/18252) - [Build] Complete TVM 
wheel building migration
    * [#18259](https://github.com/apache/tvm/pull/18259) - [FFI][ABI] Introduce 
weak rc support
    * [#18258](https://github.com/apache/tvm/pull/18258) - [FFI] Fix two apparent migration issues
    * [#18254](https://github.com/apache/tvm/pull/18254) - [FFI][ABI] ABI updates for future metadata and complex ordering
    * [#18236](https://github.com/apache/tvm/pull/18236) - Upgrade CUTLASS to v4.2.0, supporting CUDA 13
    * [#18251](https://github.com/apache/tvm/pull/18251) - [Python] Complete 
Python packaging with scikit-build-core
    * [#18248](https://github.com/apache/tvm/pull/18248) - [Python] Update 
version.py to bump pyproject.toml automatically
    * [#18249](https://github.com/apache/tvm/pull/18249) - [FFI][CMAKE] Revert 
cmake libbacktrace URL and update submodule
    * [#18239](https://github.com/apache/tvm/pull/18239) - [Build] Migrate 
Python packaging to pyproject.toml with scikit-build-core
    * [#18246](https://github.com/apache/tvm/pull/18246) - [FFI][CMAKE] Add 
missing download path for libbacktrace
    * [#18234](https://github.com/apache/tvm/pull/18234) - [FFI] Misc fixup for Windows
    * [#18233](https://github.com/apache/tvm/pull/18233) - [FFI] Robustify the 
pyproject setup
    * [#18226](https://github.com/apache/tvm/pull/18226) - [FFI][REFACTOR] 
Establish tvm_ffi python module
    * [#18221](https://github.com/apache/tvm/pull/18221) - [FFI] Fix JSON 
parser/writer for the fast-math flag
    * [#18222](https://github.com/apache/tvm/pull/18222) - [NVSHMEM] Fix 
compatibility with CUDA code without nvshmem use
    * [#18220](https://github.com/apache/tvm/pull/18220) - [Thrust] Fix getting 
CUDA stream
    * [#18218](https://github.com/apache/tvm/pull/18218) - [FFI][REFACTOR] 
Cleanup API locations
    * [#18217](https://github.com/apache/tvm/pull/18217) - [FFI] Make AutoDLPack compatible with torch stream context
    * [#18216](https://github.com/apache/tvm/pull/18216) - [FFI][REFACTOR] 
Establish Stream Context in ffi
    * [#18214](https://github.com/apache/tvm/pull/18214) - [FFI][REFACTOR] 
Establish ffi.Module in python
    * [#18213](https://github.com/apache/tvm/pull/18213) - [FFI] Formalize 
ffi.Module
    * [#18212](https://github.com/apache/tvm/pull/18212) - [FFI] Make JSON parser/writer fast-math safe
    * [#18211](https://github.com/apache/tvm/pull/18211) - [TARGET] Add target for NVIDIA RTX 5060 Ti
    * [#18206](https://github.com/apache/tvm/pull/18206) - [CODEGEN][REFACTOR] Update tir.call_llvm_intrin to remove nargs
    * [#18205](https://github.com/apache/tvm/pull/18205) - [FFI][REFACTOR] Clean up entry function to redirect
    * [#18200](https://github.com/apache/tvm/pull/18200) - [FFI][REFACTOR] 
Update Map ABI to enable flexible smallMap switch
    * [#18198](https://github.com/apache/tvm/pull/18198) - [FFI][REFACTOR] Move 
Downcast out of ffi for now
    * [#18197](https://github.com/apache/tvm/pull/18197) - [REFACTOR] Update 
data type rewriter to enable recursive rewrite in Any
    * [#18193](https://github.com/apache/tvm/pull/18193) - Bump 
cutlass_fpA_intB_gemm to latest commit
    * [#18192](https://github.com/apache/tvm/pull/18192) - [FFI] Phase out 
ObjectPath in favor of AccessPath
    * [#18191](https://github.com/apache/tvm/pull/18191) - [FFI][REFACTOR] 
Refactor AccessPath to enable full tree repr
    * [#18189](https://github.com/apache/tvm/pull/18189) - [FFI][REFACTOR] 
Phase out getattr based attribute handling
    * [#18188](https://github.com/apache/tvm/pull/18188) - [FFI][REFACTOR] 
Migrate the Save/Load JSON to the new reflection
    * [#18187](https://github.com/apache/tvm/pull/18187) - [FFI][EXTRA] 
Serialization To/From JSONGraph
    * [#18186](https://github.com/apache/tvm/pull/18186) - [FFI] Lightweight 
json parser/writer
    * [#18185](https://github.com/apache/tvm/pull/18185) - [FFI] Introduce 
small string/bytes
    * [#18184](https://github.com/apache/tvm/pull/18184) - [FFI][REFACTOR] Hide 
StringObj/BytesObj into details
    * [#18183](https://github.com/apache/tvm/pull/18183) - [FFI][REFACTOR] 
Cleanup to align to latest ffi
    * [#18181](https://github.com/apache/tvm/pull/18181) - [REFACTOR] Upgrade 
NestedMsg<T> to use new ffi::Any mechanism
    * [#18178](https://github.com/apache/tvm/pull/18178) - [FFI] Fix 
SmallMapInit with duplicated keys
    * [#18177](https://github.com/apache/tvm/pull/18177) - [FFI][REFACTOR] 
Isolate out extra API
    * [#18176](https://github.com/apache/tvm/pull/18176) - [FFI] Improve string 
equal/hash handling
    * [#18172](https://github.com/apache/tvm/pull/18172) - [REFACTOR][FFI] 
Phase out SEqualReduce/SHashReduce
    * [#18166](https://github.com/apache/tvm/pull/18166) - [FFI][REFACTOR] 
Migrate StructuralEqual/Hash to new reflection
    * [#18165](https://github.com/apache/tvm/pull/18165) - [FFI][REFACTOR] 
Enable custom s_hash/equal
    * [#18160](https://github.com/apache/tvm/pull/18160) - [FFI][REFACTOR] 
Introduce TypeAttr in reflection
    * [#18156](https://github.com/apache/tvm/pull/18156) - [FFI] Structural 
equal and hash based on reflection
    * [#18153](https://github.com/apache/tvm/pull/18153) - Fix Release Package 
Test Script
    * [#18149](https://github.com/apache/tvm/pull/18149) - [FFI] Log and throw on duplicate function registration
    * [#18148](https://github.com/apache/tvm/pull/18148) - [FFI][REFACTOR] 
Phase out TVM_FFI_REGISTER_GLOBAL in favor of GlobalDef
    * [#18147](https://github.com/apache/tvm/pull/18147) - [FFI][REFACTOR] Modularize reflection
    * [#18141](https://github.com/apache/tvm/pull/18141) - [FFI][PYTHON] 
Improve the traceback generation in python
    * [#18142](https://github.com/apache/tvm/pull/18142) - [REFACTOR] Migrate 
TVM_FFI_REGISTER_GLOBAL to new reflection style
    * [#18130](https://github.com/apache/tvm/pull/18130) - Fix compilation 
warnings of unnecessary `std::move()` calls
    * [#18129](https://github.com/apache/tvm/pull/18129) - Delete redundant 
imports
    * [#18055](https://github.com/apache/tvm/pull/18055) - [Target] Support 
CUDA device function calls
    * [#18127](https://github.com/apache/tvm/pull/18127) - Revert "[Refactor] Build Cython with isolated environment"
    * [#18125](https://github.com/apache/tvm/pull/18125) - Phase out StackVM 
runtime support
    * [#18124](https://github.com/apache/tvm/pull/18124) - [Refactor] Build Cython with isolated environment
    * [#18123](https://github.com/apache/tvm/pull/18123) - [Codegen] Update 
LLVM version requirement for `insertDeclare`
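   
   Many of the FFI entries above revolve around `load_inline` (#18271, #18274, #18285, #18307). Below is a hedged sketch of the torch-style interface, assuming a `tvm_ffi.cpp.load_inline` entry point after the repo split (#18314); the export plumbing shown is an assumption:
   
   ```python
   import tvm_ffi.cpp  # module path assumed from the tvm-ffi split (#18314)
   
   # Compile a C++ snippet at runtime and expose `add_one`, mirroring
   # torch.utils.cpp_extension.load_inline as described in #18274.
   mod = tvm_ffi.cpp.load_inline(
       name="demo",
       cpp_sources=r"""
       int add_one(int x) { return x + 1; }
       """,
       functions=["add_one"],  # names to generate bindings for
   )
   assert mod.add_one(41) == 42
   ```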

