GitHub user GlutenPerfBot created a discussion: October 17, 2025: Weekly Status Update in Gluten
*This weekly update is generated by LLMs. You're welcome to join our [GitHub](https://github.com/apache/incubator-gluten/discussions) for in-depth discussions.*

## Overall Activity Summary

The past 7 days have been exceptionally productive for Apache Gluten, with 47 merged PRs and 21 open PRs. The community is actively preparing for the 1.5.0 release while making significant progress on Velox backend improvements, GPU support, and Apache maturity model compliance. Daily Velox version updates continue to keep the project current with upstream changes.

## Key Ongoing Projects

**GPU Acceleration Initiative** - @jinchengchenghh is leading the GPU support implementation (#9098, #10621), with the first PR merged for CUDA docker images. This represents a major architectural expansion for Gluten.

**Velox Backend Enhancement** - Multiple contributors are actively improving the Velox backend:

- Daily version updates by @GlutenPerfBot keeping pace with upstream Velox changes
- Function support expansion by @dmsuehir (#10058, #10296) adding `map_from_arrays` and `timestamp_seconds`
- Performance optimizations by @zhouyuan (#10824, #10825) reducing memory copies and improving columnar-to-row conversion

**Apache Maturity Model Compliance** - @zhztheplayer is driving the project toward graduation with comprehensive documentation updates (#10873, #10874, #10876, #10878, #10885) covering release processes, security policies, and community governance.

**Delta Lake Native Write Support** - @zhztheplayer successfully implemented native Delta write support (#10801) for Spark 3.5/Delta 3.3, a significant milestone for data lake integration.

## Priority Items

**Immediate Attention Needed:**

- #10892: VeloxBatchResizer OOM issue with variable-length types needs resolution
- #10868: Duplicate stage execution in TPCDS q88 requires investigation
- #8417: TPCDS q72 performance regression needs optimization

**Release Preparation:**

- #10574: Gluten 1.5.0 release preparation is ongoing with patch porting
- #10887: Delta version bump to 3.3.2 for compatibility

## Notable Discussions

#8763: Documentation build process and website synchronization - @zjuwangg raised important questions about maintaining consistent documentation between the repo and the official website, with community members providing guidance on the build process.

## Emerging Trends

1. **Performance Optimization Focus**: Multiple PRs targeting memory allocation improvements and CPU efficiency
2. **Cloud Storage Integration**: Enhanced support for Azure, S3, and other cloud storage systems
3. **Testing Infrastructure**: Increased emphasis on integration testing for cloud storage connectors
4. **Community Governance**: Systematic approach to Apache graduation requirements

## Good First Issues

**ClickHouse Backend Functions** - Several good first issues for contributors interested in ClickHouse backend development:

- #6814: Implement `MakeYMInterval` expression - Great for learning expression handling
- #4730: Add `date_from_unix_date` function - Good introduction to date/time functions
- #6807: Support `split_part` function - Excellent for string manipulation practice
- #6812: Implement `SparkPartitionID` function - Useful for understanding Spark integration
- #6815: Add `MapZipWith` expression - Advanced but well-scoped for learning map operations

These issues are ideal entry points because they involve implementing specific functions in the ClickHouse backend, allowing new contributors to understand the codebase structure while making meaningful contributions.
Basic C++ knowledge and familiarity with database functions are helpful but not required - the community provides excellent guidance and code reviews.

GitHub link: https://github.com/apache/incubator-gluten/discussions/10906
