GitHub user GlutenPerfBot created a discussion: March 06, 2026: Weekly Status Update in Gluten
*This weekly update is generated by LLMs. You're welcome to join our [GitHub](https://github.com/apache/incubator-gluten/discussions) for in-depth discussions.*

## Overall Activity Summary

The past 7 days have seen intense activity across the Gluten project with 60+ pull requests and 20+ active issues. The community is actively preparing for the upcoming 1.6.0 release while simultaneously advancing major features like ANSI mode support, Parquet type widening, and dynamic filtering optimizations. The Velox backend continues to dominate development focus with significant performance improvements and bug fixes.

## Key Ongoing Projects

**Dynamic Filter Pushdown & Performance Optimizations**
- @acvictor is leading major dynamic filtering improvements with #11657 implementing dynamic filter pushdown to ValueStream and #11711 translating might_contain as subfield filters for bloom filter pushdown
- @JkSelf's #8931 optimizes broadcast hash joins by building hash tables once per executor, showing 1.29x performance improvement on TPC-DS Q23a

**ANSI Mode Support Expansion**
- @malinjawi completed ANSI-compliant string to boolean casting (#11437) and is actively working on expanding ANSI support for other type conversions
- @n0r0shi added ANSI mode decimal arithmetic with overflow checking (#11705)

**Parquet Type Widening & Schema Evolution**
- @baibaichen is enabling GlutenParquetTypeWideningSuite for Spark 4.0/4.1 (#11684, #11689) to support SPARK-40876, fixing type conversion issues and enabling 45+ previously failing tests

**Release Preparation**
- @zhztheplayer coordinated the 1.6.0 release process with multiple PRs (#11700, #11701, #11702, #11696) including version bumps and release script updates

## Priority Items

**Critical Bug Fixes Needed:**
- #11692: Dynamic partition pruning regression in Hive scans causing test failures - @acvictor has draft PR #11710
- #11678: CrossRelNode expression validation missing in native validation - @wecharyu has open PR #11679
- #11630: Iceberg test failures requiring immediate attention - multiple PRs attempted

**Performance Critical:**
- #11657: Dynamic filter pushdown to ValueStream (589 additions) - needs review
- #8931: BHJ optimization (2243 additions, 263 comments) - long-running optimization effort

## Notable Discussions

**#11585: Useful Velox PRs Tracking** - @FelixYBW maintains a comprehensive tracker of 100+ Velox PRs submitted by the Gluten community that haven't been merged upstream, including critical fixes for ANSI mode, Parquet reading, and performance optimizations.

**#11713: Apache Gluten Graduation Tasks** - @weiting-chen coordinates Gluten's transition from Apache Incubator to Top Level Project, involving repository renaming, documentation updates, and process changes.

**#8429: Gluten Slack Channel** - @zhouyuan announced the new ASF workspace Slack channel for real-time community discussions.

## Emerging Trends

1. **ANSI Mode as Default**: With Spark 4.0 enabling ANSI by default, the community is rapidly implementing ANSI-compliant functions and type conversions
2. **Dynamic Filtering Revolution**: Multiple PRs focus on pushing filters closer to storage for significant performance gains
3. **Release Quality Focus**: Extensive test suite fixes and infrastructure improvements ahead of 1.6.0 release
4. **Cross-Backend Compatibility**: Increased attention to ensuring features work across Velox, ClickHouse, and other backends

## Good First Issues

**#11699: S3 IMDS Configuration** - Add support for Velox's new S3 IMDS configuration options. Good for contributors familiar with cloud storage configurations.

**#11703: Iceberg Configuration Mapping** - Map Iceberg writer configurations to Velox equivalents. Requires understanding of both Iceberg and Velox configuration systems.

**#11513: Fix input_file_name() for Iceberg** - Resolve the issue where input_file_name() returns empty strings on Iceberg tables. Good introduction to Iceberg integration.

**#10134: ANSI Mode Support** - Contribute to the comprehensive ANSI mode implementation. Multiple sub-tasks available for different type casting functions, suitable for contributors wanting to learn Spark's type system.

**#11622: TIMESTAMP_NTZ Type Support** - Implement support for Spark's TIMESTAMP_NTZ type in Velox backend. Good for learning type system integration between Spark and native engines.

GitHub link: https://github.com/apache/incubator-gluten/discussions/11714
