[D] March 06, 2026: Weekly Status Update in Gluten [incubator-gluten]

via GitHub Fri, 06 Mar 2026 12:28:14 -0800


GitHub user GlutenPerfBot created a discussion: March 06, 2026: Weekly Status 
Update in Gluten


*This weekly update is generated by LLMs. You're welcome to join our 
[GitHub](https://github.com/apache/incubator-gluten/discussions) for in-depth 
discussions.*

## Overall Activity Summary
Over the past 7 days the Gluten project saw 60+ pull requests and 20+ active 
issues. The community is preparing for the upcoming 1.6.0 release while 
advancing major features such as ANSI mode support, Parquet type widening, and 
dynamic filtering optimizations. The Velox backend remains the main development 
focus, with performance improvements and bug fixes.

## Key Ongoing Projects

**Dynamic Filter Pushdown & Performance Optimizations**
- @acvictor is leading the dynamic filtering work: #11657 implements dynamic 
filter pushdown to ValueStream, and #11711 translates might_contain into 
subfield filters for bloom filter pushdown
- @JkSelf's #8931 optimizes broadcast hash joins by building hash tables once 
per executor, showing a 1.29x speedup on TPC-DS Q23a
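
To make the bloom filter pushdown idea concrete, here is a minimal sketch in plain Python. It is not Gluten or Velox code: the `BloomFilter` class, `scan_with_pushdown`, and the sample tables are hypothetical names used only to show why a might_contain check can be applied at scan time. The filter built from the join's build side never rejects a key that is really present, so rows it rejects can be dropped before the join.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def put(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key) -> bool:
        # True if every bit for this key is set; may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def scan_with_pushdown(rows, key_column, bloom):
    """Drop rows at scan time whose join key cannot match the build side."""
    return [row for row in rows if bloom.might_contain(row[key_column])]


# Build side of the join: a small dimension table (hypothetical data).
build_keys = [10, 20, 30]
bloom = BloomFilter()
for k in build_keys:
    bloom.put(k)

# Probe side: a larger fact table scanned with the filter applied.
fact_rows = [{"id": i, "amount": i * 1.5} for i in range(100)]
survivors = scan_with_pushdown(fact_rows, "id", bloom)
```

Because false positives are possible, the join still has to run on the surviving rows; the filter only reduces how many rows reach it.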

**ANSI Mode Support Expansion**
- @malinjawi completed ANSI-compliant string-to-boolean casting (#11437) and is 
extending ANSI support to other type conversions
- @n0r0shi added ANSI mode decimal arithmetic with overflow checking (#11705)
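
As a rough illustration of what ANSI compliance means for these two items, the sketch below models the difference in plain Python. It is not the code from #11437 or #11705. The accepted string literals reflect my understanding of Spark's cast rules, and `AnsiCastError`, `cast_string_to_boolean`, and `add_decimal` are hypothetical names. The key point is that invalid input or overflow yields NULL with ANSI off and raises an error with ANSI on.

```python
from decimal import Decimal

# Literals assumed to be accepted by Spark's string-to-boolean cast.
TRUE_STRINGS = {"t", "true", "y", "yes", "1"}
FALSE_STRINGS = {"f", "false", "n", "no", "0"}


class AnsiCastError(ValueError):
    """Hypothetical stand-in for the error raised in ANSI mode."""


def cast_string_to_boolean(value, ansi: bool):
    """Cast a string to a boolean; None models SQL NULL."""
    if value is None:
        return None
    normalized = value.strip().lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    if ansi:
        raise AnsiCastError(f"cannot cast {value!r} to BOOLEAN")
    return None  # non-ANSI mode: invalid input becomes NULL


def add_decimal(a: Decimal, b: Decimal, precision: int, scale: int, ansi: bool):
    """Add two decimals and check the sum against DECIMAL(precision, scale).

    Assumes both inputs already use the target scale.
    """
    result = a + b
    limit = Decimal(10) ** (precision - scale)  # first value that no longer fits
    if abs(result) >= limit:
        if ansi:
            raise ArithmeticError(
                f"{result} does not fit DECIMAL({precision},{scale})"
            )
        return None  # non-ANSI mode: overflow becomes NULL
    return result
```

For example, `cast_string_to_boolean("maybe", ansi=False)` returns `None`, while the same call with `ansi=True` raises `AnsiCastError`.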

**Parquet Type Widening & Schema Evolution**
- @baibaichen is enabling GlutenParquetTypeWideningSuite for Spark 4.0/4.1 
(#11684, #11689) to support SPARK-40876, fixing type conversion issues and 
enabling 45+ previously failing tests
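
For readers new to type widening, the following sketch shows the general idea: a column written with a narrower type is read back as a wider one, and narrowing is rejected. The widening table here is an illustrative assumption, not the rule set from SPARK-40876 or the Gluten PRs, and `resolve_read_type` and `plan_column_reads` are hypothetical helpers.

```python
# Hypothetical widening table: type stored in the file -> types it may be read as.
ALLOWED_WIDENING = {
    "int32": {"int32", "int64"},
    "int64": {"int64"},
    "float": {"float", "double"},
    "double": {"double"},
}


def resolve_read_type(file_type: str, requested_type: str) -> str:
    """Return the type to read as, or raise if the change is not a widening."""
    allowed = ALLOWED_WIDENING.get(file_type, {file_type})
    if requested_type not in allowed:
        raise TypeError(f"cannot read {file_type} column as {requested_type}")
    return requested_type


def plan_column_reads(file_schema: dict, table_schema: dict) -> dict:
    """Map each column present in both schemas to (file_type, read_type)."""
    plan = {}
    for name, requested in table_schema.items():
        if name in file_schema:
            file_type = file_schema[name]
            plan[name] = (file_type, resolve_read_type(file_type, requested))
    return plan


# An older file written with narrow types, read through a widened table schema.
old_file_schema = {"id": "int32", "price": "float"}
table_schema = {"id": "int64", "price": "double"}
read_plan = plan_column_reads(old_file_schema, table_schema)
```

Reading an `int64` file column as `int32` would raise `TypeError` in this sketch, since that direction can lose data.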

**Release Preparation**
- @zhztheplayer coordinated the 1.6.0 release process with multiple PRs 
(#11700, #11701, #11702, #11696) including version bumps and release script 
updates

## Priority Items

**Critical Bug Fixes Needed:**
- #11692: Dynamic partition pruning regression in Hive scans causing test 
failures - @acvictor has draft PR #11710
- #11678: CrossRelNode expression validation missing in native validation - 
@wecharyu has open PR #11679
- #11630: Iceberg test failures requiring immediate attention - multiple PRs 
attempted

**Performance Critical:**
- #11657: Dynamic filter pushdown to ValueStream (589 additions) - needs review
- #8931: BHJ optimization (2243 additions, 263 comments) - long-running 
optimization effort

## Notable Discussions

**#11585: Useful Velox PRs Tracking** - @FelixYBW maintains a comprehensive 
tracker of 100+ Velox PRs submitted by the Gluten community that haven't been 
merged upstream, including critical fixes for ANSI mode, Parquet reading, and 
performance optimizations.

**#11713: Apache Gluten Graduation Tasks** - @weiting-chen coordinates Gluten's 
transition from Apache Incubator to Top Level Project, involving repository 
renaming, documentation updates, and process changes.

**#8429: Gluten Slack Channel** - @zhouyuan announced the new ASF workspace 
Slack channel for real-time community discussions.

## Emerging Trends

1. **ANSI Mode as Default**: With Spark 4.0 enabling ANSI by default, the 
community is rapidly implementing ANSI-compliant functions and type conversions
2. **Dynamic Filtering Revolution**: Multiple PRs focus on pushing filters 
closer to storage for significant performance gains
3. **Release Quality Focus**: Extensive test suite fixes and infrastructure 
improvements ahead of the 1.6.0 release
4. **Cross-Backend Compatibility**: Increased attention to ensuring features 
work across Velox, ClickHouse, and other backends

## Good First Issues

**#11699: S3 IMDS Configuration** - Add support for Velox's new S3 IMDS 
configuration options. Good for contributors familiar with cloud storage 
configurations.

**#11703: Iceberg Configuration Mapping** - Map Iceberg writer configurations 
to Velox equivalents. Requires understanding of both Iceberg and Velox 
configuration systems.

**#11513: Fix input_file_name() for Iceberg** - Resolve the issue where 
input_file_name() returns empty strings on Iceberg tables. Good introduction to 
Iceberg integration.

**#10134: ANSI Mode Support** - Contribute to the comprehensive ANSI mode 
implementation. Multiple sub-tasks available for different type casting 
functions, suitable for contributors wanting to learn Spark's type system.

**#11622: TIMESTAMP_NTZ Type Support** - Implement support for Spark's 
TIMESTAMP_NTZ type in the Velox backend. Good for learning how type systems are 
integrated between Spark and native engines.

GitHub link: https://github.com/apache/incubator-gluten/discussions/11714

