GitHub user GlutenPerfBot created a discussion: October 31, 2025: Weekly Status Update in Gluten

*This weekly update is generated by LLMs. You're welcome to join our 
[GitHub discussions](https://github.com/apache/incubator-gluten/discussions) for 
in-depth conversation.*

## Overall Activity Summary
The past 7 days saw 32 merged PRs and 20 open PRs, with heavy focus on Velox 
backend stability, GPU/cuDF acceleration, and daily upstream Velox syncs. 
Memory-management fixes and shuffle optimizations dominated the bug-fix queue, 
while new backend proposals (Bolt, Omni) sparked community interest.

## Key Ongoing Projects
- **GPU shuffle & cuDF acceleration** – @jinchengchenghh is leading #10934 and 
#10933 to move locks out of the iterator constructor, enabling 1 GB GPU batches 
and concurrent CPU/GPU pipeline preparation.
- **Daily Velox upstreaming** – @GlutenPerfBot continues the mechanical daily 
version bumps (#10987, #10985, #10978, #10974, #10962, #10949, #10947, #10946), 
keeping Gluten in lock-step with facebookincubator/velox.
- **Apache maturity checklist** – @zhztheplayer closed #8018 (release-process 
docs) and #10377, pushing the podling toward graduation.
- **BHJ hash-table broadcast** – @JkSelf’s 1,600-line PR #8931 (open, 182 
comments) promises a 1.29× TPC-DS Q23a speed-up and OOM relief for Q24a/b.
- **ClickHouse backend refresh** – @lgbo-ustc posted #10728, the monthly 
ClickHouse version update.
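The lock-relocation idea behind the GPU shuffle work above can be sketched roughly as follows. This is a minimal illustration, not Gluten code: all class and method names are hypothetical. The point is that taking a shared device lock inside an iterator's constructor serializes pipeline setup, while deferring the lock to first use lets CPU- and GPU-side iterators be prepared concurrently:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: acquiring the device lock lazily in next()
// instead of eagerly in the constructor, so iterator construction
// (pipeline preparation) never blocks on the shared lock.
public class LazyLockBatchIterator implements AutoCloseable {
    private static final ReentrantLock deviceLock = new ReentrantLock();
    private boolean lockHeld = false;
    private int produced = 0;
    private final int totalBatches;

    // Cheap constructor: no locking here, so many iterators can be
    // built in parallel before any of them touches the device.
    public LazyLockBatchIterator(int totalBatches) {
        this.totalBatches = totalBatches;
    }

    public boolean hasNext() {
        return produced < totalBatches;
    }

    // The lock is taken only when a batch is actually requested.
    public int next() {
        if (!lockHeld) {
            deviceLock.lock();
            lockHeld = true;
        }
        return produced++; // stand-in for real device-side batch work
    }

    @Override
    public void close() {
        if (lockHeld) {
            deviceLock.unlock();
            lockHeld = false;
        }
    }
}
```

Under this pattern, Spark can instantiate the CPU and GPU pipelines side by side and only pay for lock contention once batches start flowing.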

## Priority Items
- **Memory regression** – #10937 (open) reports spill can’t be triggered when 
dynamic off-heap sizing is on; @wForget already has a fix in review (#10936).
- **ORC schema mismatch** – long-standing #5638 (open) breaks reads when ORC 
file lacks column names; @ccat3z’s PR #8862 is stalled awaiting upstream Velox 
reviews.
- **Uniffle shuffle performance** – #10920 (open) flags a 1.5.0 regression vs 
1.2.0; @wForget’s buffer-size knob #10922 was merged but more tuning is 
expected.
- **GCC-13 readiness** – #10926 (open) reminds us Velox will soon require 
GCC-13; CI still on CentOS-7 + GCC-11.

## Notable Discussions
- #10929: @WangGuangxin proposes “Bolt”, a ByteDance Velox fork with JIT and 
OOM hardening, asking for guidance on upstreaming it as a new Gluten backend.
- #10188: @wjunLu presents “Omni”, an ARM-optimized backend showing a 70% 
TPC-DS speed-up; the team offers ARM CI resources if it is merged.

## Emerging Trends
- **GPU-first features** – cuDF validation, the GPU shuffle reader, and cuDF 
library pre-installs are landing almost daily, signaling a shift away from 
CPU-only Velox.
- **Memory-management churn** – three separate dynamic off-heap fixes in one 
week suggest the feature is newly stressed in production.
- **Multi-backend ecosystem** – with Bolt and Omni proposals, Gluten is 
evolving into a thin meta-layer over pluggable native engines.
- **Graduation push** – documentation PRs for release process, security pages, 
and PMC lists indicate serious TLP submission prep for Q4 2025.

## Good First Issues
- #6814: Add ClickHouse expression MakeYMInterval – pure CH backend, no 
Velox/C++ needed.
- #4730: Implement date_from_unix_date for ClickHouse – similar pattern to 
existing date functions.
- #6807: Add split_part string function for ClickHouse – well-scoped, CH-only.
- #6812: Implement SparkPartitionID for ClickHouse – single-row function, good 
intro to CH function registry.
- #6815: Add MapZipWith higher-order function for ClickHouse – slightly larger, 
but excellent for learning CH’s lambda infrastructure.

All issues above are labeled “good first issue”, touch only the ClickHouse 
backend, and have clear function signatures, making them ideal entry points for 
new contributors comfortable with Java/Scala or C++ in a ClickHouse context.

GitHub link: https://github.com/apache/incubator-gluten/discussions/10995

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
