Re: [PR] [GLUTEN-12251][VL] Add config switch to merge broadcast batches for BHJ performance [gluten]

via GitHub Tue, 09 Jun 2026 07:31:44 -0700


Xtpacz commented on PR #12259:
URL: https://github.com/apache/gluten/pull/12259#issuecomment-4660820747


   > Thanks @Xtpacz . The current implementation merges the build side into a
   > single `ColumnBatch`, which does not seem to be a general solution.
   >
   > > Verified on an internal Spark cluster running TPC-DS 5TB. On q64, with
   > > all other configs equal, the aggregated HashBuild total time dropped
   > > from 2.04h to 22.1min when mergeBatches=true (~5.5x reduction).
   >
   > That said, based on the benchmark results you shared, the performance
   > improvement is quite significant. Could you share more details about the
   > previous bottleneck? Was the main issue caused by generating a large
   > number of small `ColumnBatch` instances, or was there another factor
   > contributing to the overhead?
   
   @wForget Thanks for the review!
   
   **Root cause:** Per-batch serialize/deserialize overhead accumulates across 
the full pipeline. In our q64 case (19.6B build rows, maxBatchSize=4096), this 
produces about 480K independent buffers. Each one goes through PrestoSerializer 
creation + ArrowBuf allocation on serialize, and 
PrestoVectorSerde.deserialize() + small-vector HashBuild on the executor side. 
The executor-side cost dominates: small vectors (4096 rows) have poor 
vectorization efficiency and cache locality for hash table building.
   
   **On generality:** This is not a new optimization: Gluten 1.2 used this 
exact merged serialization path. PR #9521 switched to per-batch serialization 
to reduce peak native memory, which inadvertently caused this regression. Our 
patch simply restores the 1.2 behavior behind a config switch (default=false), 
keeping #9521's OOM-safe path as the default.
   
   If the team prefers a middle ground, we can do a hybrid — merge in groups of 
N batches to amortize overhead without holding the full partition in native 
memory. Happy to implement if preferred.
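
   To make the hybrid idea concrete, here is a minimal sketch of the grouping 
logic only. It is illustrative Python, not Gluten code: `merge_in_groups` and 
`group_size` are hypothetical names, and batches are modeled as plain lists of 
rows, whereas a real implementation would concatenate native column vectors 
before serializing.

```python
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def merge_in_groups(
    batches: Iterable[Sequence[T]], group_size: int
) -> Iterator[List[T]]:
    """Yield merged batches, each concatenating up to `group_size` inputs.

    Hypothetical helper: only one group is held in memory at a time, so the
    peak footprint is bounded by `group_size` input batches rather than by
    the whole build side.
    """
    if group_size < 1:
        raise ValueError("group_size must be >= 1")

    pending: List[T] = []  # rows accumulated for the current group
    count = 0              # number of input batches in the current group

    for batch in batches:
        pending.extend(batch)
        count += 1
        if count == group_size:
            yield pending
            pending = []
            count = 0

    # Flush the final, possibly smaller, group.
    if count:
        yield pending
```

   With `group_size=1` this degenerates to the current per-batch path, and a 
`group_size` at least as large as the number of batches gives the fully merged 
path, so a single knob would cover both existing behaviors.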


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

