andygrove opened a new pull request, #3839: URL: https://github.com/apache/datafusion-comet/pull/3839
## Which issue does this PR close? N/A - experimental performance optimization ## Rationale for this change The current multi-partition shuffle partitioner buffers all input batches in memory until `shuffle_write()` is called at the end. This requires large offheap memory allocations, especially at higher scale factors. This PR modifies the multi-partition partitioner to eagerly flush a partition's accumulated row indices into compressed IPC output (via the existing spill infrastructure) as soon as that partition reaches `batch_size` rows. This bounds the index memory per partition while maintaining the same output format and performance as the current approach. ## What changes are included in this PR? - In `buffer_partitioned_batch_may_spill`, after accumulating row indices, check if any partition has reached `batch_size` rows and flush it immediately via `flush_partition()` - New `flush_partition()` method that creates a `PartitionedBatchIterator` for a single partition, spills the produced batches to disk, and clears the indices - Made `PartitionedBatchIterator::new` visible to sibling modules ## How are these changes tested? Benchmarked on TPC-H SF1000 (Q1): - Baseline (main): 65.7s - This PR: 63.9s (no regression) - For comparison, the immediate-mode partitioner (PR #3838): 111.4s Full TPC-H SF1000 run in progress. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
