GaneshPatil7517 opened a new pull request, #19808: URL: https://github.com/apache/datafusion/pull/19808
## Which issue does this PR close? Closes #19789 ## Rationale for this change When executing a partitioned hash join with a wide build table (many columns), the current `PartitionMode::Partitioned` approach adds a `RepartitionExec` that copies **all columns** of the build table via `take_arrays`. This is expensive when only the join key columns are needed for partitioning. ## What changes are included in this PR? This PR introduces a new `PartitionMode::LazyPartitioned` that avoids the full build-side `RepartitionExec`. Instead: - **Build side**: Requests `UnspecifiedDistribution` (no repartitioning). All partitions are merged via `CoalescePartitionsExec`. - **Probe side**: Still requests `HashPartitioned` distribution (repartitioned as before). - **Lazy filtering**: During hash table construction, rows are filtered using `hash(join_keys) % partition_count == current_partition`, ensuring each partition only builds its relevant subset. ### Key changes: | File | Change | |------|--------| | `joins/mod.rs` | Added `LazyPartitioned` variant to `PartitionMode` enum | | `joins/hash_join/exec.rs` | Added `PartitionFilter` struct and `filter_batch_by_partition()` function; updated `required_input_distribution()` and `execute()` | | `joins/hash_join/stream.rs` | Updated pattern matches for new mode | | `joins/hash_join/shared_bounds.rs` | Updated `SharedBuildAccumulator` handling | | `physical-optimizer/enforce_distribution.rs` | Added `LazyPartitioned` to key reordering logic | | `physical-optimizer/join_selection.rs` | Added swap handling for new mode | | `proto/datafusion.proto` | Added `LAZY_PARTITIONED = 3` | | `proto/src/physical_plan/mod.rs` | Added serialization/deserialization | ## Are these changes tested? Yes. All existing tests pass: - ✅ 765 hash join tests - ✅ 16 physical optimizer tests - ✅ 2 proto roundtrip tests (including `roundtrip_hash_join`) ## Are there any user-facing changes? No breaking changes. This adds a new `PartitionMode::LazyPartitioned` option that can be explicitly selected for hash joins where the build table is wide but the join key is narrow. The existing `Partitioned`, `CollectLeft`, and `Auto` modes remain unchanged. ## Performance Impact For wide build tables, `LazyPartitioned` avoids copying non-join-key columns during repartitioning, reducing memory allocations and CPU overhead. The trade-off is that each partition now scans all build rows (but only retains those matching its partition). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
