andygrove opened a new pull request, #20463: URL: https://github.com/apache/datafusion/pull/20463
## Summary - When the join indices form a contiguous ascending range (e.g. `[3,4,5,6]`), replace the O(n) Arrow `take` kernel with O(1) `RecordBatch::slice` (zero-copy pointer arithmetic) - Applies to both the streamed (left) and buffered (right) sides of the sort merge join - Adds a criterion micro-benchmark for SMJ that measures join kernel performance in isolation ## Rationale In SMJ, the streamed side cursor advances sequentially, so its indices are almost always contiguous. The buffered side is scanned sequentially within each key group, so its indices are also contiguous for 1:1 and 1:few joins. The `take` kernel allocates new arrays and copies data even when a simple slice would suffice. ## Benchmark Results Criterion micro-benchmark (100K rows, pre-sorted, no sort/scan overhead): | Benchmark | Baseline | Optimized | Improvement | |-----------|----------|-----------|-------------| | inner_1to1 (unique keys) | 5.11 ms | 3.88 ms | **-24%** | | inner_1to10 (10K keys) | 17.64 ms | 16.29 ms | **-8%** | | left_1to1_unmatched (5% unmatched) | 4.80 ms | 3.87 ms | **-19%** | | left_semi_1to10 (10K keys) | 3.65 ms | 3.11 ms | **-15%** | | left_anti_partial (partial match) | 3.58 ms | 3.43 ms | **-4%** | All improvements are statistically significant (p < 0.05). TPC-H SF1 with SMJ forced (`prefer_hash_join=false`) shows no regressions across all 22 queries, with modest end-to-end improvements on join-heavy queries (Q3 -7%, Q19 -5%, Q21 -2%). ## Implementation - `is_contiguous_range()`: checks if a `UInt64Array` is a contiguous ascending range. Uses quick endpoint rejection then verifies every element sequentially. - `freeze_streamed()`: uses `slice` instead of `take` for streamed (left) columns when indices are contiguous. - `fetch_right_columns_from_batch_by_idxs()`: uses `slice` instead of `take` for buffered (right) columns when indices are contiguous. When indices are not contiguous (e.g. repeated indices in many-to-many joins), falls back to the existing `take` path. 🤖 Generated with [Claude Code](https://claude.com/claude-code) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
