Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3728238191 Thank you @UBarney ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan merged PR #19411: URL: https://github.com/apache/datafusion/pull/19411 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3728200782 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark tpch_sf1.json ββββ³ββββ³β³ββββ β Queryβ HEAD β perfect_hj βChange β β‘ββββββββββββ© β QQuery 1 β 197.41 ms β 193.73 ms β no change β β QQuery 2 β 92.23 ms β 78.21 ms β +1.18x faster β β QQuery 3 β 118.66 ms β 119.31 ms β no change β β QQuery 4 β 77.39 ms β 75.23 ms β no change β β QQuery 5 β 172.35 ms β 171.62 ms β no change β β QQuery 6 β 63.31 ms β 63.36 ms β no change β β QQuery 7 β 211.99 ms β 201.61 ms β no change β β QQuery 8 β 162.42 ms β 162.11 ms β no change β β QQuery 9 β 224.05 ms β 217.71 ms β no change β β QQuery 10β 194.29 ms β 182.13 ms β +1.07x faster β β QQuery 11β 73.85 ms β 53.93 ms β +1.37x faster β β QQuery 12β 116.60 ms β 115.72 ms β no change β β QQuery 13β 219.95 ms β 210.38 ms β no change β β QQuery 14β 92.07 ms β 86.02 ms β +1.07x faster β β QQuery 15β 118.29 ms β 119.84 ms β no change β β QQuery 16β 55.78 ms β 57.05 ms β no change β β QQuery 17β 272.93 ms β 271.63 ms β no change β β QQuery 18β 324.23 ms β 317.09 ms β no change β β QQuery 19β 131.86 ms β 128.52 ms β no change β β QQuery 20β 125.68 ms β 122.17 ms β no change β β QQuery 21β 257.67 ms β 248.54 ms β no change β β QQuery 22β 43.63 ms β 36.74 ms β +1.19x faster β ββββ΄ββββ΄β΄ββββ βββββ³ββββ β Benchmark Summary β β β‘ββββββββ© β Total Time (HEAD) β 3346.65ms β β Total Time (perfect_hj) β 3232.65ms β β Average Time (HEAD) β 152.12ms β β Average Time (perfect_hj) β 146.94ms β β Queries Fasterβ 5 β β Queries Slowerβ 0 β β Queries with No Changeβ17 β β Queries with Failure β 0 β βββββ΄ββββ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3728198176 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (c772070b6690093fc094644b618e52bd414c70a9) to 1037f0aa205e0d04ade7aa9973b4b4c485b696ab [diff](https://github.com/apache/datafusion/compare/1037f0aa205e0d04ade7aa9973b4b4c485b696ab..c772070b6690093fc094644b618e52bd414c70a9) using: tpch Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3728197862 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark clickbench_extended.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 2454.70 ms β 2388.05 ms β no change β β QQuery 1 β 972.85 ms β 921.28 ms β +1.06x faster β β QQuery 2 β 1927.43 ms β 1840.69 ms β no change β β QQuery 3 β 1207.26 ms β 1175.84 ms β no change β β QQuery 4 β 2293.93 ms β 2285.89 ms β no change β β QQuery 5 β 28351.12 ms β 28179.93 ms β no change β β QQuery 6 β 3856.37 ms β 3907.41 ms β no change β β QQuery 7 β 3715.67 ms β 3538.47 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³β β Benchmark Summary ββ β‘βββββ© β Total Time (HEAD) β 44779.32ms β β Total Time (perfect_hj) β 44237.56ms β β Average Time (HEAD) β 5597.41ms β β Average Time (perfect_hj) β 5529.70ms β β Queries Fasterβ 1 β β Queries Slowerβ 0 β β Queries with No Changeβ 7 β β Queries with Failure β 0 β βββββ΄β Benchmark clickbench_partitioned.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 1.43 ms β 1.42 ms β no change β β QQuery 1 β49.63 ms β47.90 ms β no change β β QQuery 2 β 137.51 ms β 133.46 ms β no change β β QQuery 3 β 159.31 ms β 150.20 ms β +1.06x faster β β QQuery 4 β 1114.41 ms β 1126.40 ms β no change β β QQuery 5 β 1357.08 ms β 1386.74 ms β no change β β QQuery 6 β 1.43 ms β 1.45 ms β no change β β QQuery 7 β53.61 ms β54.27 ms β no change β β QQuery 8 β 1474.17 ms β 1468.98 ms β no change β β QQuery 9 β 1818.60 ms β 1789.35 ms β no change β β QQuery 10β 357.03 ms β 353.62 ms β no change β β QQuery 11β 400.27 ms β 394.83 ms β no change β β QQuery 12β 1270.13 ms β 1273.58 ms β no change β β QQuery 13β 1969.54 ms β 1953.45 ms β no change β β QQuery 14β 1259.80 ms β 1245.24 ms β no change β β QQuery 15β 1272.31 ms β 1221.68 ms β no change β β QQuery 16β 2573.32 ms β 2591.69 ms β no change β β QQuery 17β 2535.10 ms β 2579.22 ms β no change β β QQuery 18β 5445.15 ms β 5005.38 ms β +1.09x faster β β QQuery 19β 123.16 ms β 121.03 ms β no change β β QQuery 20β 1933.61 ms β 1888.59 ms β no change β β QQuery 21β 2115.66 ms β 2211.40 ms β no change β β QQuery 22β 3720.38 ms β 3756.81 ms β no change β β QQuery 23β 22330.25 ms β 12098.87 ms β +1.85x faster β β QQuery 24β 213.70 ms β 209.57 ms β no change β β QQuery 25β 456.72 ms β 463.82 ms β no change β β QQuery 26β 217.99 ms β 206.74 ms β +1.05x faster β β QQuery 27β 2849.58 ms β 2672.49 ms β +1.07x faster β β QQuery 28β 22643.07 ms β 23254.82 ms β no change β β QQuery 29β 1003.18 ms β 965.96 ms β no change β β QQuery 30β 1337.18 ms β 1317.95 ms β no change β β QQuery 31β 1362.68 ms β 1337.85 ms β no change β β QQuery 32β 5349.86 ms β 4728.39 ms β +1.13x faster β β QQuery 33β 5896.39 ms β 5499.63 ms β +1.07x faster β β QQuery 34β 5921.03 ms β 5651.70 ms β no change β β QQuery 35β 1954.84 ms β 1860.00 ms β no change β β QQuery 36β64.52 ms β65.88 ms β no change β β QQuery 37β44.40 ms β43.29 ms β no change β β QQuery 38β65.27 ms β65.04 ms β no change β β QQuery 39β 101.86 ms β99.54 ms β no change β β QQuery 40β24.38 ms β24.85 ms β no change β β QQuery 41β22.19 ms β21.72 ms β no change β β QQuery 42β19.30 ms β19.20 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³ββ β Benchmark Summary β β β‘ββββββ© β Total Time (HEAD) β 103021.02ms β β Total Time (perfect_hj) β 91364.01ms β β Average Time (HEAD) β 2395.84ms β β
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3728022461 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (c772070b6690093fc094644b618e52bd414c70a9) to 1037f0aa205e0d04ade7aa9973b4b4c485b696ab [diff](https://github.com/apache/datafusion/compare/1037f0aa205e0d04ade7aa9973b4b4c485b696ab..c772070b6690093fc094644b618e52bd414c70a9) using: tpch_mem clickbench_partitioned clickbench_extended Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3727828861 run benchmark tpch -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3727828403 run benchmarks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2672460238
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
Agreed. Setting it to ~15% works wellβit not only improves performance but
also reduces the memory footprint (space usage). I've updated the threshold.
```
Memory Comparison Matrix (num_rows = 100)
| Density | Dup Rate | ArrayMap (MB) | JoinHashMap (MB) | Ratio (AM/JHM) |
|-|--|---|--||
...
| 15% | 0% | 25.43 |37.81 | 0.67x |
| 15% | 25% | 29.25 |37.81 | 0.77x |
| 15% | 50% | 29.25 |37.81 | 0.77x |
| 15% | 75% | 29.25 |37.81 | 0.77x |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2672453954
##
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##
@@ -168,6 +249,34 @@ impl JoinLeftData {
/// ` != `) are known as "filter expressions" and are evaluated
/// after the equijoin predicates.
///
+/// # ArrayMap Optimization
+///
+/// For joins with a single integer-based join key, `HashJoinExec` may use an
[`ArrayMap`]
+/// (also known as a "perfect hash join") instead of a general-purpose hash
map.
+/// This optimization is used when:
+/// 1. There is exactly one join key.
+/// 2. The join key can be any integer type convertible to u64 (excluding i128
and u128).
Review Comment:
I've applied the suggested change
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2672451547
##
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##
@@ -1601,10 +1778,37 @@ mod tests {
#[template]
#[rstest]
-fn batch_sizes(#[values(8192, 10, 5, 2, 1)] batch_size: usize) {}
+fn hash_join_exec_configs(
+#[values(8192, 10, 5, 2, 1)] batch_size: usize,
+#[values(true, false)] use_perfect_hash_join_as_possible: bool,
+) {
+}
-fn prepare_task_ctx(batch_size: usize) -> Arc {
-let session_config =
SessionConfig::default().with_batch_size(batch_size);
+fn prepare_task_ctx(
+batch_size: usize,
+use_perfect_hash_join_as_possible: bool,
+) -> Arc {
+let mut session_config =
SessionConfig::default().with_batch_size(batch_size);
+
+if use_perfect_hash_join_as_possible {
+session_config
+.options_mut()
+.execution
+.perfect_hash_join_small_build_threshold = 819200;
+session_config
+.options_mut()
+.execution
+.perfect_hash_join_min_key_density = 0.0;
+} else {
+session_config
+.options_mut()
+.execution
+.perfect_hash_join_small_build_threshold = 0;
+session_config
+.options_mut()
+.execution
+.perfect_hash_join_min_key_density = 1.0 / 0.0;
Review Comment:
Accepted. Using f64::INFINITY is indeed much clearer.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2672449674
##
datafusion/physical-plan/src/joins/array_map.rs:
##
@@ -0,0 +1,540 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow_schema::DataType;
+use num_traits::AsPrimitive;
+use std::mem::size_of;
+
+use crate::joins::MapOffset;
+use crate::joins::chain::traverse_chain;
+use arrow::array::{Array, ArrayRef, AsArray, BooleanArray};
+use arrow::buffer::BooleanBuffer;
+use arrow::datatypes::ArrowNumericType;
+use datafusion_common::{Result, ScalarValue, internal_err};
+
+/// A macro to downcast only supported integer types (up to 64-bit) and invoke
a generic function.
+///
+/// Usage: `downcast_supported_integer!(data_type => (Method, arg1, arg2,
...))`
+///
+/// The `Method` must be an associated method of [`ArrayMap`] that is generic
over
+/// `` and allow `T::Native: AsPrimitive`.
+macro_rules! downcast_supported_integer {
+($DATA_TYPE:expr => ($METHOD:ident $(, $ARGS:expr)*)) => {
+match $DATA_TYPE {
+arrow::datatypes::DataType::Int8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int64 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt64 =>
ArrayMap::$METHOD::($($ARGS),*),
+_ => {
+return internal_err!(
+"Unsupported type for ArrayMap: {:?}",
+$DATA_TYPE
+);
+}
+}
+};
+}
+
+/// A dense map for single-column integer join keys within a limited range.
+///
+/// Maps join keys to build-side indices using direct array indexing:
+/// `data[val - min_val_in_build_side] -> val_idx_in_build_side + 1`.
+///
+/// NULL values are ignored on both the build side and the probe side.
+///
+/// # Handling Negative Numbers with `wrapping_sub`
+///
+/// This implementation supports signed integer ranges (e.g., `[-5, 5]`)
efficiently by
+/// treating them as `u64` (Two's Complement) and relying on the bitwise
properties of
+/// wrapping arithmetic (`wrapping_sub`).
+///
+/// In Two's Complement representation, `a_signed - b_signed` produces the
same bit pattern
+/// as `a_unsigned.wrapping_sub(b_unsigned)` (modulo 2^N). This allows us to
perform
+/// range calculations and zero-based index mapping uniformly for both signed
and unsigned
+/// types without branching.
+///
+/// ## Examples
+///
+/// Consider an `Int64` range `[-5, 5]`.
+/// * `min_val (-5)` casts to `u64`: `...1011` (`u64::MAX - 4`)
+/// * `max_val (5)` casts to `u64`: `...0101` (`5`)
+///
+/// **1. Range Calculation**
+///
+/// ```text
+/// In modular arithmetic, this is equivalent to:
+/// (5 - (2^64 - 5)) mod 2^64
+/// = (5 - 2^64 + 5) mod 2^64
+/// = (10 - 2^64) mod 2^64
+/// = 10
+///
+/// ```
+/// The resulting `range` (10) correctly represents the size of the interval
`[-5, 5]`.
+///
+/// **2. Index Lookup (in `get_matched_indices`)**
+///
+/// For a probe value of `0` (which is stored as `0u64`):
+/// ```text
+/// In modular arithmetic, this is equivalent to:
+/// (0 - (2^64 - 5)) mod 2^64
+/// = (-2^64 + 5) mod 2^64
+/// = 5
+/// ```
+/// This correctly maps `-5` to index `0`, `0` to index `5`, etc.
+#[derive(Debug)]
+pub struct ArrayMap {
+// data[probSideVal-offset] -> valIdxInBuildSide + 1; 0 for absent
+data: Vec,
+// min val in buildSide
+offset: u64,
+// next[buildSideIdx] -> next matching valIdxInBuildSide + 1; 0 for end of
chain.
+// If next is empty, it means there are no duplicate keys (no conflicts).
+// It uses the same chain-based conflict resolution as [`JoinHashMapType`].
+next: Vec,
+num_of_distinct_key: usize,
+}
+
+impl ArrayMap {
+pub fn is_supported_type(data_type: &Data
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2672448148
##
datafusion/physical-plan/src/joins/array_map.rs:
##
@@ -0,0 +1,540 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow_schema::DataType;
+use num_traits::AsPrimitive;
+use std::mem::size_of;
+
+use crate::joins::MapOffset;
+use crate::joins::chain::traverse_chain;
+use arrow::array::{Array, ArrayRef, AsArray, BooleanArray};
+use arrow::buffer::BooleanBuffer;
+use arrow::datatypes::ArrowNumericType;
+use datafusion_common::{Result, ScalarValue, internal_err};
+
+/// A macro to downcast only supported integer types (up to 64-bit) and invoke
a generic function.
+///
+/// Usage: `downcast_supported_integer!(data_type => (Method, arg1, arg2,
...))`
+///
+/// The `Method` must be an associated method of [`ArrayMap`] that is generic
over
+/// `` and allow `T::Native: AsPrimitive`.
+macro_rules! downcast_supported_integer {
+($DATA_TYPE:expr => ($METHOD:ident $(, $ARGS:expr)*)) => {
+match $DATA_TYPE {
+arrow::datatypes::DataType::Int8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int64 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt64 =>
ArrayMap::$METHOD::($($ARGS),*),
+_ => {
+return internal_err!(
+"Unsupported type for ArrayMap: {:?}",
+$DATA_TYPE
+);
+}
+}
+};
+}
+
+/// A dense map for single-column integer join keys within a limited range.
+///
+/// Maps join keys to build-side indices using direct array indexing:
+/// `data[val - min_val_in_build_side] -> val_idx_in_build_side + 1`.
+///
+/// NULL values are ignored on both the build side and the probe side.
+///
+/// # Handling Negative Numbers with `wrapping_sub`
+///
+/// This implementation supports signed integer ranges (e.g., `[-5, 5]`)
efficiently by
+/// treating them as `u64` (Two's Complement) and relying on the bitwise
properties of
+/// wrapping arithmetic (`wrapping_sub`).
+///
+/// In Two's Complement representation, `a_signed - b_signed` produces the
same bit pattern
+/// as `a_unsigned.wrapping_sub(b_unsigned)` (modulo 2^N). This allows us to
perform
+/// range calculations and zero-based index mapping uniformly for both signed
and unsigned
+/// types without branching.
+///
+/// ## Examples
+///
+/// Consider an `Int64` range `[-5, 5]`.
+/// * `min_val (-5)` casts to `u64`: `...1011` (`u64::MAX - 4`)
+/// * `max_val (5)` casts to `u64`: `...0101` (`5`)
+///
+/// **1. Range Calculation**
+///
+/// ```text
+/// In modular arithmetic, this is equivalent to:
+/// (5 - (2^64 - 5)) mod 2^64
+/// = (5 - 2^64 + 5) mod 2^64
+/// = (10 - 2^64) mod 2^64
+/// = 10
+///
+/// ```
+/// The resulting `range` (10) correctly represents the size of the interval
`[-5, 5]`.
+///
+/// **2. Index Lookup (in `get_matched_indices`)**
+///
+/// For a probe value of `0` (which is stored as `0u64`):
+/// ```text
+/// In modular arithmetic, this is equivalent to:
+/// (0 - (2^64 - 5)) mod 2^64
+/// = (-2^64 + 5) mod 2^64
+/// = 5
+/// ```
+/// This correctly maps `-5` to index `0`, `0` to index `5`, etc.
+#[derive(Debug)]
+pub struct ArrayMap {
+// data[probSideVal-offset] -> valIdxInBuildSide + 1; 0 for absent
+data: Vec,
+// min val in buildSide
+offset: u64,
+// next[buildSideIdx] -> next matching valIdxInBuildSide + 1; 0 for end of
chain.
+// If next is empty, it means there are no duplicate keys (no conflicts).
+// It uses the same chain-based conflict resolution as [`JoinHashMapType`].
+next: Vec,
+num_of_distinct_key: usize,
+}
+
+impl ArrayMap {
+pub fn is_supported_type(data_type: &Data
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2668825491
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
What about ~15%?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2668825491
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
What about ~0.15?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Copilot commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2668300012
##
datafusion/physical-plan/src/joins/hash_join/exec.rs:
##
@@ -168,6 +249,34 @@ impl JoinLeftData {
/// ` != `) are known as "filter expressions" and are evaluated
/// after the equijoin predicates.
///
+/// # ArrayMap Optimization
+///
+/// For joins with a single integer-based join key, `HashJoinExec` may use an
[`ArrayMap`]
+/// (also known as a "perfect hash join") instead of a general-purpose hash
map.
+/// This optimization is used when:
+/// 1. There is exactly one join key.
+/// 2. The join key can be any integer type convertible to u64 (excluding i128
and u128).
Review Comment:
The documentation refers to "excluding i128 and u128" types, but these types
are not actually checked or handled anywhere in the code. The
`is_supported_type` function in `array_map.rs` only supports up to 64-bit
integers (Int8-Int64, UInt8-UInt64). Consider removing the mention of i128/u128
from the documentation or clarifying that they're not supported due to design
limitations, not just current implementation.
```suggestion
/// 2. The join key is an integer type up to 64 bits wide that can be
losslessly converted
///to `u64` (128-bit integer types such as `i128` and `u128` are not
supported).
```
##
datafusion/physical-plan/src/joins/array_map.rs:
##
@@ -0,0 +1,540 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow_schema::DataType;
+use num_traits::AsPrimitive;
+use std::mem::size_of;
+
+use crate::joins::MapOffset;
+use crate::joins::chain::traverse_chain;
+use arrow::array::{Array, ArrayRef, AsArray, BooleanArray};
+use arrow::buffer::BooleanBuffer;
+use arrow::datatypes::ArrowNumericType;
+use datafusion_common::{Result, ScalarValue, internal_err};
+
+/// A macro to downcast only supported integer types (up to 64-bit) and invoke
a generic function.
+///
+/// Usage: `downcast_supported_integer!(data_type => (Method, arg1, arg2,
...))`
+///
+/// The `Method` must be an associated method of [`ArrayMap`] that is generic
over
+/// `` and allow `T::Native: AsPrimitive`.
+macro_rules! downcast_supported_integer {
+($DATA_TYPE:expr => ($METHOD:ident $(, $ARGS:expr)*)) => {
+match $DATA_TYPE {
+arrow::datatypes::DataType::Int8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::Int64 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt8 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt16 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt32 =>
ArrayMap::$METHOD::($($ARGS),*),
+arrow::datatypes::DataType::UInt64 =>
ArrayMap::$METHOD::($($ARGS),*),
+_ => {
+return internal_err!(
+"Unsupported type for ArrayMap: {:?}",
+$DATA_TYPE
+);
+}
+}
+};
+}
+
+/// A dense map for single-column integer join keys within a limited range.
+///
+/// Maps join keys to build-side indices using direct array indexing:
+/// `data[val - min_val_in_build_side] -> val_idx_in_build_side + 1`.
+///
+/// NULL values are ignored on both the build side and the probe side.
+///
+/// # Handling Negative Numbers with `wrapping_sub`
+///
+/// This implementation supports signed integer ranges (e.g., `[-5, 5]`)
efficiently by
+/// treating them as `u64` (Two's Complement) and relying on the bitwise
properties of
+/// wrapping arithmetic (`wrapping_sub`).
+///
+/// In Two's Complement representation, `a_signed - b_signed` produces the
same bit pattern
+/// as `a_unsigned.wrapping_sub(b_unsigned)` (modulo 2^N). This allows us to
perform
+/// range calculations and zero-based index mapping uniformly for both signed
and unsigned
+/// types without branching.
+///
+/// ## Examples
+///
+/// Consider an `Int64` range `[-5, 5]`.
+/// * `min_val (-5)` casts to `u64`: `...1011` (`
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718680693 π€ Hi @Dandandan, thanks for the request (https://github.com/apache/datafusion/pull/19411#issuecomment-3718680622). [`scrape_comments.py`](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/scrape_comments.py) only supports whitelisted benchmarks. - Standard: clickbench_1, clickbench_extended, clickbench_partitioned, clickbench_pushdown, external_aggr, tpcds, tpch, tpch10, tpch_mem, tpch_mem10 - Criterion: aggregate_query_sql, aggregate_vectorized, case_when, character_length, in_list, range_and_generate_series, sort, sql_planner, strpos, substr_index, with_hashes Please choose one or more of these with `run benchmark ` or `run benchmark ...` Unsupported benchmarks: hj. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718680622 run benchmark hj -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718354201 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (e9b35e5ac9e764eeab29aa6ea237e400647466d7) to 1037f0aa205e0d04ade7aa9973b4b4c485b696ab [diff](https://github.com/apache/datafusion/compare/1037f0aa205e0d04ade7aa9973b4b4c485b696ab..e9b35e5ac9e764eeab29aa6ea237e400647466d7) using: tpch Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718372339 ``` ββββ³ββββ³β³ββββ β Queryβ HEAD β perfect_hj βChange β β‘ββββββββββββ© β QQuery 1 β 120.09 ms β 112.47 ms β +1.07x faster β β QQuery 2 β 29.95 ms β 25.31 ms β +1.18x faster β β QQuery 3 β 36.86 ms β 36.60 ms β no change β β QQuery 4 β 28.68 ms β 29.61 ms β no change β β QQuery 5 β 85.84 ms β 86.63 ms β no change β β QQuery 6 β 19.61 ms β 19.47 ms β no change β β QQuery 7 β 243.13 ms β 145.00 ms β +1.68x faster β β QQuery 8 β 37.03 ms β 31.49 ms β +1.18x faster β β QQuery 9 β 117.04 ms β 95.96 ms β +1.22x faster β β QQuery 10β 75.18 ms β 63.06 ms β +1.19x faster β β QQuery 11β 19.93 ms β 12.30 ms β +1.62x faster β β QQuery 12β 49.38 ms β 49.85 ms β no change β β QQuery 13β 46.20 ms β 45.37 ms β no change β β QQuery 14β 13.04 ms β 13.40 ms β no change β β QQuery 15β 23.47 ms β 23.69 ms β no change β β QQuery 16β 24.37 ms β 23.86 ms β no change β β QQuery 17β 148.67 ms β 147.84 ms β no change β β QQuery 18β 284.66 ms β 272.35 ms β no change β β QQuery 19β 35.58 ms β 37.97 ms β 1.07x slower β β QQuery 20β 48.85 ms β 51.08 ms β no change β β QQuery 21β 295.14 ms β 184.30 ms β +1.60x faster β β QQuery 22β 17.43 ms β 17.36 ms β no change β ββββ΄ββββ΄β΄ββββ βββββ³ββββ β Benchmark Summary β β β‘ββββββββ© β Total Time (HEAD) β 1800.12ms β β Total Time (perfect_hj) β 1524.97ms β β Average Time (HEAD) β 81.82ms β β Average Time (perfect_hj) β 69.32ms β β Queries Fasterβ 8 β β Queries Slowerβ 1 β β Queries with No Changeβ13 β β Queries with Failure β 0 β βββββ΄ββββ ``` Very nice! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718356562 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (e9b35e5ac9e764eeab29aa6ea237e400647466d7) to 1037f0aa205e0d04ade7aa9973b4b4c485b696ab [diff](https://github.com/apache/datafusion/compare/1037f0aa205e0d04ade7aa9973b4b4c485b696ab..e9b35e5ac9e764eeab29aa6ea237e400647466d7) using: tpcds Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718356620 Benchmark script failed with exit code 1. Last 10 lines of output: Click to expand ``` BRANCH_NAME: HEAD DATA_DIR: /home/alamb/arrow-datafusion/benchmarks/data RESULTS_DIR: /home/alamb/arrow-datafusion/benchmarks/results/HEAD CARGO_COMMAND: cargo run --release PREFER_HASH_JOIN: true *** Please prepare TPC-DS data first by following instructions: ./bench.sh data tpcds ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718356441 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark tpch_sf1.json ββββ³ββββ³β³ββββ β Queryβ HEAD β perfect_hj βChange β β‘ββββββββββββ© β QQuery 1 β 191.42 ms β 194.17 ms β no change β β QQuery 2 β 92.90 ms β 79.24 ms β +1.17x faster β β QQuery 3 β 122.93 ms β 118.45 ms β no change β β QQuery 4 β 76.58 ms β 76.39 ms β no change β β QQuery 5 β 169.79 ms β 167.60 ms β no change β β QQuery 6 β 65.74 ms β 62.99 ms β no change β β QQuery 7 β 213.34 ms β 204.62 ms β no change β β QQuery 8 β 159.39 ms β 158.60 ms β no change β β QQuery 9 β 224.42 ms β 220.08 ms β no change β β QQuery 10β 184.10 ms β 187.50 ms β no change β β QQuery 11β 73.24 ms β 57.11 ms β +1.28x faster β β QQuery 12β 116.93 ms β 116.71 ms β no change β β QQuery 13β 218.62 ms β 214.30 ms β no change β β QQuery 14β 94.08 ms β 89.32 ms β +1.05x faster β β QQuery 15β 115.42 ms β 117.40 ms β no change β β QQuery 16β 55.32 ms β 56.42 ms β no change β β QQuery 17β 269.67 ms β 269.94 ms β no change β β QQuery 18β 318.44 ms β 316.34 ms β no change β β QQuery 19β 131.59 ms β 133.16 ms β no change β β QQuery 20β 121.70 ms β 124.62 ms β no change β β QQuery 21β 259.46 ms β 252.28 ms β no change β β QQuery 22β 41.13 ms β 36.72 ms β +1.12x faster β ββββ΄ββββ΄β΄ββββ βββββ³ββββ β Benchmark Summary β β β‘ββββββββ© β Total Time (HEAD) β 3316.23ms β β Total Time (perfect_hj) β 3253.95ms β β Average Time (HEAD) β 150.74ms β β Average Time (perfect_hj) β 147.91ms β β Queries Fasterβ 4 β β Queries Slowerβ 0 β β Queries with No Changeβ18 β β Queries with Failure β 0 β βββββ΄ββββ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718353932 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark clickbench_extended.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 2361.73 ms β 2300.54 ms β no change β β QQuery 1 β 934.19 ms β 912.49 ms β no change β β QQuery 2 β 1885.10 ms β 1860.90 ms β no change β β QQuery 3 β 1202.09 ms β 1180.71 ms β no change β β QQuery 4 β 2306.91 ms β 2298.40 ms β no change β β QQuery 5 β 28041.32 ms β 28094.82 ms β no change β β QQuery 6 β 3849.98 ms β 3872.67 ms β no change β β QQuery 7 β 3794.60 ms β 3815.13 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³β β Benchmark Summary ββ β‘βββββ© β Total Time (HEAD) β 44375.93ms β β Total Time (perfect_hj) β 44335.67ms β β Average Time (HEAD) β 5546.99ms β β Average Time (perfect_hj) β 5541.96ms β β Queries Fasterβ 0 β β Queries Slowerβ 0 β β Queries with No Changeβ 8 β β Queries with Failure β 0 β βββββ΄β Benchmark clickbench_partitioned.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 1.44 ms β 1.46 ms β no change β β QQuery 1 β48.94 ms β49.13 ms β no change β β QQuery 2 β 131.83 ms β 132.88 ms β no change β β QQuery 3 β 154.94 ms β 151.89 ms β no change β β QQuery 4 β 1129.27 ms β 1094.93 ms β no change β β QQuery 5 β 1364.06 ms β 1368.14 ms β no change β β QQuery 6 β 1.41 ms β 1.42 ms β no change β β QQuery 7 β52.53 ms β53.49 ms β no change β β QQuery 8 β 1469.70 ms β 1415.32 ms β no change β β QQuery 9 β 1804.95 ms β 1800.38 ms β no change β β QQuery 10β 344.22 ms β 345.91 ms β no change β β QQuery 11β 387.44 ms β 396.02 ms β no change β β QQuery 12β 1237.81 ms β 1286.77 ms β no change β β QQuery 13β 1912.18 ms β 1972.25 ms β no change β β QQuery 14β 1233.69 ms β 1247.50 ms β no change β β QQuery 15β 1254.11 ms β 1243.60 ms β no change β β QQuery 16β 2567.98 ms β 2570.61 ms β no change β β QQuery 17β 2550.81 ms β 2580.33 ms β no change β β QQuery 18β 5612.71 ms β 4913.15 ms β +1.14x faster β β QQuery 19β 120.74 ms β 119.55 ms β no change β β QQuery 20β 1913.07 ms β 1904.79 ms β no change β β QQuery 21β 2146.63 ms β 2192.36 ms β no change β β QQuery 22β 3703.55 ms β 3729.43 ms β no change β β QQuery 23β 22049.38 ms β 12242.00 ms β +1.80x faster β β QQuery 24β 207.87 ms β 203.23 ms β no change β β QQuery 25β 456.99 ms β 469.41 ms β no change β β QQuery 26β 214.05 ms β 204.73 ms β no change β β QQuery 27β 2767.10 ms β 2680.87 ms β no change β β QQuery 28β 22195.62 ms β 23141.30 ms β no change β β QQuery 29β 991.32 ms β 939.56 ms β +1.06x faster β β QQuery 30β 1349.48 ms β 1319.89 ms β no change β β QQuery 31β 1368.06 ms β 1339.89 ms β no change β β QQuery 32β 5091.91 ms β 5138.39 ms β no change β β QQuery 33β 5512.57 ms β 5484.79 ms β no change β β QQuery 34β 5638.93 ms β 6020.19 ms β 1.07x slower β β QQuery 35β 1992.03 ms β 1922.04 ms β no change β β QQuery 36β63.99 ms β64.41 ms β no change β β QQuery 37β43.36 ms β45.51 ms β no change β β QQuery 38β66.10 ms β64.66 ms β no change β β QQuery 39β 100.63 ms β 101.04 ms β no change β β QQuery 40β28.11 ms β26.02 ms β +1.08x faster β β QQuery 41β21.33 ms β22.12 ms β no change β β QQuery 42β18.14 ms β19.02 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³ββ β Benchmark Summary β β β‘ββββββ© β Total Time (HEAD) β 101320.99ms β β Total Time (perfect_hj) β 92020.38ms β β Average Time (HEAD) β 2356.30ms β β Average Time (perfect_hj) β 2140.01ms β β Q
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718215175 run benchmark tpcds -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718214577 run benchmark tpch -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718214414 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (e9b35e5ac9e764eeab29aa6ea237e400647466d7) to 1037f0aa205e0d04ade7aa9973b4b4c485b696ab [diff](https://github.com/apache/datafusion/compare/1037f0aa205e0d04ade7aa9973b4b4c485b696ab..e9b35e5ac9e764eeab29aa6ea237e400647466d7) using: tpch_mem clickbench_partitioned clickbench_extended Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718214064 run benchmarks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3718078855 > > > Can you solve the conflicts? > > > > > > I've resolved the conflicts. Once #19602 is merged, I will update `ArrayMap::mark_existing_probes` to follow logic similar to `JoinHashMapType::contain_hashes`. > > Done I have added contain_keys to ArrayMap and updated the logic accordingly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3714076071 > > Can you solve the conflicts? > > I've resolved the conflicts. Once #19602 is merged, I will update `ArrayMap::mark_existing_probes` to follow logic similar to `JoinHashMapType::contain_hashes`. DonE -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3713987665 > Can you solve the conflicts? I've resolved the conflicts. Once https://github.com/apache/datafusion/pull/19602 is merged, I will update `ArrayMap::mark_existing_probes` to follow logic similar to `JoinHashMapType::contain_hashes`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2662706360
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
Cool!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3711965227 Can you solve the conflicts? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2654932168
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
I think we can set it to 20%.
- In terms of memory usage ([run with this
code](https://github.com/UBarney/datafusion/commit/d901b70347e24a271464aefb64fe08f64d35a652#diff-1e8db68656c3040876f6740d3bc42251f6d1f9de1f5e3c14d257e72b48a0f68dR10):
), `ArrayMap` consumes less memory than `JoinHashMap` at a 20% density, even
with duplicate keys.
- Based on the hj.rs benchmark(results in the PR description), `ArrayMap` is
also faster than `JoinHashMap` at 20% density, regardless of whether there are
duplicate keys.
```
Memory Comparison Matrix (num_rows = 100)
| Density | Dup Rate | ArrayMap (MB) | JoinHashMap (MB) | Ratio (AM/JHM) |
|-|--|---|--||
|100% | 0% | 3.81 |37.81 | 0.10x |
|100% | 25% | 7.63 |37.81 | 0.20x |
|100% | 50% | 7.63 |37.81 | 0.20x |
|100% | 75% | 7.63 |37.81 | 0.20x |
| 75% | 0% | 5.09 |37.81 | 0.13x |
| 75% | 25% | 8.90 |37.81 | 0.24x |
| 75% | 50% | 8.90 |37.81 | 0.24x |
| 75% | 75% | 8.90 |37.81 | 0.24x |
| 50% | 0% | 7.63 |37.81 | 0.20x |
| 50% | 25% | 11.44 |37.81 | 0.30x |
| 50% | 50% | 11.44 |37.81 | 0.30x |
| 50% | 75% | 11.44 |37.81 | 0.30x |
| 20% | 0% | 19.07 |37.81 | 0.50x |
| 20% | 25% | 22.89 |37.81 | 0.61x |
| 20% | 50% | 22.89 |37.81 | 0.61x |
| 20% | 75% | 22.89 |37.81 | 0.61x |
| 10% | 0% | 38.15 |37.81 | 1.01x |
| 10% | 25% | 41.96 |37.81 | 1.11x |
| 10% | 50% | 41.96 |37.81 | 1.11x |
| 10% | 75% | 41.96 |37.81 | 1.11x |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2654932168
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
I think we can set it to 20%.
- In terms of memory usage ([run with this
code](https://github.com/UBarney/datafusion/commit/209c5002d3273e4ad762077d017c35049c9e4d6f#diff-1e8db68656c3040876f6740d3bc42251f6d1f9de1f5e3c14d257e72b48a0f68dR10):
), `ArrayMap` consumes less memory than `JoinHashMap` at a 20% density, even
with duplicate keys.
- Based on the hj.rs benchmark(results in the PR description), `ArrayMap` is
also faster than `JoinHashMap` at 20% density, regardless of whether there are
duplicate keys.
```
Memory Comparison Matrix (num_rows = 100)
| Density | Dup Rate | Distinct | Range | ArrayMap (MB) | JoinHashMap
(MB) | Ratio (AM/JHM) |
|-|--|--|--|---|--||
|100% | 0% | 100 | 100 | 3.81 |
37.81 | 0.10x |
|100% | 25% | 75 | 100 | 7.63 |
37.81 | 0.20x |
|100% | 50% | 50 | 100 | 7.63 |
37.81 | 0.20x |
|100% | 75% | 25 | 100 | 7.63 |
37.81 | 0.20x |
| 75% | 0% | 100 | 133 | 5.09 |
37.81 | 0.13x |
| 75% | 25% | 75 | 133 | 8.90 |
37.81 | 0.24x |
| 75% | 50% | 50 | 133 | 8.90 |
37.81 | 0.24x |
| 75% | 75% | 25 | 133 | 8.90 |
37.81 | 0.24x |
| 50% | 0% | 100 | 200 | 7.63 |
37.81 | 0.20x |
| 50% | 25% | 75 | 200 | 11.44 |
37.81 | 0.30x |
| 50% | 50% | 50 | 200 | 11.44 |
37.81 | 0.30x |
| 50% | 75% | 25 | 200 | 11.44 |
37.81 | 0.30x |
| 20% | 0% | 100 | 500 | 19.07 |
37.81 | 0.50x |
| 20% | 25% | 75 | 500 | 22.89 |
37.81 | 0.61x |
| 20% | 50% | 50 | 500 | 22.89 |
37.81 | 0.61x |
| 20% | 75% | 25 | 500 | 22.89 |
37.81 | 0.61x |
| 10% | 0% | 100 | 1000 | 38.15 |
37.81 | 1.01x |
| 10% | 25% | 75 | 1000 | 41.96 |
37.81 | 1.11x |
| 10% | 50% | 50 | 1000 | 41.96 |
37.81 | 1.11x |
| 10% | 75% | 25 | 1000 | 41.96 |
37.81 | 1.11x |
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2649548041
##
datafusion/physical-plan/src/joins/array_map.rs:
##
@@ -0,0 +1,565 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow::buffer::MutableBuffer;
+use num_traits::AsPrimitive;
+use std::fmt;
+use std::sync::Arc;
+
+use crate::joins::chain::traverse_chain;
+use crate::joins::join_hash_map::JoinHashMapOffset;
+use crate::joins::utils::JoinHashMapType;
+use arrow::array::{Array, ArrayRef, AsArray};
+use arrow::datatypes::DataType;
+use arrow::datatypes::{
+Int8Type, Int16Type, Int32Type, Int64Type, UInt8Type, UInt16Type,
UInt32Type,
+UInt64Type,
+};
+use datafusion_common::{Result, internal_err};
+
+/// A "perfect" hash map for single-column integer join keys, represented as a
dense array.
+///
+/// This structure is highly optimized for joins where the keys are integers
within a limited
+/// range. Instead of calculating hashes, it uses the integer key itself as an
index into a
+/// `Vec`, achieving O(1) lookup performance.
+///
+/// # NULL Handling
+///
+/// This optimization can be used for joins with
`NullEquality::NullEqualsNothing` even if the
+/// join keys contain `NULL`s. This is because:
+///
+/// 1. `try_new` (build side): Ignores rows with `NULL` keys when creating the
map. This is
+///correct as `NULL` keys would not match anything anyway.
+/// 2. `get_matched_indices_with_limit_offset` (probe side): Skips any `NULL`
keys encountered
+///in the probe side input.
+///
+/// This structure **cannot** be used for joins with
`NullEquality::NullEqualsNull` if the
+/// build side contains `NULL`s, as it does not have a mechanism to store and
match `NULL` values.
+#[derive(Debug)]
+pub struct ArrayMap {
+// data[probSideVal-offset] -> valIdxInBuildSide + 1; 0 for absent
+data: Vec,
+offset: u64, // min val in buildSide
+next: Option>,
+}
+
+impl ArrayMap {
+/// Creates a new [`ArrayKV`] from the given array of join keys.
+///
+/// Note: This function processes only the non-null values in the input
`array`,
+/// effectively ignoring any rows where the key is `NULL`.
+///
+/// TODO: Support `NullEquality::NullEqualsNull` by storing null indices
in a
+/// separate `Vec` to allow for `NULL=NULL` matching in the future.
+pub(crate) fn try_new(
+array: &ArrayRef,
+offset_val: u64,
+range: usize,
+) -> Result {
+// Initialize with 0 (sentinel for not found)
+let mut data: Vec = vec![0; range];
+let mut next: Option> = None;
Review Comment:
Done, it's cleaner this way
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2637851683
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
I ran a quick test on performance (time taken) at different densities, and
even at a density of 0.1, ArrayMap was still faster than HashMap π€. (I haven't
measured memory usage yet, though.)
```
ββββ³ββββ³ββ³ββββ
β Queryβ density=6 β density=0.1 β
Change β
β‘βββββββββββββ©
β QQuery 1_density=1_prob_hit=1_25*1.5Mβ 5.83 ms β 4.96 ms β
+1.18x faster β
β QQuery 2_density=0.026_prob_hit=1_25*1.5Mβ 6.56 ms β 4.58 ms β
+1.43x faster β
β QQuery 3_density=1_prob_hit=1_100K*60M β 134.90 ms β94.36 ms β
+1.43x faster β
β QQuery 4_density=1_prob_hit=0.1_100K*60M β 167.25 ms β 106.93 ms β
+1.56x faster β
β QQuery 5_density=0.75_prob_hit=1_100K*60Mβ 141.52 ms β93.04 ms β
+1.52x faster β
β QQuery 6_density=0.75_prob_hit=0.1_100K*60M β 248.02 ms β 193.13 ms β
+1.28x faster β
β QQuery 7_density=0.5_prob_hit=1_100K*60M β 132.51 ms β82.83 ms β
+1.60x faster β
β QQuery 8_density=0.5_prob_hit=0.1_100K*60M β 226.32 ms β 171.06 ms β
+1.32x faster β
β QQuery 9_density=0.2_prob_hit=1_100K*60M β 130.31 ms β96.29 ms β
+1.35x faster β
β QQuery 10_density=0.2_prob_hit=0.1_100K*60M β 231.85 ms β 184.09 ms β
+1.26x faster β
β QQuery 11_density=0.1_prob_hit=1_100K*60Mβ 129.65 ms β 105.23 ms β
+1.23x faster β
β QQuery 12_density=0.1_prob_hit=0.1_100K*60M β 235.93 ms β 191.80 ms β
+1.23x faster β
β QQuery 13_density=0.01_prob_hit=1_100K*60M β 128.74 ms β 129.35 ms β
no change β
β QQuery 14_density=0.01_prob_hit=0.1_100K*60M β 230.82 ms β 233.62 ms β
no change β
ββββ΄ββββ΄ββ΄ββββ
ββ³ββββ
β Benchmark Summary β β
β‘βββββ©
β Total Time (density=6) β 2150.21ms β
β Total Time (density=0.1) β 1691.27ms β
β Average Time (density=6) β 153.59ms β
β Average Time (density=0.1) β 120.80ms β
β Queries Faster β12 β
β Queries Slower β 0 β
β Queries with No Change β 2 β
β Queries with Failure β 0 β
ββ΄ββββ
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411: URL: https://github.com/apache/datafusion/pull/19411#discussion_r2637856134 ## benchmarks/compare.py: ## @@ -154,17 +154,17 @@ def compare( baseline = BenchmarkRun.load_from_file(baseline_path) comparison = BenchmarkRun.load_from_file(comparison_path) -console = Console() +console = Console(width=200) Review Comment: I've increased the console width to 200. I added more information like 'density' to the queryName, which made it longer and caused it to be cut off in the output before ## benchmarks/compare.py: ## @@ -154,17 +154,17 @@ def compare( baseline = BenchmarkRun.load_from_file(baseline_path) comparison = BenchmarkRun.load_from_file(comparison_path) -console = Console() +console = Console(width=200) # use basename as the column names -baseline_header = baseline_path.parent.stem -comparison_header = comparison_path.parent.stem +baseline_header = baseline_path.parent.name +comparison_header = comparison_path.parent.name Review Comment: Before, a path like `.../density=0.1/...` was incorrectly shortened to `density=0`. Now, by using `.parent.name`, we correctly get the full directory name, `density=0.1` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2636932448
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
Sounds good!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
UBarney commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2636743834
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
That's a great point.
I'll add some benchmarks to compare the performance at different densities,
including 75%, to find the optimal value for our use case. I'll update this
based on the results.
Thanks for the suggestion!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676314727 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark tpcds_sf1.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 1 β64.32 ms β64.03 ms β no change β β QQuery 2 β 202.76 ms β 209.93 ms β no change β β QQuery 3 β 160.17 ms β 160.30 ms β no change β β QQuery 4 β 2076.33 ms β 2016.14 ms β no change β β QQuery 5 β 271.72 ms β 267.86 ms β no change β β QQuery 6 β 1539.80 ms β 1528.76 ms β no change β β QQuery 7 β 501.34 ms β 500.09 ms β no change β β QQuery 8 β 172.26 ms β 172.88 ms β no change β β QQuery 9 β 283.58 ms β 274.82 ms β no change β β QQuery 10β 169.65 ms β 172.34 ms β no change β β QQuery 11β 1414.43 ms β 1436.15 ms β no change β β QQuery 12β74.91 ms β73.81 ms β no change β β QQuery 13β 554.37 ms β 557.49 ms β no change β β QQuery 14β 2016.91 ms β 1803.46 ms β +1.12x faster β β QQuery 15β28.50 ms β27.30 ms β no change β β QQuery 16β56.60 ms β57.60 ms β no change β β QQuery 17β 354.64 ms β 357.88 ms β no change β β QQuery 18β 191.69 ms β 190.50 ms β no change β β QQuery 19β 228.28 ms β 229.67 ms β no change β β QQuery 20β23.36 ms β23.50 ms β no change β β QQuery 21β33.71 ms β32.15 ms β no change β β QQuery 22β 985.29 ms β 906.05 ms β +1.09x faster β β QQuery 23β 1845.20 ms β 1713.30 ms β +1.08x faster β β QQuery 24β 642.89 ms β 636.64 ms β no change β β QQuery 25β 523.12 ms β 516.07 ms β no change β β QQuery 26β 123.94 ms β 126.15 ms β no change β β QQuery 27β 504.93 ms β 500.48 ms β no change β β QQuery 28β 303.81 ms β 302.04 ms β no change β β QQuery 29β 444.98 ms β 447.70 ms β no change β β QQuery 30β65.01 ms β63.94 ms β no change β β QQuery 31β 306.15 ms β 306.66 ms β no change β β QQuery 32β77.86 ms β78.00 ms β no change β β QQuery 33β 193.25 ms β 194.57 ms β no change β β QQuery 34β 159.04 ms β 162.10 ms β no change β β QQuery 35β 169.34 ms β 171.45 ms β no change β β QQuery 36β 296.09 ms β 286.26 ms β no change β β QQuery 37β 257.06 ms β 254.49 ms β no change β β QQuery 38β 155.26 ms β 151.49 ms β no change β β QQuery 39β 216.46 ms β 194.61 ms β +1.11x faster β β QQuery 40β 197.40 ms β 172.47 ms β +1.14x faster β β QQuery 41β16.30 ms β16.66 ms β no change β β QQuery 42β 142.70 ms β 145.52 ms β no change β β QQuery 43β 126.26 ms β 129.13 ms β no change β β QQuery 44β15.68 ms β14.96 ms β no change β β QQuery 45β83.14 ms β80.86 ms β no change β β QQuery 46β 329.53 ms β 329.04 ms β no change β β QQuery 47β 1279.95 ms β 1186.17 ms β +1.08x faster β β QQuery 48β 416.35 ms β 420.97 ms β no change β β QQuery 49β 355.14 ms β 354.37 ms β no change β β QQuery 50β 338.48 ms β 343.17 ms β no change β β QQuery 51β 298.83 ms β 294.59 ms β no change β β QQuery 52β 147.14 ms β 143.27 ms β no change β β QQuery 53β 149.53 ms β 146.40 ms β no change β β QQuery 54β 207.43 ms β 216.19 ms β no change β β QQuery 55β 144.21 ms β 144.60 ms β no change β β QQuery 56β 193.16 ms β 190.45 ms β no change β β QQuery 57β 319.98 ms β 302.28 ms β +1.06x faster β β QQuery 58β 515.69 ms β 482.75 ms β +1.07x faster β β QQuery 59β 285.93 ms β 288.57 ms β no change β β QQuery 60β 199.98 ms β 199.54 ms β no change β β QQuery 61β 240.08 ms β 235.47 ms β no change β β QQuery 62β 1378.44 ms β 1398.12 ms β no change β β QQuery 63β 149.43 ms β 148.88 ms β no change β β QQuery 64β 1142.94 ms β 1161.26 ms β no change β β QQuery 65β 368.38 ms β 363.41 ms β no change β β QQuery 66β 389.40 ms β 393.14 ms β no change β β QQuery 67β 646.76 ms β 627.61 ms β no change β β QQuery 68β 383.87 ms β 384.78 ms β no change β β QQuery 69β 167.41 ms β 170.00 ms β no change β β QQuery 70β 526.93 ms β 528.35 ms β no change β β QQuery 71β 183.79 ms β 184.70
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676292704 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (a442460deb5764caf51a454fcc0959292e0cf530) to 8550010bd55ccae217b08240664c44fa8bda8261 [diff](https://github.com/apache/datafusion/compare/8550010bd55ccae217b08240664c44fa8bda8261..a442460deb5764caf51a454fcc0959292e0cf530) using: tpcds Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676292478 π€: Benchmark completed Details ``` Comparing HEAD and perfect_hj Benchmark clickbench_extended.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 2728.43 ms β 2746.77 ms β no change β β QQuery 1 β 1259.76 ms β 1286.32 ms β no change β β QQuery 2 β 2442.97 ms β 2404.14 ms β no change β β QQuery 3 β 1146.01 ms β 1175.40 ms β no change β β QQuery 4 β 2324.58 ms β 2297.95 ms β no change β β QQuery 5 β 28847.71 ms β 28651.86 ms β no change β β QQuery 6 β 3925.44 ms β 4071.92 ms β no change β β QQuery 7 β 3480.92 ms β 3513.25 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³β β Benchmark Summary ββ β‘βββββ© β Total Time (HEAD) β 46155.84ms β β Total Time (perfect_hj) β 46147.61ms β β Average Time (HEAD) β 5769.48ms β β Average Time (perfect_hj) β 5768.45ms β β Queries Fasterβ 0 β β Queries Slowerβ 0 β β Queries with No Changeβ 8 β β Queries with Failure β 0 β βββββ΄β Benchmark clickbench_partitioned.json ββββ³ββ³ββ³ββββ β QueryβHEAD β perfect_hj βChange β β‘βββββββββββ© β QQuery 0 β 2.55 ms β 2.28 ms β +1.12x faster β β QQuery 1 β49.94 ms β49.72 ms β no change β β QQuery 2 β 136.75 ms β 135.77 ms β no change β β QQuery 3 β 157.55 ms β 156.38 ms β no change β β QQuery 4 β 1127.59 ms β 1099.95 ms β no change β β QQuery 5 β 1513.28 ms β 1540.48 ms β no change β β QQuery 6 β 2.09 ms β 2.18 ms β no change β β QQuery 7 β54.92 ms β54.54 ms β no change β β QQuery 8 β 1435.36 ms β 1429.02 ms β no change β β QQuery 9 β 1826.50 ms β 1818.96 ms β no change β β QQuery 10β 362.05 ms β 359.18 ms β no change β β QQuery 11β 413.64 ms β 415.45 ms β no change β β QQuery 12β 1346.02 ms β 1355.55 ms β no change β β QQuery 13β 2003.64 ms β 2030.93 ms β no change β β QQuery 14β 1258.62 ms β 1264.24 ms β no change β β QQuery 15β 1236.44 ms β 1220.63 ms β no change β β QQuery 16β 2660.06 ms β 2679.40 ms β no change β β QQuery 17β 2646.48 ms β 2669.57 ms β no change β β QQuery 18β 5032.89 ms β 5243.53 ms β no change β β QQuery 19β 119.08 ms β 120.08 ms β no change β β QQuery 20β 1909.83 ms β 1955.97 ms β no change β β QQuery 21β 2219.56 ms β 2228.45 ms β no change β β QQuery 22β 3754.10 ms β 3742.17 ms β no change β β QQuery 23β 12284.83 ms β 12304.70 ms β no change β β QQuery 24β 200.89 ms β 216.80 ms β 1.08x slower β β QQuery 25β 459.79 ms β 477.52 ms β no change β β QQuery 26β 217.46 ms β 226.68 ms β no change β β QQuery 27β 2733.98 ms β 2714.81 ms β no change β β QQuery 28β 24706.02 ms β 23446.26 ms β +1.05x faster β β QQuery 29β 983.86 ms β 951.63 ms β no change β β QQuery 30β 1325.66 ms β 1334.96 ms β no change β β QQuery 31β 1361.59 ms β 1319.42 ms β no change β β QQuery 32β 4654.91 ms β 4894.57 ms β 1.05x slower β β QQuery 33β 5895.01 ms β 5580.74 ms β +1.06x faster β β QQuery 34β 5921.45 ms β 5968.72 ms β no change β β QQuery 35β 1944.76 ms β 1894.39 ms β no change β β QQuery 36β66.49 ms β65.88 ms β no change β β QQuery 37β45.71 ms β47.05 ms β no change β β QQuery 38β66.55 ms β67.85 ms β no change β β QQuery 39β 105.16 ms β 104.80 ms β no change β β QQuery 40β28.03 ms β27.95 ms β no change β β QQuery 41β23.88 ms β24.08 ms β no change β β QQuery 42β19.99 ms β20.73 ms β no change β ββββ΄ββ΄ββ΄ββββ βββββ³β β Benchmark Summary ββ β‘βββββ© β Total Time (HEAD) β 94314.96ms β β Total Time (perfect_hj) β 93263.97ms β β Average Time (HEAD) β 2193.37ms β β Average Time (perfect_hj) β 2168.93ms β β Queries
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2636002211
##
datafusion/common/src/config.rs:
##
@@ -468,6 +468,25 @@ config_namespace! {
/// metadata memory consumption
pub batch_size: usize, default = 8192
+/// A perfect hash join will be considered if the number of rows on
the build
+/// side is below this threshold. This provides a fast path for joins
with
+/// very small build sides, bypassing the density check.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_small_build_threshold: usize, default = 1024
+
+/// The minimum required density of join keys on the build side to
consider a
+/// perfect hash join. Density is calculated as:
+/// `(number of rows) / (max_key - min_key + 1)`.
+/// A perfect hash join may be used if the actual key density exceeds
this
+/// value. For example, a value of 0.99 means the keys must fill at
least
+/// 99% of their value range.
+///
+/// TODO: Currently only supports cases where left_side.num_rows() <
u32::MAX.
+/// Support for left_side.num_rows() >= u32::MAX will be added in the
future.
+pub perfect_hash_join_min_key_density: f64, default = 0.99
Review Comment:
This seems very high? For a hashmap I believe it's ~75% default (plus it has
some more overhead per key), so I think a 75% probably could still be better
overall?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676159917 run benchmark tpcds -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
alamb-ghbot commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676159863 π€ `./gh_compare_branch.sh` [gh_compare_branch.sh](https://github.com/alamb/datafusion-benchmarking/blob/main/scripts/gh_compare_branch.sh) Running Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Comparing perfect_hj (a442460deb5764caf51a454fcc0959292e0cf530) to 8550010bd55ccae217b08240664c44fa8bda8261 [diff](https://github.com/apache/datafusion/compare/8550010bd55ccae217b08240664c44fa8bda8261..a442460deb5764caf51a454fcc0959292e0cf530) using: tpch_mem clickbench_partitioned clickbench_extended Results will be posted here when complete -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on PR #19411: URL: https://github.com/apache/datafusion/pull/19411#issuecomment-3676159501 run benchmarks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] perfect hash join [datafusion]
Dandandan commented on code in PR #19411:
URL: https://github.com/apache/datafusion/pull/19411#discussion_r2635957210
##
datafusion/physical-plan/src/joins/array_map.rs:
##
@@ -0,0 +1,565 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow::buffer::MutableBuffer;
+use num_traits::AsPrimitive;
+use std::fmt;
+use std::sync::Arc;
+
+use crate::joins::chain::traverse_chain;
+use crate::joins::join_hash_map::JoinHashMapOffset;
+use crate::joins::utils::JoinHashMapType;
+use arrow::array::{Array, ArrayRef, AsArray};
+use arrow::datatypes::DataType;
+use arrow::datatypes::{
+Int8Type, Int16Type, Int32Type, Int64Type, UInt8Type, UInt16Type,
UInt32Type,
+UInt64Type,
+};
+use datafusion_common::{Result, internal_err};
+
+/// A "perfect" hash map for single-column integer join keys, represented as a
dense array.
+///
+/// This structure is highly optimized for joins where the keys are integers
within a limited
+/// range. Instead of calculating hashes, it uses the integer key itself as an
index into a
+/// `Vec`, achieving O(1) lookup performance.
+///
+/// # NULL Handling
+///
+/// This optimization can be used for joins with
`NullEquality::NullEqualsNothing` even if the
+/// join keys contain `NULL`s. This is because:
+///
+/// 1. `try_new` (build side): Ignores rows with `NULL` keys when creating the
map. This is
+///correct as `NULL` keys would not match anything anyway.
+/// 2. `get_matched_indices_with_limit_offset` (probe side): Skips any `NULL`
keys encountered
+///in the probe side input.
+///
+/// This structure **cannot** be used for joins with
`NullEquality::NullEqualsNull` if the
+/// build side contains `NULL`s, as it does not have a mechanism to store and
match `NULL` values.
+#[derive(Debug)]
+pub struct ArrayMap {
+// data[probSideVal-offset] -> valIdxInBuildSide + 1; 0 for absent
+data: Vec,
+offset: u64, // min val in buildSide
+next: Option>,
+}
+
+impl ArrayMap {
+/// Creates a new [`ArrayKV`] from the given array of join keys.
+///
+/// Note: This function processes only the non-null values in the input
`array`,
+/// effectively ignoring any rows where the key is `NULL`.
+///
+/// TODO: Support `NullEquality::NullEqualsNull` by storing null indices
in a
+/// separate `Vec` to allow for `NULL=NULL` matching in the future.
+pub(crate) fn try_new(
+array: &ArrayRef,
+offset_val: u64,
+range: usize,
+) -> Result {
+// Initialize with 0 (sentinel for not found)
+let mut data: Vec = vec![0; range];
+let mut next: Option> = None;
Review Comment:
```suggestion
let mut next: Vec = vec![];
```
I think this should work as well
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
