This is an automated email from the ASF dual-hosted git repository.
panxiaolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new f5feae9bc0e [opt](function) Eliminate redundant hash computation in AggregateFunctionUniq (#61730)
f5feae9bc0e is described below
commit f5feae9bc0ebfda1ac19d4069d80d450765d2101
Author: Mryange <[email protected]>
AuthorDate: Tue Mar 31 10:05:30 2026 +0800
[opt](function) Eliminate redundant hash computation in AggregateFunctionUniq (#61730)
Problem Summary:

In `AggregateFunctionUniq::add_batch_single_place` and `add_batch` (the hot path for `SELECT count(distinct col)`), every key's hash is computed **twice**:

1. `set.prefetch(keys[i + HASH_MAP_PREFETCH_DIST])` internally calls `this->hash(key)` to locate the slot for CPU prefetch.
2. `set.insert(keys[i])` goes through `emplace` → `EmplaceDecomposable` → `s.hash(key)` again to find or prepare the insert position.

The same codebase already has a correct "precompute hash + reuse" pattern in the `DistinctStreamingAgg` operator path (`hash_map_context.h`), where `init_hash_values()` computes the hash once and subsequent `prefetch` / `lazy_emplace_key` calls reuse the precomputed value. This PR applies the same optimization to `AggregateFunctionUniq`.
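To make the redundancy concrete, here is a minimal standalone sketch. `MockSet` and `CountingHash` are illustrative stand-ins, not Doris or phmap code: `prefetch()` hashes the key to locate a slot, and `insert()` hashes it again, so a batch of N keys costs 2·N hash calls.

```cpp
// Sketch of the redundant-hash pattern described above (illustrative only).
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

static size_t g_hash_calls = 0;  // counts every hash computation

struct CountingHash {
    size_t operator()(int64_t key) const {
        ++g_hash_calls;
        return std::hash<int64_t>{}(key);
    }
};

struct MockSet {
    std::vector<int64_t> elems;  // toy storage; a real set is a hash table
    // Mimics set.prefetch(key): hashes the key to find the slot to prefetch.
    void prefetch(int64_t key) { (void)CountingHash{}(key); }
    // Mimics set.insert(key): hashes the key again to locate its bucket.
    void insert(int64_t key) {
        (void)CountingHash{}(key);
        if (std::find(elems.begin(), elems.end(), key) == elems.end()) {
            elems.push_back(key);
        }
    }
};

// Returns how many hash calls the prefetch + insert loop performs.
size_t hash_calls_for_batch(const std::vector<int64_t>& keys) {
    g_hash_calls = 0;
    MockSet set;
    for (int64_t k : keys) {
        set.prefetch(k);  // 1st hash of k
        set.insert(k);    // 2nd hash of k
    }
    return g_hash_calls;
}
```

With this model, every key is hashed twice regardless of whether it is a duplicate, which is exactly the waste the PR removes.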
### How does this PR solve it?

For both `add_batch` and `add_batch_single_place` in `aggregate_function_uniq.h` and `aggregate_function_uniq_distribute_key.h`:

1. **Precompute hash values** into a `std::vector<size_t>` before the main loop, using `set.hash(keys[i])` — this is the only hash computation per key.
2. **Replace `set.prefetch(key)`** with `set.prefetch_hash(hash_values[...])` — reuses the precomputed hash, avoiding recalculation and the unnecessary memory access to `keys[i + HASH_MAP_PREFETCH_DIST]` at prefetch time.
3. **Replace `set.insert(key)`** with `set.emplace_with_hash(hash_values[i], key)` — passes the precomputed hash directly, skipping the internal hash call.

Both `prefetch_hash` and `emplace_with_hash` are existing APIs in phmap (`parallel_hashmap/phmap.h`); no third-party changes are needed.

**Expected improvements:**

- Hash computation reduced from 2× to 1× per key
- The precompute loop is a pure sequential scan over `keys[]`, which is more cache-friendly than interleaving hash computation with hash-table probing
- Better prefetch effectiveness: prefetch no longer needs to access `keys[i + HASH_MAP_PREFETCH_DIST]` memory just to compute its hash
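The three steps above can be sketched with a tiny open-addressing set. `emplace_with_hash()` takes a caller-supplied hash, mirroring the phmap API named above; `TinySet` itself is an illustrative stand-in, not the phmap implementation.

```cpp
// Sketch of the precomputed-hash pattern (steps 1-3 above); illustrative only.
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

struct TinySet {
    std::vector<int64_t> slots;
    std::vector<bool> used;
    size_t count = 0;
    // Capacity must be a power of two so `h & mask` wraps correctly.
    explicit TinySet(size_t capacity = 1024) : slots(capacity), used(capacity) {}

    static size_t hash(int64_t key) { return std::hash<int64_t>{}(key); }

    // Inserts `key` using a hash the caller already computed; linear probing.
    void emplace_with_hash(size_t h, int64_t key) {
        const size_t mask = slots.size() - 1;
        for (size_t idx = h & mask;; idx = (idx + 1) & mask) {
            if (!used[idx]) {
                used[idx] = true;
                slots[idx] = key;
                ++count;
                return;
            }
            if (slots[idx] == key) {
                return;  // already present
            }
        }
    }
};

size_t count_distinct(const std::vector<int64_t>& keys) {
    TinySet set;
    // Step 1: one sequential pass computes each key's hash exactly once.
    std::vector<size_t> hash_values(keys.size());
    for (size_t i = 0; i < keys.size(); ++i) {
        hash_values[i] = TinySet::hash(keys[i]);
    }
    // Steps 2-3: the probe loop reuses the precomputed hashes; a real build
    // would also issue prefetch_hash(hash_values[i + DIST]) inside this loop.
    for (size_t i = 0; i < keys.size(); ++i) {
        set.emplace_with_hash(hash_values[i], keys[i]);
    }
    return set.count;
}
```

Splitting hashing from probing is what makes the first pass a pure sequential scan: the hash loop touches only `keys[]`, while the second loop touches only the table and the precomputed values.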
---
be/src/exprs/aggregate/aggregate_function_uniq.h | 22 +++++++++++++++++-----
.../aggregate_function_uniq_distribute_key.h | 22 +++++++++++++++++-----
2 files changed, 34 insertions(+), 10 deletions(-)
diff --git a/be/src/exprs/aggregate/aggregate_function_uniq.h b/be/src/exprs/aggregate/aggregate_function_uniq.h
index 4f10bc9f2ce..7de1732c1ca 100644
--- a/be/src/exprs/aggregate/aggregate_function_uniq.h
+++ b/be/src/exprs/aggregate/aggregate_function_uniq.h
@@ -164,13 +164,19 @@ public:
             array_of_data_set[i] = &(this->data(places[i] + place_offset).set);
         }
+        // Precompute hash values to avoid double computation in prefetch + insert
+        std::vector<size_t> hash_values(batch_size);
+        for (size_t i = 0; i < batch_size; ++i) {
+            hash_values[i] = array_of_data_set[i]->hash(keys[i]);
+        }
+
         for (size_t i = 0; i != batch_size; ++i) {
             if (i + HASH_MAP_PREFETCH_DIST < batch_size) {
-                array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch(
-                        keys[i + HASH_MAP_PREFETCH_DIST]);
+                array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch_hash(
+                        hash_values[i + HASH_MAP_PREFETCH_DIST]);
             }
-            array_of_data_set[i]->insert(keys[i]);
+            array_of_data_set[i]->emplace_with_hash(hash_values[i], keys[i]);
         }
     }
@@ -193,11 +199,17 @@ public:
         const KeyType* keys = get_keys(keys_container, *columns[0], batch_size);
         auto& set = this->data(place).set;
+        // Precompute hash values to avoid double computation in prefetch + insert
+        std::vector<size_t> hash_values(batch_size);
+        for (size_t i = 0; i < batch_size; ++i) {
+            hash_values[i] = set.hash(keys[i]);
+        }
+
         for (size_t i = 0; i != batch_size; ++i) {
             if (i + HASH_MAP_PREFETCH_DIST < batch_size) {
-                set.prefetch(keys[i + HASH_MAP_PREFETCH_DIST]);
+                set.prefetch_hash(hash_values[i + HASH_MAP_PREFETCH_DIST]);
             }
-            set.insert(keys[i]);
+            set.emplace_with_hash(hash_values[i], keys[i]);
         }
     }
diff --git a/be/src/exprs/aggregate/aggregate_function_uniq_distribute_key.h b/be/src/exprs/aggregate/aggregate_function_uniq_distribute_key.h
index 8d44d74d03e..35c2f5f7f2a 100644
--- a/be/src/exprs/aggregate/aggregate_function_uniq_distribute_key.h
+++ b/be/src/exprs/aggregate/aggregate_function_uniq_distribute_key.h
@@ -122,13 +122,19 @@ public:
             array_of_data_set[i] = &(this->data(places[i] + place_offset).set);
         }
+        // Precompute hash values to avoid double computation in prefetch + insert
+        std::vector<size_t> hash_values(batch_size);
+        for (size_t i = 0; i < batch_size; ++i) {
+            hash_values[i] = array_of_data_set[i]->hash(keys[i]);
+        }
+
         for (size_t i = 0; i != batch_size; ++i) {
             if (i + HASH_MAP_PREFETCH_DIST < batch_size) {
-                array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch(
-                        keys[i + HASH_MAP_PREFETCH_DIST]);
+                array_of_data_set[i + HASH_MAP_PREFETCH_DIST]->prefetch_hash(
+                        hash_values[i + HASH_MAP_PREFETCH_DIST]);
             }
-            array_of_data_set[i]->insert(keys[i]);
+            array_of_data_set[i]->emplace_with_hash(hash_values[i], keys[i]);
         }
     }
@@ -138,11 +144,17 @@ public:
         const KeyType* keys = get_keys(keys_container, *columns[0], batch_size);
         auto& set = this->data(place).set;
+        // Precompute hash values to avoid double computation in prefetch + insert
+        std::vector<size_t> hash_values(batch_size);
+        for (size_t i = 0; i < batch_size; ++i) {
+            hash_values[i] = set.hash(keys[i]);
+        }
+
         for (size_t i = 0; i != batch_size; ++i) {
             if (i + HASH_MAP_PREFETCH_DIST < batch_size) {
-                set.prefetch(keys[i + HASH_MAP_PREFETCH_DIST]);
+                set.prefetch_hash(hash_values[i + HASH_MAP_PREFETCH_DIST]);
             }
-            set.insert(keys[i]);
+            set.emplace_with_hash(hash_values[i], keys[i]);
        }
    }
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]