Hello David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15511

to look at the new patch set (#4).

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
......................................................................

WIP IMPALA-9434: Implement Robin Hood Hash Table.

Robin hood hashing reduces the variances of probe lengths by
continually rebalancing elements. This patch is intended to be the
first step towards full robin hood hash table implementation by doing
bucket rebalancing after every insert.

Instead of proactively swapping elements during insertion, the
algorithm does the rebalancing by first letting the new element probe
an empty bucket and claim it through PrepareBucketForInsert(). After
the bucket at index curr_idx claimed, the rebalancing algorithm in
function Rebalance() walk through the probe sequence again to find
richer or empty bucket to swap with bucket[curr_idx]. The search and
swapping continue until it swap with empty bucket or does not find any
richer bucket along the probe sequence. It effectively treat
bucket[curr_idx] as temporary variable during the rebalancing process.
Rebalance() return the new index of the recently claimed bucket after
rebalancing process done.

The decision to not swap elements proactively is because the probe
function is used for both insertion and lookup-only. Keeping probing
routine decoupled from insertion routine allow us to invoke rebalance
only after bucket claim, not on lookup-only. This patch also maintain
compatibility with quadratic probing. However, it block us from
leveraging some robin hood technique that only possible with linear
probing such as lookup short-circuit (fast lookup on nonexistent
keys).

Testing:
- Pass core tests

Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/hash-table-benchmark.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exprs/scalar-expr.h
6 files changed, 610 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15511/4
--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>

Reply via email to