Hello Tim Armstrong,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/1519
to look at the new patch set (#9).
Change subject: IMPALA-2653: COUNT DISTINCT performance is not monotonic.
......................................................................
IMPALA-2653: COUNT DISTINCT performance is not monotonic.
In the status quo, a PartitionedAggregationNode upstream of another
PartitionedAggregationNode that sees the rows in the order of their
hash values can end up with hash tables in which the item distribution
among the buckets is very skewed. This leads to long probe lengths
that cause non-linear performance of HashTable::Insert, which creates
long execution times for queries like COUNT DISTINCT.
Hash tables resize only when they surpass fill 0.75. Imagine filling a
hash table with 1024 buckets by reading, in hash value order, from a
table with 2048 buckets, filled with 1280 items.
The small hash table will only resize once we have read 768
items. This will appear in roughly the first (2048/1280)*768 = ~1229
buckets of the large hash table. Because we are reading them in hash
value order, we expect the first 640 items (appearing in about the
first 1024 buckets of the large hash table) to be evenly distributed
in the small hash table.
The remaining 128 items from buckets 1024 to ~1229 in the large table
are distributed in the first ~205 buckets of the smaller table. This
leaves the smaller table very unbalanced - it is bottom-heavy, having
about 256 items that belong in the bottom 205 buckets.
This patch strides over the larger hash table 3 buckets at a time,
rather than 1. In the example above, this means that the last 128
items to be put into the smaller hash table are spread out over about
614 = ~128*(2048/1280) buckets. Since the lower 614 buckets in the
smaller hash table have 368 items in them when the next 128 are
inserted, they do not get overloaded.
With a HashTable::MAX_FILL_FACTOR of 0.75, the destination hash
table's lower buckets end up with a fill factor of no more than 0.75 +
0.75/3 = 1, and so do not overfill.
Witht this stride change in place, we can go back to using linear
probing, which gives a small performance bump over quadratic probing.
Change-Id: I9bcac97424c366121c480c3cb48400a44bffd69f
---
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-aggregation-node.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/planner/AggregationNode.java
A
testdata/workloads/targeted-perf/queries/primitive_count_distinct_bigint_highndv.test
8 files changed, 64 insertions(+), 7 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/19/1519/9
--
To view, visit http://gerrit.cloudera.org:8080/1519
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9bcac97424c366121c480c3cb48400a44bffd69f
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>