Hello Kudu Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/19794
to look at the new patch set (#3).
Change subject: [cpp-client] KUDU-3455 Reduce space complexity and speed up
hash partition pruning for in-list predicate
......................................................................
[cpp-client] KUDU-3455 Reduce space complexity and speed up hash partition
pruning for in-list predicate
The background for this patch is in https://gerrit.cloudera.org/c/19568/.
As that patch noted, the logic for pruning hash partitions for in-list
predicates in the Kudu C++ client also has high space complexity and is
slow. The old algorithm must keep all intermediate objects, because a
hash can be computed only once an object is complete.
This patch fixes these problems with a recursive algorithm, the same as
the one used for the Java client in https://gerrit.cloudera.org/c/19568/.
Unlike in the Java client, the problem is not critical in the C++ client:
the old algorithm rarely runs out of memory because it swaps the
intermediate objects. Still, this optimization has a clear benefit.
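The recursive idea can be illustrated with a minimal sketch. This is not the code in partition_pruner.cc: BucketOf, PruneRecursive and SelectedBuckets are hypothetical names, plain int64_t values stand in for column values, the FNV-1a style hash stands in for Kudu's real row hashing, and the early exit once every bucket is selected is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for hashing a complete row; NOT Kudu's hash function.
inline uint32_t BucketOf(const std::vector<int64_t>& row, uint32_t num_buckets) {
  uint64_t h = 1469598103934665603ULL;  // FNV-1a offset basis
  for (int64_t v : row) {
    h ^= static_cast<uint64_t>(v);
    h *= 1099511628211ULL;  // FNV-1a prime
  }
  return static_cast<uint32_t>(h % num_buckets);
}

// Depth-first walk over the cross product of the per-column in-lists.
// `row` is the single working object; recursion depth equals the number
// of hash columns, so no partially built combinations are stored.
void PruneRecursive(const std::vector<std::vector<int64_t>>& in_lists,
                    size_t col,
                    std::vector<int64_t>* row,
                    std::vector<bool>* buckets,
                    size_t* num_set) {
  // Assumed early exit: nothing left to learn once all buckets are selected.
  if (*num_set == buckets->size()) return;
  if (col == in_lists.size()) {
    // The row is complete only here, so this is where it can be hashed.
    const uint32_t b = BucketOf(*row, static_cast<uint32_t>(buckets->size()));
    if (!(*buckets)[b]) {
      (*buckets)[b] = true;
      ++*num_set;
    }
    return;
  }
  for (int64_t v : in_lists[col]) {
    (*row)[col] = v;  // overwrite in place instead of copying the row
    PruneRecursive(in_lists, col + 1, row, buckets, num_set);
    if (*num_set == buckets->size()) return;
  }
}

// Returns one flag per hash bucket: true if some combination maps to it.
std::vector<bool> SelectedBuckets(
    const std::vector<std::vector<int64_t>>& in_lists, uint32_t num_buckets) {
  std::vector<int64_t> row(in_lists.size(), 0);
  std::vector<bool> buckets(num_buckets, false);
  size_t num_set = 0;
  PruneRecursive(in_lists, 0, &row, &buckets, &num_set);
  return buckets;
}
```

In this sketch the only state is one row, one bucket bitmap and the call stack, which is the property the paragraph above attributes to the new algorithm; an iterative version that extends every partial combination column by column would instead hold a whole generation of incomplete rows at once.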
The benefit depends on the in-list length and the number of primary key
columns, and it grows with the in-list length. For example, in
PartitionPrunerTest::TestMultiColumnInListHashPruningManyValues,
with 10 key columns and kMaxInListLength=50, the old algorithm's memory
cost may reach 600MB, while the new algorithm's memory cost is negligible
(it needs only one object and a few stack frames for context). The new
algorithm is also much faster. Some sample results:
combination_count: 5554006920000, old cost: 428238us, new cost: 713us, speedup: 600.6x
combination_count: 89083783664568, old cost: 2764924us, new cost: 1145us, speedup: 2414.7x
combination_count: 27194091724800, old cost: 1610475us, new cost: 1151us, speedup: 1399.2x
combination_count: 7116622216704, old cost: 34544289us, new cost: 375us, speedup: 92118.1x
combination_count: 37570734489600, old cost: 1733205us, new cost: 901us, speedup: 1923.6x
Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
---
M src/kudu/common/partition_pruner-test.cc
M src/kudu/common/partition_pruner.cc
M src/kudu/common/partition_pruner.h
3 files changed, 228 insertions(+), 18 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/19794/3
--
To view, visit http://gerrit.cloudera.org:8080/19794
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
Gerrit-Change-Number: 19794
Gerrit-PatchSet: 3
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)