Yuqi Du has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19794
Change subject: [client] Reduce space complexity and speed up hash partition pruning for in-list predicate ...................................................................... [client] Reduce space complexity and speed up hash partition pruning for in-list predicate This patch's background is in https://gerrit.cloudera.org/c/19568/. As that patch said, logic of pruning hash partitions for in-list predicate in Kudu cpp client has also a high space complexity and slow. Old algorithm must keep all intermedium objects because they are incomplete until they are completed and can be compute hash. This patch fixes the problems and provides a recursive algorithm the same as java client in patch https://gerrit.cloudera.org/c/19568/. Unlike java client, this problem in cpp client is not critical. Old algorithm in cpp client does rare go out of memory because it swap the intermedium objects. This optimization has good benifit too. PartitionPrunerTest::TestMultiColumnInListHashPruningManyValues can show some benifits. The benifits depend on the in-list length, for example, in the test TestMultiColumnInListHashPruningManyValues, set kMaxInListLength 50, old algorithm memory cost can reach 300MB, while new algorithm's memory cost can be ignored(it only need one object and a few stacks for context). At the same time, new algorithm has a good speedup, below shows some effect: combination_count: 5554006920000, old algorithm cost: 428238us, new algorithm cost: 713us, speedup: 600.61430575035058x combination_count: 89083783664568, old algorithm cost: 2764924us, new algorithm cost: 1145us, speedup: 2414.780786026201x combination_count: 27194091724800, old algorithm cost: 1610475us, new algorithm cost: 1151us, speedup: 1399.1963509991313x combination_count: 7116622216704, old algorithm cost: 34544289us, new algorithm cost: 375us, speedup: 92118.104x combination_count: 37570734489600, old algorithm cost: 1733205us, new algorithm cost: 901us, speedup: 1923.6459489456161x Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9 --- M src/kudu/common/partition_pruner-test.cc M src/kudu/common/partition_pruner.cc M src/kudu/common/partition_pruner.h 3 files changed, 228 insertions(+), 18 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/19794/1 -- To view, visit http://gerrit.cloudera.org:8080/19794 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9 Gerrit-Change-Number: 19794 Gerrit-PatchSet: 1 Gerrit-Owner: Yuqi Du <[email protected]>
