Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19794 )

Change subject: [cpp-client] KUDU-3455 Reduce space complexity and speed up 
hash partition pruning for in-list predicate
......................................................................


Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@9
PS3, Line 9: This patch's background is in 
https://gerrit.cloudera.org/c/19568/. As
This patch comes from https://gerrit.cloudera.org/c/19568/.


http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@18
PS3, Line 18: Unlike java client, this problem in cpp client is not critical.
            : Old algorithm in cpp client does rare go out of memory because it
            : swap the intermedium objects.
Compared with the Java client, the C++ client is less likely to hit an OOM 
condition because it does not keep as many intermediate results.


http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@21
PS3, Line 21: depend on
are related to


http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@21
PS3, Line 21: size of
the number of


http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@22
PS3, Line 22: benifit are better
The performance is better


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@974
PS3, Line 974: // For test cases that should run with both kinds of tokens.
I don't quite understand this sentence; please rephrase it.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@982
PS3, Line 982: with a few key columns (10)
nit: with 10 columns.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@983
PS3, Line 983: // Check new algorithm is correct by comparing with the old one.
Check the correctness of the new algorithm by comparing it with the old one.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@984
PS3, Line 984: // Compare the speed on the two algorithms and show the speedup.
Compare the efficiency of the two algorithms.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@985
PS3, Line 985: TEST_P(PartitionPrunerTestWithMaxInListLength, 
TestMultiColumnInListHashPruningManyValues) {
nit: Add 'SKIP_IF_SLOW_NOT_ALLOWED()' for the test.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1023
PS3, Line 1023: comparator with the old one
comparing it with the old one.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1046
PS3, Line 1046: The following logs can compare v2 and v1.
The following logs are used to compare the efficiency of the two algorithms.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1063
PS3, Line 1063: / Reserve bigger enough capacity(kMaxSafeLength * 
kKeyColumnSize) to avoid renew memory
              :     // and copy objects which would cause memory pointer we 
record changed.
Reserve a capacity of kMaxSafeLength * kKeyColumnSize up front to avoid 
reallocation, which would invalidate all pointers and references to the elements.
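
For illustration, a minimal standalone sketch of the behavior this comment describes. The constant values and the helper name PointersStayValidWithReserve are assumptions made up for the example and are not taken from the patch; only std::vector's documented guarantee is relied on (no reallocation happens while the size stays within the reserved capacity).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the constants named in the comment above;
// the real values are defined in partition_pruner-test.cc.
constexpr std::size_t kMaxSafeLength = 4;
constexpr std::size_t kKeyColumnSize = 3;

// Reserves the final capacity up front, records a pointer to the first
// element, then fills the vector up to that capacity. Returns true if the
// recorded pointer still refers to the first element afterwards.
bool PointersStayValidWithReserve() {
  std::vector<std::string> values;
  values.reserve(kMaxSafeLength * kKeyColumnSize);

  values.emplace_back("v0");
  const std::string* first = &values[0];
  const std::string* data_before = values.data();

  // Growing up to the reserved capacity never reallocates, so 'first'
  // is not invalidated by these insertions.
  for (std::size_t i = 1; i < kMaxSafeLength * kKeyColumnSize; ++i) {
    values.emplace_back("v" + std::to_string(i));
  }
  return values.data() == data_before && first == &values[0] &&
         *first == "v0";
}
```

Without the reserve() call, any of the insertions could trigger a reallocation, after which dereferencing 'first' would be undefined behavior.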


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.h
File src/kudu/common/partition_pruner.h:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.h@94
PS3, Line 94:   static void ComputeHashBuckets(const Schema& schema,
nit: Please add a description of this newly added method.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@182
PS3, Line 182: // newer
nit: Remove this line.


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@216
PS3, Line 216: indexes_picked
nit: What about renaming it to 'predicate_values_picked'?


http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@242
PS3, Line 242: push_back
nit: What about using 'emplace_back'?
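
For context, a small standalone sketch of the difference. The Tracked type and both helper functions are hypothetical and not from the patch. It assumes the call site builds a temporary and passes it to push_back; if line 242 pushes an already existing object, emplace_back would bring no gain.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is move-constructed.
struct Tracked {
  static int moves;
  std::string name;
  int bucket;
  Tracked(std::string n, int b) : name(std::move(n)), bucket(b) {}
  Tracked(const Tracked&) = default;
  Tracked(Tracked&& other) noexcept
      : name(std::move(other.name)), bucket(other.bucket) {
    ++moves;
  }
};
int Tracked::moves = 0;

// push_back: a temporary is constructed first, then moved into the vector.
int MovesWithPushBack() {
  Tracked::moves = 0;
  std::vector<Tracked> v;
  v.reserve(1);  // rule out moves caused by reallocation
  v.push_back(Tracked("a", 1));
  return Tracked::moves;
}

// emplace_back: the element is constructed in place from the arguments.
int MovesWithEmplaceBack() {
  Tracked::moves = 0;
  std::vector<Tracked> v;
  v.reserve(1);  // rule out moves caused by reallocation
  v.emplace_back("a", 1);
  return Tracked::moves;
}
```

In this sketch push_back costs one extra move construction per element, while emplace_back costs none.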



--
To view, visit http://gerrit.cloudera.org:8080/19794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
Gerrit-Change-Number: 19794
Gerrit-PatchSet: 3
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Fri, 05 May 2023 01:29:16 +0000
Gerrit-HasComments: Yes
