Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19794 )
Change subject: [cpp-client] KUDU-3455 Reduce space complexity and speed up hash partition pruning for in-list predicate
......................................................................

Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@9
PS3, Line 9: This patch's background is in https://gerrit.cloudera.org/c/19568/. As
This patch comes from https://gerrit.cloudera.org/c/19568/.

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@18
PS3, Line 18: Unlike java client, this problem in cpp client is not critical.
            : Old algorithm in cpp client does rare go out of memory because it
            : swap the intermedium objects.
Compared with the Java client, the C++ client is less likely to hit an OOM condition because it does not keep too many intermediate results.

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@21
PS3, Line 21: depend on
are related to

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@21
PS3, Line 21: size of
the number of

http://gerrit.cloudera.org:8080/#/c/19794/3//COMMIT_MSG@22
PS3, Line 22: benifit are better
The performance is better

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@974
PS3, Line 974: // For test cases that should run with both kinds of tokens.
I don't quite understand this sentence; please rephrase it.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@982
PS3, Line 982: with a few key columns (10)
nit: with 10 columns.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@983
PS3, Line 983: // Check new algorithm is correct by comparing with the old one.
Check the correctness of the new algorithm by comparing it with the old one.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@984
PS3, Line 984: // Compare the speed on the two algorithms and show the speedup.
Compare the efficiency of the two algorithms.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@985
PS3, Line 985: TEST_P(PartitionPrunerTestWithMaxInListLength, TestMultiColumnInListHashPruningManyValues) {
nit: Add 'SKIP_IF_SLOW_NOT_ALLOWED()' for the test.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1023
PS3, Line 1023: comparator with the old one
comparing it with the old one.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1046
PS3, Line 1046: The following logs can compare v2 and v1.
The following logs are used to compare the efficiency of the two algorithms.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner-test.cc@1063
PS3, Line 1063: / Reserve bigger enough capacity(kMaxSafeLength * kKeyColumnSize) to avoid renew memory
              : // and copy objects which would cause memory pointer we record changed.
Increase the vector's capacity to kMaxSafeLength * kKeyColumnSize to avoid reallocation that invalidates all references to the elements.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.h
File src/kudu/common/partition_pruner.h:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.h@94
PS3, Line 94: static void ComputeHashBuckets(const Schema& schema,
nit: Please add some description about this newly added method.

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@182
PS3, Line 182: // newer
nit: Remove this line.
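
To illustrate the reallocation concern in the comment on partition_pruner-test.cc line 1063, here is a minimal standalone sketch. It is not code from the patch: the helper name PointersSurviveAppends and the element type are hypothetical, chosen only to show why reserving capacity up front keeps recorded element addresses valid while the vector grows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the Kudu patch). Appends `n` strings to a
// vector, recording the address of each element right after it is appended,
// and then reports whether every recorded address still matches the element's
// final location. Addresses are stored as integers so that stale pointers are
// never dereferenced or compared as pointers.
bool PointersSurviveAppends(std::size_t n, bool reserve_first) {
  std::vector<std::string> values;
  if (reserve_first) {
    // With enough capacity reserved, appending up to `n` elements never
    // reallocates, so addresses of existing elements stay valid.
    values.reserve(n);
  }

  std::vector<std::uintptr_t> recorded;
  recorded.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    values.emplace_back("value-" + std::to_string(i));
    recorded.push_back(reinterpret_cast<std::uintptr_t>(&values.back()));
  }

  for (std::size_t i = 0; i < n; ++i) {
    if (recorded[i] != reinterpret_cast<std::uintptr_t>(&values[i])) {
      return false;  // The element moved, so the recorded address is stale.
    }
  }
  return true;
}
```

Calling PointersSurviveAppends(1000, true) returns true because no reallocation can happen after the reserve. Without the reserve, typical standard library implementations grow the storage several times, so the recorded addresses usually go stale.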

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@216
PS3, Line 216: indexes_picked
nit: What about renaming it to 'predicate_values_picked'?

http://gerrit.cloudera.org:8080/#/c/19794/3/src/kudu/common/partition_pruner.cc@242
PS3, Line 242: push_back
nit: What about using 'emplace_back'?

--
To view, visit http://gerrit.cloudera.org:8080/19794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
Gerrit-Change-Number: 19794
Gerrit-PatchSet: 3
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Fri, 05 May 2023 01:29:16 +0000
Gerrit-HasComments: Yes
