Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17643 )
Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash schemas. ...................................................................... Patch Set 7: (12 comments) http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366 PS3, Line 1366: > Yep, I agree that ASSERT_OK() is a better choice here. Ack http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1490 PS6, Line 1490: } > I'm not sure whether I saw testcases to cover situations when scan_spec.Can Custom hash schemas per range shouldn't change how ScanSpec::CanShortCircuit() functions. When checking if it can bypass future checks, the function checks if the lower bound set is greater than the upper bound set. It also checks for empty predicates or InList predicates where no list is provided. Both of these are independent of custom hash schemas per range so IMO I don't think it's necessary to add a test to cover this case. http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1491 PS6, Line 1491: } // namespace kudu > Also, does it make sense to add a few test cases to cover situations when r TestHashSchemasPerRangePruning has a PK of (A, B, C) and is range partitioned on only one column of the PK (C). Is that test case enough coverage or do you think I should add more? http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h File src/kudu/common/partition_pruner.h: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@46 PS6, Line 46: > nit: why not just RangeBounds -- could there be any other than lower and up nope, will change this to 'RangeBounds'. http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@48 PS6, Line 48: ounds(std > What is semantic significance of using std::pair instead of struct here? D Nope, the pairs aren't related. Each pair contains the range bounds and the respective partition key ranges. I changed all the typedefs to structs because I thought it made more sense. I'm acutely aware that the struct of 'RangeBounds' and 'PartitionKeyRange' can me merged into one if the field names match, but I'm not sure if it's better to have a separation of concerns or merge them. Please let me know your thoughts on this. FYI, I needed the default constructors so the newly named field 'range_bounds_to_partition_key_ranges' can be resized. http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@71 PS6, Line 71: : RangeBounds range_bounds; : std::vector<PartitionKeyRange> partition_key_ranges; : }; : > I see this method is indirectly called in NextPartitionKey() method, and th The issue with using the pre-computed result is that PartitionPruner::RemovePartitionKeyRange potentially modifies the size of the field every iteration, so we need to constantly be checking the size of this field. http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@241 PS6, Line 241: redicate > nit: could use 'auto' here as well? Done http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@243 PS6, Line 243: k; > nit: per code style, the asterisk should stick to the type, not variable Done http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521 PS6, Line 521: ge_it > nit: for better readability, maybe name this iterator 'range_it'? Done http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521 PS6, Line 521: > nit: could establish a reference variable to refer to partition_key_range.s Done http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@523 PS6, Line 523: ((*rang > nit: prefer using prefix increment operators -- they might be faster becaus Done http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@557 PS6, Line 557: > nit: move '&&' to the previous line Done -- To view, visit http://gerrit.cloudera.org:8080/17643 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7 Gerrit-Change-Number: 17643 Gerrit-PatchSet: 7 Gerrit-Owner: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Sat, 31 Jul 2021 09:02:36 +0000 Gerrit-HasComments: Yes
