Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17643 )

Change subject: [pruning] KUDU-2671: Pruning compatible with custom hash 
schemas.
......................................................................


Patch Set 7:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/3/src/kudu/common/partition_pruner-test.cc@1366
PS3, Line 1366:
> Yep, I agree that ASSERT_OK() is a better choice here.
Ack


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc
File src/kudu/common/partition_pruner-test.cc:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1490
PS6, Line 1490: }
> I'm not sure whether I saw testcases to cover situations when scan_spec.Can
Custom hash schemas per range shouldn't change how ScanSpec::CanShortCircuit() 
functions. When checking if it can bypass future checks, the function checks if 
the lower bound set is greater than the upper bound set. It also checks for 
empty predicates or InList predicates where no list is provided. Both of these 
are independent of custom hash schemas per range so IMO I don't think it's 
necessary to add a test to cover this case.


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner-test.cc@1491
PS6, Line 1491: } // namespace kudu
> Also, does it make sense to add a few test cases to cover situations when r
TestHashSchemasPerRangePruning has a PK of (A, B, C) and is range partitioned 
on only one column of the PK (C). Is that test case enough coverage or do you 
think I should add more?


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h
File src/kudu/common/partition_pruner.h:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@46
PS6, Line 46:
> nit: why not just RangeBounds -- could there be any other than lower and up
nope, will change this to 'RangeBounds'.


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@48
PS6, Line 48: ounds(std
> What is semantic significance of using std::pair instead of struct here?  D
Nope, the pairs aren't related. Each pair contains the range bounds and the 
respective partition key ranges. I changed all the typedefs to structs because 
I thought it made more sense. I'm acutely aware that the struct of 
'RangeBounds' and 'PartitionKeyRange' can me merged into one if the field names 
match, but I'm not sure if it's better to have a separation of concerns or 
merge them. Please let me know your thoughts on this.

FYI, I needed the default constructors so the newly named field 
'range_bounds_to_partition_key_ranges' can be resized.


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.h@71
PS6, Line 71:
            :     RangeBounds range_bounds;
            :     std::vector<PartitionKeyRange> partition_key_ranges;
            :   };
            :
> I see this method is indirectly called in NextPartitionKey() method, and th
The issue with using the pre-computed result is that 
PartitionPruner::RemovePartitionKeyRange potentially modifies the size of the 
field every iteration, so we need to constantly be checking the size of this 
field.


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@241
PS6, Line 241: redicate
> nit: could use 'auto' here as well?
Done


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@243
PS6, Line 243: k;
> nit: per code style, the asterisk should stick to the type, not variable
Done


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521
PS6, Line 521: ge_it
> nit: for better readability, maybe name this iterator 'range_it'?
Done


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@521
PS6, Line 521: 
> nit: could establish a reference variable to refer to partition_key_range.s
Done


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@523
PS6, Line 523: ((*rang
> nit: prefer using prefix increment operators -- they might be faster becaus
Done


http://gerrit.cloudera.org:8080/#/c/17643/6/src/kudu/common/partition_pruner.cc@557
PS6, Line 557:
> nit: move '&&' to the previous line
Done



--
To view, visit http://gerrit.cloudera.org:8080/17643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I05c37495430f61a2c6f6012c72251138aee465b7
Gerrit-Change-Number: 17643
Gerrit-PatchSet: 7
Gerrit-Owner: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 31 Jul 2021 09:02:36 +0000
Gerrit-HasComments: Yes

Reply via email to