Mahesh Reddy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16859 )

Change subject: [master] Range specific hashing at table creation time.
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16859/4/src/kudu/common/partition-test.cc
File src/kudu/common/partition-test.cc:

http://gerrit.cloudera.org:8080/#/c/16859/4/src/kudu/common/partition-test.cc@1157
PS4, Line 1157:   struct PartitionData {
> I see, thanks for clarifying.
The only reason I can think of now is concerning the metadata. Currently, the 
metadata of the table contains the table wide hash schemas and the metadata for 
each partition contains the hash buckets, and both the start and end of its 
partition keys. If we decide to store the per-range schemas in the metadata 
(which I imagine we probably will?), then a change to either the 
Partition/PartitionSchema class will be necessary.


http://gerrit.cloudera.org:8080/#/c/16859/5/src/kudu/common/partition-test.cc
File src/kudu/common/partition-test.cc:

http://gerrit.cloudera.org:8080/#/c/16859/5/src/kudu/common/partition-test.cc@1166
PS5, Line 1166: {
              :           {0}, string("a1\0\0\0\0c1", 8), 
string("a2\0\0\0\0c2", 8),
              :               string("\0\0\0\0" "a1\0\0\0\0c1", 12), 
string("\0\0\0\0" "a2\0\0\0\0c2", 12)
              :       },
> nit: the GSG is a bit fuzz here since there are multiple layers of initiali
Done


http://gerrit.cloudera.org:8080/#/c/16859/5/src/kudu/common/partition-test.cc@1218
PS5, Line 1218:     if (d1.range_key_start == d2.range_key_start) {
              :       return d1.partition_key_start < d2.partition_key_start;
              :     }
              :     return d1.range_key_start < d2.range_key_start;
> Aren't all of the partition_key_start values unique? Can't we just return d
They're all unique, but if I don't check for the range_key_start then they 
won't be grouped by their ranges like they should be. For example, all the 
partitions with hash_buckets = 0 will be returned first since that causes their 
partition key to be less than the other partition keys.

For this test case, I actually don't need a sorting function since I insert 
them in order anyways. I just added it in case we want to come back to this at 
any point and insert them out of order.



--
To view, visit http://gerrit.cloudera.org:8080/16859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f0dcbc3324f8f2d6e99b4d169fdf5c7f7dff95d
Gerrit-Change-Number: 16859
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Comment-Date: Thu, 17 Dec 2020 02:01:45 +0000
Gerrit-HasComments: Yes

Reply via email to