Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16859 )

Change subject: [master] Range specific hashing at table creation time.
......................................................................


Patch Set 6:

(5 comments)

I think this patch looks OK given it's scoped to handling CreateTable requests. 
That said, there are some serious limitations that are worth calling out and 
addressing in subsequent patches.

http://gerrit.cloudera.org:8080/#/c/16859/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16859/6//COMMIT_MSG@12
PS6, Line 12:
While this appears to create the correct partitions, it looks like the 
underlying tablets won't get correct partition schemas. I am OK with doing that 
in the next patch, but it's worth calling that out as a limitation here.


http://gerrit.cloudera.org:8080/#/c/16859/4/src/kudu/common/partition-test.cc
File src/kudu/common/partition-test.cc:

http://gerrit.cloudera.org:8080/#/c/16859/4/src/kudu/common/partition-test.cc@1157
PS4, Line 1157:   struct PartitionData {
> The only reason I can think of now is concerning the metadata. Currently, t
Ah, thanks for calling this out. I looked through the rest of the create table 
path, and it seems like there's a bit more that needs to be done before we're 
done with the master. I left some comments in relevant code.

The work can be in a later patch, but it's probably also worth mentioning the 
limitations of this patch in its commit message.


http://gerrit.cloudera.org:8080/#/c/16859/5/src/kudu/common/partition-test.cc
File src/kudu/common/partition-test.cc:

http://gerrit.cloudera.org:8080/#/c/16859/5/src/kudu/common/partition-test.cc@1218
PS5, Line 1218:     if (d1.range_key_start == d2.range_key_start) {
              :       return d1.partition_key_start < d2.partition_key_start;
              :     }
              :     return d1.range_key_start < d2.range_key_start;
> They're all unique, but if I don't check for the range_key_start then they
Ah I forgot for a second that the comparator here is for sorting and not for 
uniqueness in a set :)


http://gerrit.cloudera.org:8080/#/c/16859/6/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/16859/6/src/kudu/master/catalog_manager.cc@1852
PS6, Line 1852: partition_schema
We should also pass 'range_hash_schemas' here. As is, looks like we're dropping 
per-range metadata on the floor, meaning our PartitionPBs might not make sense. 
As you mentioned in the other thread, since we don't have more info in our 
Partition class and PartitionSchema classes, we're missing out on using the 
correct hash buckets. e.g. if we specify a 10 hashes for a range when the 
default hash buckets is 1, we'd have a bunch of PartitionPBs that point at 
buckets [0-9] and fail when we realize that our hash bucket values don't match 
our partition schema. This would probably show itself if we tried to write to 
or scan from a table. We'd probably also see this if we were to try examining 
the tablets on the tablet servers and checking for their partition schemas.


http://gerrit.cloudera.org:8080/#/c/16859/6/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/16859/6/src/kudu/master/master.proto@178
PS6, Line 178:   // The table's partitioning schema.
             :   optional PartitionSchemaPB partition_schema = 9;
Should also include the PerRangeHashBucketSchemasPB here.



--
To view, visit http://gerrit.cloudera.org:8080/16859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f0dcbc3324f8f2d6e99b4d169fdf5c7f7dff95d
Gerrit-Change-Number: 16859
Gerrit-PatchSet: 6
Gerrit-Owner: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Comment-Date: Thu, 17 Dec 2020 03:07:20 +0000
Gerrit-HasComments: Yes

Reply via email to