Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16859 )

Change subject: [master] Range specific hashing at table creation time.
......................................................................


Patch Set 3:

(10 comments)

Overall looks good. Mostly nits and some test suggestions.

http://gerrit.cloudera.org:8080/#/c/16859/2/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/16859/2/src/kudu/common/partition.cc@418
PS2, Line 418:   if (!split_rows.empty()) {
             :     for (const auto& hash_schemas : range_hash_schemas) {
             :       if (!hash_schemas.empty()) {
             :         return Status::InvalidArgument("Both 'split_rows' and 
'range_hash_schemas' cannot be "
             :                                        "populated at the same 
time.");
             :       }
             :     }
> I meant to say if the user was using the new client API, the size of 'range
Ack


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/common/partition.cc@417
PS3, Line 417:
             :   if (!split_rows.empty()) {
             :     for (const auto& hash_schemas : range_hash_schemas) {
             :       if (!hash_schemas.empty()) {
             :         return Status::InvalidArgument("Both 'split_rows' and 
'range_hash_schemas' cannot be "
             :                                        "populated at the same 
time.");
             :       }
             :     }
             :   }
             :
             :   if (!range_hash_schemas.empty() && range_bounds.size() != 
range_hash_schemas.size()) {
             :     return Status::InvalidArgument("The number of range bounds 
does not match the number of user"
             :                                    "supplied hash schemas.");
             :   }
Do we have a partition-test case in which split rows is non-empty and 
range_hash_schemas is also non-empty but contains empty schemas? If not, can 
you add one?


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc
File src/kudu/integration-tests/table_locations-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc@216
PS3, Line 216:   auto* partition_schema_pb = req.mutable_partition_schema();
Just FYI, this renders req.has_partition_schema() = true

See https://developers.google.com/protocol-buffers/docs/reference/cpp-generated 
and search for "mutable_"


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc@459
PS3, Line 459: w}
nit: add a space


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc@473
PS3, Line 473: user
nit: these are all "user" schemas. "per-range" would be more accurate.


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc@473
PS3, Line 473: table wide
nit: capitalize "Table-wide"


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/integration-tests/table_locations-itest.cc@491
PS3, Line 491:  // Tablets returned in sorted order by partition key range.
             :     EXPECT_EQ(string("\0\0\0\0" "\0\0\0\0" "a" "\0\0", 11),
             :               
resp.tablet_locations(0).partition().partition_key_start());
             :     EXPECT_EQ(string("\0\0\0\0" "\0\0\0\1" "a" "\0\0", 11),
             :               
resp.tablet_locations(1).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\0" "c" "\0\0", 7),
             :               
resp.tablet_locations(2).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\0" "e" "\0\0", 7),
             :               
resp.tablet_locations(3).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\1" "\0\0\0\0" "a" "\0\0", 11),
             :               
resp.tablet_locations(4).partition().partition_key_start());
             :     EXPECT_EQ(string("\0\0\0\1" "\0\0\0\1" "a" "\0\0", 11),
             :               
resp.tablet_locations(5).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\1" "c" "\0\0", 7),
             :               
resp.tablet_locations(6).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\1" "e" "\0\0", 7),
             :               
resp.tablet_locations(7).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\2" "\0\0\0\0" "a" "\0\0", 11),
             :               
resp.tablet_locations(8).partition().partition_key_start());
             :     EXPECT_EQ(string("\0\0\0\2" "\0\0\0\1" "a" "\0\0", 11),
             :               
resp.tablet_locations(9).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\2" "c" "\0\0", 7),
             :               
resp.tablet_locations(10).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\2" "e" "\0\0", 7),
             :               
resp.tablet_locations(11).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\3" "\0\0\0\0" "a" "\0\0", 11),
             :               
resp.tablet_locations(12).partition().partition_key_start());
             :     EXPECT_EQ(string("\0\0\0\3" "\0\0\0\1" "a" "\0\0", 11),
             :               
resp.tablet_locations(13).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\3" "c" "\0\0", 7),
             :               
resp.tablet_locations(14).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\3" "e" "\0\0", 7),
             :               
resp.tablet_locations(15).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\4" "c" "\0\0", 7),
             :               
resp.tablet_locations(16).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\4" "e" "\0\0", 7),
             :               
resp.tablet_locations(17).partition().partition_key_start());
             :
             :     EXPECT_EQ(string("\0\0\0\5" "c" "\0\0", 7),
             :               
resp.tablet_locations(18).partition().partition_key_start());
nit: it would be easier to read and extend as:

set<string> key_starts({
    string(...partitions for a),
    ...
    string(...partitions for c),
    ...
    string(...partitions for e),
    ...
});
vector<string> sorted_key_starts(key_starts.begin(), key_starts.end());
ASSERT_EQ(key_starts.size(), resp.tablet_locations_size());
for (int i = 0; i < sorted_key_starts.size(); i++) {
  EXPECT_EQ(sorted_key_starts[i], 
resp.tablet_locations(i).partition().partition_key_start());
}

That way it's really obvious that we have the right number of hashes per range, 
and it's also clear what "correct" ordering is.


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/master/master.proto@507
PS3, Line 507: per user's request
nit: remove?


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/master/master.proto@507
PS3, Line 507: per
nit: for


http://gerrit.cloudera.org:8080/#/c/16859/3/src/kudu/master/master.proto@508
PS3, Line 508: Only populated when user creates range bounds and not split 
rows, populated in the same order.
nit: we can be more specific and refer to other fields. Also mind filling this 
out with proper grammar? How about:

"Only populated when 'split_rows_range_bounds' specifies range bounds, and must 
be empty if any split rows are specified. If set, must have the same number of 
elements as there are range bounds, and must be in the same order. If empty, 
'partition_schema' is assumed for every range bound."



--
To view, visit http://gerrit.cloudera.org:8080/16859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f0dcbc3324f8f2d6e99b4d169fdf5c7f7dff95d
Gerrit-Change-Number: 16859
Gerrit-PatchSet: 3
Gerrit-Owner: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Comment-Date: Tue, 15 Dec 2020 08:42:39 +0000
Gerrit-HasComments: Yes

Reply via email to