Hello Mahesh Reddy, Tidy Bot, Kudu Jenkins, Andrew Wong,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18045

to look at the new patch set (#5).

Change subject: KUDU-2671 number of per-range hash dimensions should be fixed 
for now
......................................................................

KUDU-2671 number of per-range hash dimensions should be fixed for now

As it turned out, updating the client's metacache, the system catalog's
logic, and the partition pruner to accommodate partition keys with
a variable-size hash part is a substantial effort in itself.  However,
we can still deliver the most frequently requested functionality of
changing the number of hash buckets per range partition by restricting
the size of the hash part of a partition key: it must be the same
across all the range partitions in a table.

So, this patch adds a new restriction for per-range custom schemas:
the number of hash dimensions must be the same for all the ranges in
a table.  Since the absence of hash bucketing is equivalent to having
zero hash dimensions for a table's range, it's not possible for
a particular range to have no hash bucketing at all if the rest of the
ranges in the table have non-trivial hash schemas.

With the introduced restriction, it's still possible to change other
parameters used to define a hash schema per range in any hash dimension:
  * the number of hash buckets (NOTE: the number of hash buckets must be
    greater than or equal to two)
  * the set of columns for the hash bucketing
  * the seed for the hash function
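
The restriction and the parameters that may still vary can be sketched
roughly as follows.  This is only an illustration: HashDimension,
HashSchema, and ValidateRangeHashSchemas() are hypothetical, simplified
stand-ins and not the actual Kudu types or the code in
catalog_manager.cc, and comparing each range against the table-wide
schema is an assumption about how the check could be expressed.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only; these are not
// the actual Kudu types.
struct HashDimension {
  std::vector<std::string> columns;  // columns used for the hash bucketing
  int32_t num_buckets;               // must be greater than or equal to two
  uint32_t seed;                     // seed for the hash function
};

// One entry per hash dimension; an empty schema means no hash bucketing.
using HashSchema = std::vector<HashDimension>;

// Returns an error message if any per-range hash schema violates the
// restrictions, or std::nullopt if all of them are acceptable.
std::optional<std::string> ValidateRangeHashSchemas(
    const HashSchema& table_wide,
    const std::vector<HashSchema>& per_range) {
  for (const auto& schema : per_range) {
    // The number of hash dimensions must be the same for all the ranges.
    if (schema.size() != table_wide.size()) {
      return "varying number of hash dimensions per range is not supported";
    }
    // Columns, seed, and bucket count may differ from range to range,
    // but every dimension needs at least two buckets.
    for (const auto& dim : schema) {
      if (dim.num_buckets < 2) {
        return "at least two hash buckets are required per hash dimension";
      }
    }
  }
  return std::nullopt;
}
```

In this sketch, a range with an empty HashSchema is rejected whenever
the table-wide schema has at least one dimension, which corresponds to
the "no hash bucketing at all" case mentioned above.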

As a part of this changelist, a few test scenarios are now disabled:
those are to be re-enabled once the rest of the code in the system
catalog, the client metacache, and the partition pruner is able to
handle a varying number of hash dimensions.  In addition, new test
scenarios have been added to verify that the invariant of the same
number of hash dimensions across all the range partitions is properly
enforced at the server side while creating a table.

Also, I updated the comparison operator for PartitionKey: since the
number of hash dimensions doesn't vary across per-range hash schemas,
it's no longer necessary to concatenate the hash and the range parts to
provide the legacy ordering of partition keys for some edge cases that
arose only when the number of hash dimensions varied between the hash
schemas of different ranges.  The implementation of PartitionKey's
comparison operator might change if PartitionKey switches to a single
string under the hood, but at this point I decided to keep the hash and
the range parts separate.  To keep the code future-proof and easier to
review, I'm considering switching the range_key() and hash_key()
methods to string views (or Slice) in a follow-up changelist,
regardless of how the serialized partition key is represented under
the hood in PartitionKey.
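
To illustrate why the concatenation is no longer needed, here is
a minimal sketch.  SimplePartitionKey and LessConcatenated() are
hypothetical names, not the actual kudu::PartitionKey code, and the
sketch assumes the operator compares the hash part first and the range
part second.  When the hash parts of two keys have the same length, that
ordering coincides with the ordering of the concatenated strings; the
two can differ only when the hash parts differ in length.

```cpp
#include <string>
#include <tuple>

// Hypothetical, simplified stand-in; not the actual kudu::PartitionKey.
struct SimplePartitionKey {
  std::string hash_key;   // encoded hash part
  std::string range_key;  // encoded range part
};

// Compares the hash part first and the range part second, keeping the
// two parts separate.
inline bool operator<(const SimplePartitionKey& a,
                      const SimplePartitionKey& b) {
  return std::tie(a.hash_key, a.range_key) <
         std::tie(b.hash_key, b.range_key);
}

// Compares the keys as single strings made of the hash part followed by
// the range part.
inline bool LessConcatenated(const SimplePartitionKey& a,
                             const SimplePartitionKey& b) {
  return a.hash_key + a.range_key < b.hash_key + b.range_key;
}
```

For example, with hash parts of different lengths, the keys
{"a", "c"} and {"ab", ""} are ordered differently by the two functions,
whereas keys with equal-length hash parts such as {"aa", "z"} and
{"ab", "a"} are ordered the same way by both.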

Change-Id: Ic884fa556462b85c64d77385a521d9077d33c7c1
---
M src/kudu/client/flex_partitioning_client-test.cc
M src/kudu/common/partition.h
M src/kudu/common/partition_pruner-test.cc
M src/kudu/integration-tests/table_locations-itest.cc
M src/kudu/master/catalog_manager.cc
5 files changed, 422 insertions(+), 79 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/45/18045/5
--
To view, visit http://gerrit.cloudera.org:8080/18045
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic884fa556462b85c64d77385a521d9077d33c7c1
Gerrit-Change-Number: 18045
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
