Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17769 )

Change subject: KUDU-2671 key encoding for custom hash bucket schemas
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/17769/4/src/kudu/client/flex_partitioning_client-test.cc
File src/kudu/client/flex_partitioning_client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17769/4/src/kudu/client/flex_partitioning_client-test.cc@228
PS4, Line 228: shared_ptr<KuduTable> table;
             :     ASSERT_OK(client_->OpenTable(table_name, &table));
             :
             :     vector<scoped_refptr<TabletInfo>> all_tablets_info;
             :     {
             :       auto* cm = 
cluster_->mini_master(0)->master()->catalog_manager();
             :       CatalogManager::ScopedLeaderSharedLock l(cm);
             :       scoped_refptr<master::TableInfo> table_info;
             :       ASSERT_OK(cm->GetTableInfo(table->id(), &table_info));
             :       table_info->GetAllTablets(&all_tablets_info);
             :     }
             :     vector<scoped_refptr<TabletReplica>> replicas;
             :     for (const auto& tablet_info : all_tablets_info) {
             :       for (auto i = 0; i < cluster_->num_tablet_servers(); ++i) {
             :         scoped_refptr<TabletReplica> r;
             :         ASSERT_TRUE(cluster_->mini_tablet_server(i)->server()->
             :                         
tablet_manager()->LookupTablet(tablet_info->id(), &r));
             :         replicas.emplace_back(std::move(r));
             :       }
             :     }
             :
             :     int64_t count = 0;
             :     for (const auto& r : replicas) {
             :       count += r->CountLiveRowsNoFail();
             :     }
> nit: would client_->GetTableStatistics() work instead?
Yeah, but that's super slow -- even with 10ms tablet server to master hearbeat 
interval, it takes many retries under ASSERT_EVENTUALLY() to get this reported 
as needed.  Compare:

FlexPartitioningCreateTableTest.DefaultAndCustomHashBuckets (10261 ms)

vs

FlexPartitioningCreateTableTest.DefaultAndCustomHashBuckets (1007 ms)

I guess at some point we will remove this chunk of code anyways once partition 
pruning works, so maybe let's strive to keep the testing time lower until the 
partition pruning is correctly implemented?


http://gerrit.cloudera.org:8080/#/c/17769/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17769/3/src/kudu/common/partition.cc@1525
PS3, Line 1525:
> I see. I suppose an alternative would be
Yeah, that would make a very long string.  I think I'd rather revamp the 
terminology throughout the code here in a separate changelist:

  * hash bucket schema --> hash schema
  * hash partitioning schema --> hash schema
  * range partition schema --> range schema

With that, the naming would be more universal and there should be opportunities 
to shorten some lines in the code.

Does it make sense to you?


http://gerrit.cloudera.org:8080/#/c/17769/3/src/kudu/common/partition.cc@1528
PS3, Line 1528:
> Thanks for fixing that! It's much clearer now.
Thank you for pointing to the inconsistency -- it seems this class uses a 
clean-up on terminology used for various entities.


http://gerrit.cloudera.org:8080/#/c/17769/4/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17769/4/src/kudu/common/partition.cc@260
PS4, Line 260: std::distance(ranges_ptr->cbegin(), it)
> Rather than pointing to an index, would it make sense to have the value be
It might be an option and I started with that (well, I did use iterators to be 
able to spot those issues I described below), but then I realized it would be 
necessary to prohibit copying and moving instances of PartitionSchema or 
implement custom copying & moving operators.  Otherwise there would be dangling 
pointers -- as of now, there are use cases in the code when PartionSchema 
objects are being copied or moved.

I'd rather clean up the code in that regard and introduce custom copying/moving 
operators in a separate changelist, moving this into using iterators/pointers 
instead.

Is it OK with you?


http://gerrit.cloudera.org:8080/#/c/17769/4/src/kudu/common/partition.cc@1547
PS4, Line 1547: key
> nit: range_key
Done



--
To view, visit http://gerrit.cloudera.org:8080/17769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I81aa1c10998d88a1bd5314fc3132037b1c8bbe4b
Gerrit-Change-Number: 17769
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Comment-Date: Fri, 13 Aug 2021 19:40:25 +0000
Gerrit-HasComments: Yes

Reply via email to