Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14177 )

Change subject: WIP KUDU-2780: create thread for auto-rebalancing
......................................................................


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@377
PS3, Line 377:   unordered_set<string> tablets_in_move;
             :   for (const auto& move : moves) {
             :     vector<string> tablet_ids;
             :     RETURN_NOT_OK(rebalancer_->FindReplicas(move, raw_info, 
&tablet_ids));
             :     if (cross_location) {
             :       // In case of cross-location (a.k.a. inter-location) 
rebalancing it is
             :       // necessary to make sure the majority of replicas would 
not end up
             :       // at the same location after the move. If so, remove 
those tablets
             :       // from the list of candidates.
             :       
RETURN_NOT_OK(rebalancer_->FilterCrossLocationTabletCandidates(
             :           cluster_info.locality.location_by_ts_id, tpi, move, 
&tablet_ids));
             :     }
             :     // Shuffle the set of the tablet identifiers: that's to 
achieve even spread
             :     // of moves across tables with the same skew.
             :     std::shuffle(tablet_ids.begin(), tablet_ids.end(), 
random_device());
             :     string move_tablet_id;
             :     for (const auto& tablet_id : tablet_ids) {
             :       if (tablets_in_move.find(tablet_id) == 
tablets_in_move.end()) {
             :         // For now, choose the very first tablet that does not 
have replicas
             :         // in move. Later on, additional logic might be added to 
find
             :         // the best candidate.
             :         move_tablet_id = tablet_id;
             :         break;
             :       }
             :     }
> Oops, actual response:
Yeah, maybe it'd make sense as a part of the rebalance namespace. 30 lines 
worth of well-commented code definitely ought to be reused if possible, 
especially since a lot of it looks more or less identical.


http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@615
PS3, Line 615:           ReplicaTypeFilter::ANY_REPLICA,
             :           &locs_pb,
             :           &ts_infos_dict,
             :           boost::none));
             :
             :       for (const auto& r : locs_pb.interned_replicas()) {
             :         int index = r.ts_info_idx();
             :         const TSInfoPB ts_info = 
*(ts_infos_dict.ts_info_pbs[index]);
             :         ReplicaSummary rep;
             :         rep.ts_uuid = ts_info.permanent_uuid();
             :         const auto& addr = ts_info.rpc_addresses(0);
             :         rep.ts_address = Substitute("$0:$1", addr.host(), 
addr.port());
             :         replicas.push_back(rep);
             :       }
             :
             :       tablet_summary.replicas.swap(replicas);
             :       tablet_summaries.push_back(std::move(tablet_summary));
             :     }
             :
             :     table_summaries.push_back(std::move(table_summary));
             :   }
             :
             :   if (!location) {
             :     // Information on the whole cluster.
             :     raw_info->tserver_summaries = std::move(tserver_summaries);
             :     raw_info->tablet_summaries = std::move(tablet_summaries);
             :     raw_info->table_summaries = std::move(table_summaries);
             :   } else {
             :     // Information on the specified location only: filter out 
non-relevant info.
             :     const auto& location_str = *location;
             :
             :     unordered_set<string> ts_ids_at_location;
             :     for (const auto& summary : tserver_summaries) {
             :       if (summary.ts_location == location_str) {
             :         raw_info->tserver_summaries.push_back(summary);
             :         InsertOrDie(&ts_ids_at_location, summary.uuid);
             :       }
             :     }
             :
             :     unordered_set<string> table_ids_at_location;
             :     for (const auto& summary : tablet_summaries) {
             :
> table id's can be inserted into table_ids_at_location in the above for loop
Ah, I would have thought we could build the ts_at_locations up front and then 
filtered the table_infos as we built up tablet_summaries and table_summaries 
(e.g. in the above loop, if none of the tablet_summaries contain a replica in 
given location, don't push_back the table_summary). I guess that would be a lot 
of potentially unnecessary map lookups in the common case (no locations), but 
it might also be less code.



--
To view, visit http://gerrit.cloudera.org:8080/14177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4
Gerrit-Change-Number: 14177
Gerrit-PatchSet: 5
Gerrit-Owner: Hannah Nguyen <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hannah Nguyen <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Thu, 19 Sep 2019 22:10:53 +0000
Gerrit-HasComments: Yes

Reply via email to