Hannah Nguyen has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: WIP KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 4: (2 comments) some straggler comment responses http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@377 PS3, Line 377: unordered_set<string> tablets_in_move; : for (const auto& move : moves) { : vector<string> tablet_ids; : RETURN_NOT_OK(rebalancer_->FindReplicas(move, raw_info, &tablet_ids)); : if (cross_location) { : // In case of cross-location (a.k.a. inter-location) rebalancing it is : // necessary to make sure the majority of replicas would not end up : // at the same location after the move. If so, remove those tablets : // from the list of candidates. : RETURN_NOT_OK(rebalancer_->FilterCrossLocationTabletCandidates( : cluster_info.locality.location_by_ts_id, tpi, move, &tablet_ids)); : } : // Shuffle the set of the tablet identifiers: that's to achieve even spread : // of moves across tables with the same skew. : std::shuffle(tablet_ids.begin(), tablet_ids.end(), random_device()); : string move_tablet_id; : for (const auto& tablet_id : tablet_ids) { : if (tablets_in_move.find(tablet_id) == tablets_in_move.end()) { : // For now, choose the very first tablet that does not have replicas : // in move. Later on, additional logic might be added to find : // the best candidate. : move_tablet_id = tablet_id; : break; : } : } > Done Oops, actual response: This is pulled from just part of RebalancerTool::AlgoBasedRunner::GetNextMovesImpl(). The original function has to deal with the possibility of unhealthy tservers/tablets, filtering out the tservers and moves that are already scheduled/in flight. I didn't know if it was worth pulling out 30ish lines of code out of a member function of a class in a different namespace--do you think it's worth refactoring? http://gerrit.cloudera.org:8080/#/c/14177/3/src/kudu/master/auto_rebalancer.cc@452 PS3, Line 452: } > We called CatalogManager::GetTabletLocations() when we computed the cluster In AutoRebalancerTask::BuildClusterRawInfo(), CatalogManager::GetTabletLocations() is called once for every single tablet in the cluster. Here, we're calling only calling it again for the tablets that are involved in the specified replica_moves. We won't know in BuildClusterRawInfo() which tablets will fall in that subset, so I didn't think it was worth stashing the information for all the tablets. Plus, the size of replica_moves is limited too. -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 4 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 19 Sep 2019 20:55:58 +0000 Gerrit-HasComments: Yes
