Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 13: (9 comments) Haven't done a full pass yet, just noting some things unrelated to the auto-rebalancing part of the patch. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/catalog_manager.h@777 PS13, Line 777: master::TSManager* ts_manager() const; nit: not used? http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/catalog_manager.cc@1779 PS13, Line 1779: stalte nit: revert this? http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/catalog_manager.cc@5271 PS13, Line 5271: int CatalogManager::auto_rebalancer_iterations() const { : return auto_rebalancer_->number_of_loop_iterations_for_test_; : } : : int CatalogManager::auto_rebalancer_moves() const { : return auto_rebalancer_->moves_scheduled_this_round_for_test_; : } nit: Can't we get these via auto_rebalancer()->number_of_loop_iterations_for_tests_? http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.h File src/kudu/rebalance/rebalancer.h: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.h@248 PS13, Line 248: // A struct that nit: can probably omit these prefixes. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.h@258 PS13, Line 258: : // A function that finds destination tservers for tablets that are : // identified by the rebalancing greedy algorithm as tablets that : // should be moved. nit: could you update this to be a bit more descriptive of what it does, rather than how it's used? And describe how the arguments are used? http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.cc File src/kudu/rebalance/rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.cc@499 PS13, Line 499: if (tablets_in_move->find(tablet_id) == tablets_in_move->end()) { nit: use ContainsKey() for better readability? http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/rebalance/rebalancer.cc@494 PS13, Line 494: // Shuffle the set of the tablet identifiers: that's to achieve even spread : // of moves across tables with the same skew. : std::shuffle(tablet_ids->begin(), tablet_ids->end(), random_generator); : string move_tablet_id; : for (const auto& tablet_id : *tablet_ids) { : if (tablets_in_move->find(tablet_id) == tablets_in_move->end()) { : // For now, choose the very first tablet that does not have replicas : // in move. Later on, additional logic might be added to find : // the best candidate. : move_tablet_id = tablet_id; : break; : } : } : if (move_tablet_id.empty()) { : return Status::NotFound(Substitute( : "table $0: could not find any suitable replica to move " : "from server $1 to server $2", move.table_id, move.from, move.to)); : } : Rebalancer::ReplicaMove move_info; : move_info.tablet_uuid = move_tablet_id; : move_info.ts_uuid_from = move.from; : const auto& extra_info = FindOrDie(extra_info_by_tablet_id, move_tablet_id); : if (extra_info.replication_factor < extra_info.num_voters) { : // The number of voter replicas is greater than the target replication : // factor. It might happen the replica distribution would be better : // if just removing the source replica. Anyway, once a replica is removed, : // the system will automatically add a new one, if needed, where the new : // replica will be placed to have balanced replica distribution. : move_info.ts_uuid_to = ""; : } else { : move_info.ts_uuid_to = move.to; : } : replica_moves->emplace_back(std::move(move_info)); : // Mark the tablet as 'has a replica in move'. : tablets_in_move->emplace(move_tablet_id); : : return Status::OK(); nit: shift to the left a couple spaces http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/tools/rebalancer_tool.cc File src/kudu/tools/rebalancer_tool.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/tools/rebalancer_tool.cc@1020 PS13, Line 1020: SelectReplicasToMove(move, extra_info_by_tablet_id, random_generator_, : &tablet_ids, &tablets_in_move, replica_moves); WARN_NOT_OK to match the current behavior? Also this call duplicates all the work done above. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/tools/rebalancer_tool.cc@1021 PS13, Line 1021: &tablet_ids Since these are repopulated every loop iteration, how about passing them by copy/rvalue? I.e. change SelectReplicasToMove() to pass tablet_ids by copy, and std::move(tablet_ids) here. That way we're not making extra copies, we can still shuffle the tablet IDs in this call, and it also becomes clear that this is an input parameter, not an in-out or out parameter. -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 13 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 03 Feb 2020 21:22:54 +0000 Gerrit-HasComments: Yes
