Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 14: (14 comments) http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@732 PS12, Line 732: return Status::NotFound("tablet not found:", tablet_uuid); : } : DCHECK_EQ(1, resp.tablets_size()); : cstate = resp.tablets(0).cstate(); : : bool to_ts_uuid_in_config = false; : bool to_ts_uuid_is_a_voter = false; : for (const auto& peer : cstate.committed_config().peers()) { : if (peer.permanent_uuid() == to_ts_uuid) { : to_ts_uuid_in_config = true; : if (peer.member_type() == RaftPeerPB::VOTER) { : to_ts_uuid_is_a_voter = true; : } : break; : } : } : : // Failure case: newly added repli > Would it be okay to keep as-is and as close to CheckCompleteMove() as possi Yeah I'm fine improving this later. Could you add a TODO pointing to the other CheckCompleteMove() so it's clear to future interested parties that they should also look at the rebalancer? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc@209 PS14, Line 209: if (paused_) continue; Is this set anywhere? If not, remove it? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc@273 PS14, Line 273: all_moves_complete Do we need this? Could we just check if replica_moves is empty? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc@660 PS14, Line 660: *all_moves_complete = true; nit: it's generally best practice to not update out-params if the method fails. Could you update a local variable and only set this when we return OK? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc@663 PS14, Line 663: stack<int> moves_to_remove; Is the LIFO order here important? And is it important to actually pop everything down at L688? Could we get by with a vector and instead just iterate through the elements when we're removing them? If not, doc why using a stack here is important, otherwise it leaves readers with some questions. http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/auto_rebalancer.cc@670 PS14, Line 670: move_completion_status Not used? Is that intentional? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/ts_manager.cc File src/kudu/master/ts_manager.cc: http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/master/ts_manager.cc@214 PS14, Line 214: for (const TSDescriptorMap::value_type& entry : servers_by_id_) { nit: in C++11 you can use auto for this, e.g. for (const auto& entry : servers_by_id_) { const auto& ts = entry.second; ... } if you want http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.h File src/kudu/rebalance/rebalancer.h: http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.h@171 PS14, Line 171: The source and destination replicas are determined by the elements of the : // 'tablet_ids' container and tablet server UUIDs TableReplicaMove::from and : // TableReplica::to correspondingly. nit: not from this patch, but this makes it seem like 'tablet_ids' is an input parameter. Mind updating this? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.h@260 PS14, Line 260: ' nit: remove http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.h@261 PS14, Line 261: in move nit: it's easy to misread this as "replicas in 'move'". Maybe "replicas in 'tablets_in_move'" instead? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.h@264 PS14, Line 264: Status SelectReplicasToMove( nit: maybe also describe the behavior re: potentially returning an empty string when we're overreplicated? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.cc File src/kudu/rebalance/rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.cc@489 PS14, Line 489: std::mt19937 random_generator, Should this be passed as a pointer since it's an in-out param? Otherwise, isn't this passing a copy around? http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.cc@487 PS14, Line 487: const TableReplicaMove& move, : const unordered_map<string, TabletExtraInfo>& extra_info_by_tablet_id, : std::mt19937 random_generator, : vector<string> tablet_ids, : unordered_set<string>* tablets_in_move, : vector<Rebalancer::ReplicaMove>* replica_moves) { nit: indent by four spaces http://gerrit.cloudera.org:8080/#/c/14177/14/src/kudu/rebalance/rebalancer.cc@497 PS14, Line 497: string nit: make this a const ref? -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 14 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 17 Feb 2020 04:47:40 +0000 Gerrit-HasComments: Yes
