Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 13: (17 comments) http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@64 PS13, Line 64: void AssignLocations(int num_locations, int num_tservers) { nit: could you add a comment explaining that this doesn't assign locations evenly, and that the first location will be assigned more tservers? Seems like that's important to know, since it likely means we'll be moving replicas away from the first location. Or if that wasn't intentional, maybe try evening things out a la https://github.com/apache/kudu/blob/master/src/kudu/integration-tests/maintenance_mode-itest.cc#L549 Looking at how this is used, maybe it's worth defining some kind of AssignLocationsWithSkew() or something http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@68 PS13, Line 68: TSDescriptorVector descs; : cluster_->mini_master(master_idx)->master()->ts_manager()-> : GetAllDescriptors(&descs); Probably want to put this into the ASSERT_EVENTUALLY too? Right now it's just checking the same size over and over again. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@106 PS13, Line 106: number_of_loop_iterations_for_test nit: use PascalCase for these, since they're not simple member accessors. Also no need to indicate "for_test" since these are all defined in the test. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@111 PS13, Line 111: return -1; nit: maybe instead of doing this, if we know we'll always have a cluster, do something like: int NumLoopIterations(int master_idx) { DCHECK_NE(cluster_, nullptr); return cluster_->mini_master(master_idx)->master()->catalo... } Same elsewhere. That way it sets readers' expectations w.r.t. when we can expect cluster_ to be unset (i.e. never, AFAICT) and provides cleaner semantics (no -1 special casing) http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@150 PS13, Line 150: <kN nit: spacing http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@151 PS13, Line 151: i==lead nit: spacing http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@153 PS13, Line 153: NE Could be even more explicit and say ASSERT_LT, right? Especially if -1 is a magic number that might be returned http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@189 PS13, Line 189: number_of_loop_iterations_for_test Maybe do a sanity check before shutting down the leader checking that this is 0 for all non-leaders? http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@220 PS12, Line 220: moves_scheduled_this_round_for_test_ = 0; Does this mean that replica_moves might not be empty? And if it's not empty, does it mean it's an in-progress move? If so, should this be passed into BuildClusterInfo? http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@236 PS12, Line 236: and BuildTa nit: mind annotating this with a comment? E.g. Status s = BuildClusterRawInfo(/*location*/boost::none, &raw_info); http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@283 PS12, Line 283: } : : } // end while : } nit: Could use WARN_NOT_OK http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@448 PS12, Line 448: up th nit: mind naming this 'replace' or something, and the below change 'add_non_voter' or something? Having them both be called 'change' is a bit confusing. http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@471 PS12, Line 471: vector<Sockaddr> resolved; Doesn't this mean that it'll be promoted to a voter immediately? Even without having been copied to? http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@680 PS12, Line 680: nit: spacing http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@696 PS12, Line 696: } : : return Status::OK(); Doesn't this mean that we're always erasing the move from replica_moves? If so, you could probably rewrite this like this instead of tracking indexes: for (const auto& move : replica_moves) { Status s = CheckMoveCompleted(move); replica_moves->erase(move); RETURN_NOT_OK(s); /* check move_is_complete */ } http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@732 PS12, Line 732: req.add_tablet_ids(tablet_uuid); : RETURN_NOT_OK(proxy->GetConsensusState(req, &resp, &rpc)); : if (resp.has_error()) { : return StatusFromPB(resp.error().status()); : } : if (resp.tablets_size() == 0) { : return Status::NotFound("tablet not found:", tablet_uuid); : } : DCHECK_EQ(1, resp.tablets_size()); : cstate = resp.tablets(0).cstate(); : : bool to_ts_uuid_in_config = false; : bool to_ts_uuid_is_a_voter = false; : for (const auto& peer : cstate.committed_config().peers()) { : if (peer.permanent_uuid() == to_ts_uuid) { : to_ts_uuid_in_config = true; : if (peer.member_type() == RaftPeerPB::VOTER) { : to_ts_uuid_is_a_voter = true Did you consider getting this information from the CatalogManager instead? Maybe by looking at the tablet_map_? The TabletInfo should have some consensus state in it. Ah, nevermind. I see you're just copying CheckCompleteMove() here and below, so nevermind, maybe? Still curious what you think. http://gerrit.cloudera.org:8080/#/c/14177/12/src/kudu/master/auto_rebalancer.cc@782 PS12, Line 782: // to voter. : if (orig_leader_uuid == from_ts_uuid && orig_leader_uuid == cstate.leader_uuid()) { : LeaderStepDownRequestPB req; : LeaderStepDownResponsePB resp; : RpcController rpc; : req.set_dest_uuid(orig_leader_uuid); : req.set_tablet_id(tablet_uuid); : req.set_mode(LeaderStepDownMode::GRACEFUL); : rpc.set_timeout(MonoDelta::FromSeconds(FLAGS_auto_rebalancing_timeout_seconds)); : RETURN_NOT_OK(proxy->LeaderStepDown(req, &resp, &rpc)); : if (resp.has_error()) { : return StatusFromPB(resp.error().status()); : } : } : : from_ts_uuid_in_config = true; : break; : } : } : : if (!from_ts_uuid_in_config && : ( Did you consider doing this in ExecuteMoves()? I found it a bit surprising to find it here -- I would've expected CheckMoveCompleted() to be a read-only kind of method. -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 13 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 03 Feb 2020 23:59:54 +0000 Gerrit-HasComments: Yes
