Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 18: (13 comments) http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc@424 PS18, Line 424: Inject latency. nit: Inject latency to make tablet's Raft configuration unstable due to frequent leader re-elections. http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc@429 PS18, Line 429: the latency. nit: frequent leadership changes. http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h File src/kudu/master/auto_rebalancer.h: http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@157 PS18, Line 157: nor 'completion_status' That doesn't exist anymore. http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@159 PS18, Line 159: 'completion_status' is set to the : // corresponding move status: Status::OK() if the move succeeded, or : // non-OK status if it failed. ditto http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@163 PS18, Line 163: and 'completion_status' ditto http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@164 PS18, Line 164: contain valid information. This isn't the case with the new implementation: now with even non-OK return status the 'is_complete' can contain valid information. Which is confusing, if talking about my opinion. http://gerrit.cloudera.org:8080/#/c/14177/17/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/17/src/kudu/master/auto_rebalancer.cc@748 PS17, Line 748: TS $1 -> TS $2 m > This was a bug in earlier iterations! I realized after reading through the Right, the dest_uuid() is the target for the GetConsensusState RPC: it's not about replica movements :) http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@120 PS18, Line 120: int64 Would int32/uint32 suffice here? http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@120 PS18, Line 120: auto_rebalancing_timeout_seconds This naming looks a bit confusing because it's too generic. Maybe, rename into auto_rebalancing_rpc_timeout_seconds or alike? http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@374 PS18, Line 374: replica_moves->swap(rep_moves); nit: using std::move() is preferred in such cases: *replica_moves = std::move(rep_moves); http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@669 PS18, Line 669: if (!s.ok()) { : replica_moves->erase(replica_moves->begin() + i); : LOG(WARNING) << Substitute("Could not move replica: $0", s.ToString()); : return s; : } : : // If move was completed, remove it from 'replica_moves'. : if (move_is_complete) { : indexes_to_remove.push_back(i); : } I think this code doesn't bode well anymore with new semantics of the CheckMoveCompleted(): it can now return non-OK status, but set 'move_is_complete' to true. Is it correct to warn and return non-OK at line 672 without removing corresponding element from indexes_to_remove? Hows does the code work when some index is stuck there in such cases? http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@689 PS18, Line 689: unknown Why 'unknown'? I thought that was you who added this TODO, no? http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@706 PS18, Line 706: shared_ptr<ConsensusServiceProxy> proxy; nit: move closer to the point of usage, i.e. right before linke 711 -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 18 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Fri, 06 Mar 2020 19:42:23 +0000 Gerrit-HasComments: Yes
