Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14177 )

Change subject: KUDU-2780: create thread for auto-rebalancing
......................................................................


Patch Set 18:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc@424
PS18, Line 424: Inject latency.
nit: Inject latency to make tablet's Raft configuration unstable due to 
frequent leader re-elections.


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer-test.cc@429
PS18, Line 429: the latency.
nit: frequent leadership changes.


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h
File src/kudu/master/auto_rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@157
PS18, Line 157: nor 'completion_status'
That doesn't exist anymore.


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@159
PS18, Line 159: 'completion_status' is set to the
              :   // corresponding move status: Status::OK() if the move 
succeeded, or
              :   // non-OK status if it failed.
ditto


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@163
PS18, Line 163: and 'completion_status'
ditto


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.h@164
PS18, Line 164: contain valid information.
This isn't the case with the new implementation: now with even non-OK return 
status the 'is_complete' can contain valid information.  Which is confusing, if 
talking about my opinion.


http://gerrit.cloudera.org:8080/#/c/14177/17/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/14177/17/src/kudu/master/auto_rebalancer.cc@748
PS17, Line 748: TS $1 -> TS $2 m
> This was a bug in earlier iterations! I realized after reading through the
Right, the dest_uuid() is the target for the GetConsensusState RPC: it's not 
about replica movements :)


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@120
PS18, Line 120: int64
Would int32/uint32 suffice here?


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@120
PS18, Line 120: auto_rebalancing_timeout_seconds
This naming looks a bit confusing because it's too generic.  Maybe, rename into 
auto_rebalancing_rpc_timeout_seconds or alike?


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@374
PS18, Line 374: replica_moves->swap(rep_moves);
nit: using std::move() is preferred in such cases:
  *replica_moves = std::move(rep_moves);


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@669
PS18, Line 669:     if (!s.ok()) {
              :       replica_moves->erase(replica_moves->begin() + i);
              :       LOG(WARNING) << Substitute("Could not move replica: $0", 
s.ToString());
              :       return s;
              :     }
              :
              :     // If move was completed, remove it from 'replica_moves'.
              :     if (move_is_complete) {
              :       indexes_to_remove.push_back(i);
              :     }
I think this code doesn't bode well anymore with new semantics of the 
CheckMoveCompleted(): it can now return non-OK status, but set 
'move_is_complete' to true.  Is it correct to warn and return non-OK at line 
672 without removing corresponding element from indexes_to_remove?  Hows does 
the code work when some index is stuck there in such cases?


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@689
PS18, Line 689: unknown
Why 'unknown'?  I thought that was you who added this TODO, no?


http://gerrit.cloudera.org:8080/#/c/14177/18/src/kudu/master/auto_rebalancer.cc@706
PS18, Line 706: shared_ptr<ConsensusServiceProxy> proxy;
nit: move closer to the point of usage, i.e. right before linke 711



--
To view, visit http://gerrit.cloudera.org:8080/14177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4
Gerrit-Change-Number: 14177
Gerrit-PatchSet: 18
Gerrit-Owner: Hannah Nguyen <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hannah Nguyen <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 06 Mar 2020 19:42:23 +0000
Gerrit-HasComments: Yes

Reply via email to