Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14154 )

Change subject: KUDU-2914: Rebalance tool support moving replicas from some 
specific tablet servers
......................................................................


Patch Set 13:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc
File src/kudu/rebalance/rebalance-test.cc:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@156
PS12, Line 156:       }
> const ?
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@157
PS12, Line 157: in()
> nit: fix the alignment
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@197
PS12, Line 197:                       rhs.servers_by_total_replica_count);
              : }
> nit: is it possible to templatize bool HasSameContents(const ServersByCount
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@271
PS12, Line 271: Empty
> I think this method by itself doesn't imply working on ignored or other sub
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@293
PS12, Line 293:           { "tablet_a0", "table_a", { { "ts_0", true }, }, },
> I think it's worth adding a comment explaining the essence of the constrain
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@296
PS12, Line 296:         },
> Add SCOPED_TRACE() to get more information on the context if ASSERT_FALSE()
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance-test.cc@527
PS12, Line 527:
> if moving replicas from
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance_algo.h
File src/kudu/rebalance/rebalance_algo.h:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalance_algo.h@92
PS12, Line 92: on
> nit: on
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalancer.cc
File src/kudu/rebalance/rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/rebalance/rebalancer.cc@450
PS12, Line 450:       const int* replica_count = 
FindOrNull(tserver_replicas_count, ignored_tserver);
              :       if (!replica_count) {
> Would FindOrNull() work here?
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@273
PS12, Line 273:       "--ignored_tservers=" + JoinStrings(ignored_tservers, 
","),
              :       "--move_replicas_from_ignored_tserv
> Join these two strings.
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@701
PS12, Line 701:  // Wait for the catalog manager to unde
> If an ignored tablet server went down
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@702
PS12, Line 702:
> server
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@706
PS12, Line 706:
> nit: move this to the next line
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@757
PS12, Line 757: ts()) {
> I think it might be important to know which tablet server is in progress of
It's quiet difficult to know which tablet server is in progress of copying 
data, through ksck tool we could only know some tablets are copying data 
(tablets contains 4 replicas and one of replicas is not running), but which 
replica would be removed is not sure.

The second case is similar to the case where the ignored tablet server goes 
down before run the rebalancing.


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool-test.cc@795
PS12, Line 795:   }
> Is there some significance behind starting the tablet server back?  If yes,
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool.cc@639
PS12, Line 639:
> healthy non-ignored
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/rebalancer_tool.cc@1243
PS12, Line 1243:     auto s = CheckComplet
> I'm not sure this is correct by the semantics of the 'has_updates' variable
I think the 'has_updates' variable indicates whether the statuses of 
in-progress moves has been updated.

In RebalancerTool::RunWith(Runner* runner, RunStatus* result_status) method, if 
no updates, the rebalancer tool will re-synchronize the state of the cluster.  
I think it's not necessary to re-synchronize the cluster state when the move 
has not been completed.


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/tool_action_cluster.cc
File src/kudu/tools/tool_action_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/tool_action_cluster.cc@131
PS12, Line 131: the source tablet
> the source tablet server
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/tool_action_cluster.cc@132
PS12, Line 132:             "This setting is effective only if the 
'--ignored_tservers' flag "
> nit: add a space between 'flag' and '"'
Done


http://gerrit.cloudera.org:8080/#/c/14154/12/src/kudu/tools/tool_action_cluster.cc@134
PS12, Line 134: then all ignored tablet servers must be placed into "
              :             "the 'maintenance mode'.");
> then all ignored tablet servers must be placed into the 'maintenance mode'
Done



--
To view, visit http://gerrit.cloudera.org:8080/14154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86cfb740030946c13db1a9ca63d241f4907d6c89
Gerrit-Change-Number: 14154
Gerrit-PatchSet: 13
Gerrit-Owner: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Hannah Nguyen <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Comment-Date: Fri, 15 Nov 2019 08:01:36 +0000
Gerrit-HasComments: Yes

Reply via email to