Yifan Zhang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14154 )

Change subject: KUDU-2914: Rebalance tool support moving replicas from some 
specific tablet servers
......................................................................


Patch Set 7:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/14154/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14154/6//COMMIT_MSG@20
PS6, Line 20: other
            : tservers first, and then running the rebalancing on the other
            : tservers in the clu
> Why bother differentiating between unhealthy and healthy ignored servers? I
It is because the rebalancer tool only runs on healthy servers.


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/rebalance/rebalance-test.cc
File src/kudu/rebalance/rebalance-test.cc:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/rebalance/rebalance-test.cc@157
PS6, Line 157:     auto get_values = [](pair<ServersByCountMap::const_iterator,
> nit: for local variables, we prefer snake_case, or sometimes for lambdas, w
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/rebalance/rebalance_algo.h
File src/kudu/rebalance/rebalance_algo.h:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/rebalance/rebalance_algo.h@93
PS6, Line 93: tservers_to_empty;
> nit: "healthy"
I modified it to "tservers_to_empty"


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool-test.cc
File src/kudu/tools/rebalancer_tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool-test.cc@241
PS6, Line 241: ::testing::WithParamInterface<Kudu1097>
> Why bother testing the 3-2-3 replication? I don't think maintenance mode wi
I'll fix the test in the next patch set.


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.h
File src/kudu/tools/rebalancer_tool.h:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.h@322
PS6, Line 322:                    co
> nit: Could you add a class-wide comment explaining what the IgnoredTservers
I added some comments for 'GetReplaceMoves' in each class.


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc
File src/kudu/tools/rebalancer_tool.cc:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@639
PS6, Line 639:   Substitute("Too many ignored servers; $0 healthy servers exist 
but $1 are require
> nit: for returned statuses, we generally just use a single line message. Al
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1217
PS6, Line 1217:   LOG(INFO) << Su
> nit: this is trivially true based on the line above, so maybe remove it?
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1302
PS6, Line 1302:   // desitnation servers for the move of the replica marked with
              :   // the REPLACE attribute is not known.
              :   // Load the least loaded (in terms of
> nit: while I appreciate you taking care of this TODO, could you move this c
OK.


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1337
PS6, Line 1337:   const auto op_count = op_count_per_ts_[ts_uuid]++;
              :   const auto op_range = ts_per_op_count_.equal_range(op_count);
              :   bool ts_op_count_updated = false;
              :   for (auto it = op_range.first; it != op_range.second; ++it) {
              :     if (it->second == ts_uuid) {
              :       ts_per_op_count_.erase(it);
              :       ts_per_op_count_.emplace(op_count + 1, ts_uuid);
              :       ts_op_count_updated = true;
              :       break;
              :     }
              :   }
              :   D
> Do you know why the policy fixer didn't do this before?
This part has something to do with the TODO above, I'll move them all to a 
seperate patch.


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1421
PS6, Line 1421: d tserv
> nit: ignored
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1482
PS6, Line 1482: sKey(ci.t
> nit: tserver ID
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/rebalancer_tool.cc@1506
PS6, Line 1506: et_id, elem.first, "", t
> nit: fix spacing
Done


http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/tool_action_cluster.cc
File src/kudu/tools/tool_action_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/14154/6/src/kudu/tools/tool_action_cluster.cc@131
PS6, Line 131:  If set true, "
             :             "the healthy ignored servers should be set into 
maintenance mode first.
> While this is the most effective way to decommission a tserver, I don't thi
Agreed. But now this tool is stongly dependent on the 'maintenance mode'. If 
use this flag,  we should make sure that new replicas will not be placed on the 
specified tservers.



--
To view, visit http://gerrit.cloudera.org:8080/14154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I86cfb740030946c13db1a9ca63d241f4907d6c89
Gerrit-Change-Number: 14154
Gerrit-PatchSet: 7
Gerrit-Owner: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Comment-Date: Mon, 21 Oct 2019 13:05:43 +0000
Gerrit-HasComments: Yes

Reply via email to