Yuqi Du has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18454 )

Change subject: [master] KUDU-3390 support auto rebalance tablet leaders across 
TServers
......................................................................


Patch Set 34:

(27 comments)

I am glad to receive lots of crs.
Thank you very much.
I have fixed them and you can review them again.

http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc
File src/kudu/master/auto_leader_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@60
PS32, Line 60: (auto_rebalancing_interval_seconds);
> Is this needed here?
Remove it.
And some DECLARE_* and METRIC_DECLARE* below also not used, they should be 
removed.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@62
PS32, Line 62: (auto_rebalancing_wait_for
> And the same for several other declarations of gflags in this section: are
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@70
PS32, Line 70: class LeaderRebalancerTest : public KuduTest {
             :  public:
             :   Status CreateAndStartCluster() {
> Are these needed in this file?
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@139
PS32, Line 139: rs;
> How do we know that the distribution of the leader replicas is balanced?  I
Yes, the describe is not correct, I will fix it.
when create tables, the table's tablet replicas and leaders is decided by 
algorithms of assign tablets.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@150
PS32, Line 150:
> How do we know at this point that the cluster is leader-balanced?
That's default status, leader not balanced. I fix the comments.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@168
PS32, Line 168:     SleepFor(MonoDelta::FromMillisecond
> That's a really long sleep time: that's not a good idea to have something l
ok


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@169
PS32, Line 169:
> I'm curious why is the number of retries so high?  Is it possible to make t
I set a small value for each iteration to cover more branches, eg:  
auto_leader_balancer.cc:296


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer-test.cc@175
PS32, Line 175:   const int kNumTServers = 4;
> Why is it necessary to have a pause here?  Would be great to add a comment
I looks the sleep is not necessary, but I try to remove it the test is failed.
I think master's tablets meta about roles, distributed is not realtime, it need 
wait heartbeat reports.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.h
File src/kudu/master/auto_leader_rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.h@45
PS32, Line 45: // A CatalogManager background task which auto-rebalances 
tablets' leaders distribution.
> nit: remove this period
I have not understand this.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.h@46
PS32, Line 46: by transferring leadership
> by transferring leadership between tablet replicas
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.h@95
PS32, Line 95: // send rpc messages for ’LeaderStepDown‘
> nit: please document this member as well -- what is it for?
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc
File src/kudu/master/auto_leader_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@55
PS32, Line 55: using kudu::consensus::ConsensusServiceProxy;
             : using kudu::consensus::LeaderStepDownMode;
             : using kudu::consensus::LeaderStepDownRequestPB;
             : using kudu::consensus::LeaderStepDownResponsePB;
             : using kudu::consensus::RaftPeerPB;
             : using kudu::rpc::MessengerBuilder;
             : using kudu::rpc::RpcController;
             : using std::map;
             : using std::nullopt;
             : using std::string;
             : using std::vector;
             : using strings::Substitute;
             :
             : DEFINE_uint32(auto_leader_
> nit: remove empty lines and sort the rest of non-empty lines alphabetically
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@293
PS32, Line 293:
> What's "synchronized rpc"?
I mean the thread would wait the response synchronously.
I fix the comments.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@293
PS32, Line 293: = 0;
> very?
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@299
PS32, Line 299: break;
> I don't think this sort of information is worth it having in the INFO log:
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@304
PS32, Line 304: reques
> nit: could this be 'const string&' or 'const auto&' to avoid copying?
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@330
PS32, Line 330:
> Drop the namespace prefix since there is corresponding 'using' using string
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/auto_leader_rebalancer.cc@337
PS32, Line 337:
> What's VLOG(0), and why not to use VLOG(1) for this?
The log can show information about the leader rebalance runs simply and 
directly.
The log print only 1 record every period (default 1 hour), I think it's a 
necessary for INFO currently, In the future, I think we can add some metrics 
and remove the log.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@353
PS32, Line 353: automatic leader rebala
> automatic leader rebalancing
The style like line 346, whether it's necessary to change it ?

I change it firstly.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1062
PS32, Line 1062: depend
> depends
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1063
PS32, Line 1063: auto-rebalancing
> auto-rebalancing
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1063
PS32, Line 1063: rebalanci
> rebalancing
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1064
PS32, Line 1064:  kudu
> cannot
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1064
PS32, Line 1064: the effect of
> What is 'load rebalance'?
the effect of leader rebalancing, I fix the words.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1067
PS32, Line 1067: isabled,
> ?
I did not know "?", maybe I should rewrite the comments, I try to do this.


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1067
PS32, Line 1067: ven if a
> because
Done


http://gerrit.cloudera.org:8080/#/c/18454/32/src/kudu/master/catalog_manager.cc@1067
PS32, Line 1067: ries to
> followers?
Done



--
To view, visit http://gerrit.cloudera.org:8080/18454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibfb60d8759a93b6a19238637c27df4f6b1cac918
Gerrit-Change-Number: 18454
Gerrit-PatchSet: 34
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Fri, 07 Oct 2022 08:38:15 +0000
Gerrit-HasComments: Yes

Reply via email to