Yuqi Du has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19608 )

Change subject: [master] Exclude tservers in MAINTENANCE_MODE when leader 
rebalancing
......................................................................


Patch Set 12:

(10 comments)

Thanks for your advices. All fixed.

http://gerrit.cloudera.org:8080/#/c/19608/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19608/9//COMMIT_MSG@14
PS9, Line 14: For this reason, we should exclude
            : such tservers.
> Right -- maintenance mode is just for masters to avoid placing replicas to
Yes, this patch is just to avoid transferring leadership to replicas hosted at 
tablet servers in maintenance node.


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc
File src/kudu/master/auto_leader_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@214
PS12, Line 214:   const int kNumTServers = 3;
> Since this test scenario involves many iterations and pauses, consider addi
DONE.

I am not sure about what test cases can be set SKIP_IF_SLOW_NOT_ALLOWED, I have 
added it at this test.
Are there some standards or advices?

In this case, I think it should run at DEBUG mode(DEBUG mode would skip slow 
cases), though it runs many iterations and has some 'SleepFor', in general, it 
runs fast enough, but this is my personal feeing.


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@215
PS12, Line 215: kNumTablets
> It is possible to provide the essence of the this test scenario with less n
Yes, you are right.

In this case, it's not need a bigger number, a smaller one is ok.


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@244
PS12, Line 244: not
> is not
Done


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@244
PS12, Line 244: 20 tries 'leader rebalance'
> 20 'leader rebalance' iterations
Done


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@245
PS12, Line 245:   // it's enough to reach leader balanced, more tries is not 
necessary and less tries
              :   // may not reach leader rebalanced.
> Does the number somehow depends on the number of tables in the cluster?  If
Done


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@249
PS12, Line 249: CHECK_OK
> Why CHECK_OK() here instead of ASSERT_OK()?  Why it's so important to crash
Sorry, it should be ASSERT_OK


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@255
PS12, Line 255: a rebalanced state of leaders
> the state of balanced leadership distribution
Done


http://gerrit.cloudera.org:8080/#/c/19608/12/src/kudu/master/auto_leader_rebalancer-test.cc@261
PS12, Line 261: be
> have
Done


http://gerrit.cloudera.org:8080/#/c/19608/9/src/kudu/master/auto_leader_rebalancer.cc
File src/kudu/master/auto_leader_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/19608/9/src/kudu/master/auto_leader_rebalancer.cc@405
PS9, Line 405:     CatalogManager::ScopedLeaderSharedLock 
leader_lock(catalog_manager_);
> Right -- having a provisional check to exclude tablet servers in maintenanc
Yes, I have skimmed over the codes , currently it seems tservers in quiesce 
state may reject the transferring leader request. I'll make sure it, and if so, 
I think we can do nothing about this or just patching an unit test if needed.



--
To view, visit http://gerrit.cloudera.org:8080/19608
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f85a675e69fd02a62e2625881dad2ca5e27acd9
Gerrit-Change-Number: 19608
Gerrit-PatchSet: 12
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: KeDeng <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <[email protected]>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Thu, 06 Apr 2023 04:11:30 +0000
Gerrit-HasComments: Yes

Reply via email to