Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18679 )

Change subject: [master] support turning on/off auto rebalancer at runtime
......................................................................


Patch Set 26:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/18679/26//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18679/26//COMMIT_MSG@16
PS26, Line 16: threadpool
nit: thread pool


http://gerrit.cloudera.org:8080/#/c/18679/26//COMMIT_MSG@16
PS26, Line 16: threadpool
nit: thread pool


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer-test.cc
File src/kudu/master/auto_rebalancer-test.cc:

http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer-test.cc@285
PS26, Line 285: // Make sure turn on/off auto-rebalancing is valid.
How about:

  Make sure the auto-rebalancing can be toggled on/off in runtime.


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer-test.cc@287
PS26, Line 287:   const int kNumMasters = 1;
              :   const int kNumTservers = 3;
              :   const int kNumTablets = 8;
nit: these might be constexpr

Also, since each of these constants are used only once in the code below, maybe 
get rid of them at all?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer-test.cc@303
PS26, Line 303:     // Wait a schedule period.
              :     SleepFor(MonoDelta::FromSeconds(1));
What's the scheduler's period at this point?  It's better to use the actual 
value of the scheduling interval instead of hard-coded value since this might 
become flaky once the default value for the scheduler period changes.


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer-test.cc@307
PS26, Line 307:   int num_iterations = NumLoopIterations(leader_idx);
              :   ASSERT_EVENTUALLY([&] {
              :     ASSERT_EQ(num_iterations, NumLoopIterations(leader_idx));
              :   });
Wouldn't the ASSERT_EQ() be almost always true since it's executed immediately 
after assigning the 'num_iterations' variable at line 307?  Relying on the rare 
case when the scheduler runs right after the assignment and before ASSERT_EQ() 
evaluated is a bit tight -- it could happen that this assert fails only 1/1000 
of all times it runs.


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h
File src/kudu/master/auto_rebalancer.h:

http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h@81
PS26, Line 81: Schedule `Rebalance()` Task asynchronously.
As far as I can see, the task itself is being _scheduled_ synchronously, no?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h@84
PS26, Line 84: the main loop
This isn't a loop anymore, is it?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h@85
PS26, Line 85:   // When leader master and FLAGS_auto_rebalancing_enabled is
             :   // true, invoke `DoReplicaRebalance()` to move replicas if
             :   // replicas is not imbalanced.
How about:

Leader master invokes 'DoReplicaRebalance()' when 
FLAGS_auto_rebalancing_enabled is set to 'true'.


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h@88
PS26, Line 88: Rebalance
Since this class is a task on its own and named AutoRebalancerTask already, 
maybe rename this method into 'Run()'?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.h@90
PS26, Line 90: Replica moves if replicas is not imbalanced.
?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.cc
File src/kudu/master/auto_rebalancer.cc:

http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/auto_rebalancer.cc@194
PS26, Line 194:   return Status::OK();
This looks a bit strange to me: there was a request to run the rebalancing and 
the method reports "OK, it's been scheduled", not nothing has happened in 
reality.  Should this return Status::IllegalState() in such a case instead of 
Status::OK()?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/catalog_manager.cc@158
PS26, Line 158: 100
I don't think it makes sense to have the default scheduling interval this short 
given that --auto_rebalancing_interval_seconds has granularity of one second.  
Maybe, increase the default up to 1000 ms?


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/catalog_manager.cc@159
PS26, Line 159: every schedule_period_ms scheduler pool would check time and 
may run some tasks
Every scheduler_period_ms the scheduler checks for tasks to run


http://gerrit.cloudera.org:8080/#/c/18679/26/src/kudu/master/catalog_manager.cc@161
PS26, Line 161: max thread numbers of scheduler thread pool
Maximum number of threads in the scheduler thread pool



--
To view, visit http://gerrit.cloudera.org:8080/18679
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifbcea6392b783843f89e019f1141dd8905366de5
Gerrit-Change-Number: 18679
Gerrit-PatchSet: 26
Gerrit-Owner: Yuqi Du <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Yuqi Du <[email protected]>
Gerrit-Comment-Date: Fri, 05 Aug 2022 04:15:17 +0000
Gerrit-HasComments: Yes

Reply via email to