Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14177 )
Change subject: KUDU-2780: create thread for auto-rebalancing ...................................................................... Patch Set 13: (3 comments) http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer-test.cc@300 PS13, Line 300: } // namespace master After looking into the implementation of AutoRebalancerTask::CheckReplicaMovesCompleted() I think it's worth adding a test scenario to ensure the auto-rebalancer doesn't schedule moving next replica while there are still replicas in move-in-progress state, especially if that induces more than the specified number of tablet copying sessions per a tablet server. Could you add a scenario that would do so? The idea is that inserting a few thousand rows into a tablet and adding a long pause for tablet replica copying will make it easy to spot any cases like that (see --tablet_copy_download_file_inject_latency_ms for details). Basically, create a table with many tablets, RF=3, insert some data there, then add a couple of tablet servers into the test cluster. Then monitor the number of tablet copying sessions using the corresponding metric 'tablet_copy_open_client_sessions' (see TabletCopyOpenClientSessions() method and TabletCopyITest.TestTabletCopyMetrics scenario in tablet_copy-itest.cc for examples). http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer.cc@282 PS13, Line 282: CheckReplicaMovesCompleted > What if the tablet has been dropped along with all its replicas? Will the Ah, after closer look at CheckReplicaMovesCompleted() it's clearly not the case. I was confused by the in-out semantics of the replica_moves parameter. http://gerrit.cloudera.org:8080/#/c/14177/13/src/kudu/master/auto_rebalancer.cc@688 PS13, Line 688: Otherwise This comment doesn't reflect what's going on: as Andrew pointed out, the code below is written to remove it from 'replica_moves' regardless. -- To view, visit http://gerrit.cloudera.org:8080/14177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifca25d1063c07047cf2123e6792b3c7395be20e4 Gerrit-Change-Number: 14177 Gerrit-PatchSet: 13 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 04 Feb 2020 21:25:24 +0000 Gerrit-HasComments: Yes
