Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13891 )
Change subject: WIP KUDU-2780: create thread for auto-rebalancing (1/n) ...................................................................... Patch Set 4: (7 comments) Left some high-level comments, and didn't look much at all into the copy-pasted functions since the intent is to refactor those away anyway. http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer-test.cc File src/kudu/master/autorebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer-test.cc@55 PS4, Line 55: const int kNumTservers = 3; : const int kNumTablets = 2; : It'd also be worth using a multi-master configuration to see what happens when multiple masters are running your code. http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc File src/kudu/master/autorebalancer.cc: http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@41 PS4, Line 41: #include "kudu/tools/ksck.h" : #include "kudu/tools/ksck_remote.h" : #include "kudu/tools/ksck_results.h" We chatted about this offline; in terms of dependencies and whatnot, it probably makes sense to pull out some of the classes here into the /master or /common, and update /tools to use it, rather than depending on /tools code. http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@81 PS4, Line 81: DEFINE_string(autorebalancing_tables, "", : "Tables to include (comma-separated list of table names)" : "If not specified, includes all tables.");; : : // Renamed from "kudu/tools/tool_action_common.cc" : DEFINE_string(autorebalancing_format, "pretty", : "Format to use for printing list output tables.\n" : "Possible values: pretty, space, tsv, csv, and json"); : : DEFINE_uint32(autorebalancing_interval_seconds, 5, : "How long to sleep in between each check for skew " : "and if possible, performing a rebalancing move."); : TAG_FLAG(autorebalancing_interval_seconds, runtime); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_string(autorebalancing_ignored_tservers, "", : "UUIDs of tablet servers to ignore while rebalancing the cluster " : "(comma-separated list). If specified, allow to run the rebalancing " : "when some tablet servers in 'ignored_tservers' are unhealthy. " : "If not specified, allow to run the rebalancing only when all tablet " : "servers are healthy."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_uint32(autorebalancing_max_moves_per_server, 5, : "Maximum number of replica moves to perform concurrently on one " : "tablet server: 'move from' and 'move to' are counted " : "as separate move operations."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_uint32(autorebalancing_max_staleness_interval_sec, 300, : "Maximum duration of the 'staleness' interval, when the " : "rebalancer cannot make any progress in scheduling new moves and " : "no prior scheduled moves are left, even if re-synchronizing " : "against the cluster's state again and again. Such a staleness " : "usually happens in case of a persistent problem with the " : "cluster or when some unexpected concurrent activity is " : "present (such as automatic recovery of failed replicas, etc.)"); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_int64(autorebalancing_max_run_time_sec, 0, : "Maximum time to run the rebalancing, in seconds. Specifying 0 " : "means not imposing any limit on the rebalancing run time."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_string(autorebalancing_move_single_replicas, "auto", : "Whether to move single replica tablets (i.e. replicas of tablets " : "of replication factor 1). Acceptable values are: " : "'auto', 'enabled', 'disabled'. The value of 'auto' means " : "turn it on/off depending on the replica management scheme " : "and Kudu version."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_bool(autorebalancing_output_replica_distribution_details, false, : "Whether to output details on per-table and per-server " : "replica distribution"); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_bool(autorebalancing_disable_policy_fixer, false, : "In case of multi-location cluster, whether to detect and fix " : "placement policy violations. Fixing placement policy violations " : "involves moving tablet replicas across different locations " : "of the cluster. " : "This setting is applicable to multi-location clusters only."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_bool(autorebalancing_disable_cross_location_rebalancing, false, : "In case of multi-location cluster, whether to move tablet " : "replicas between locations in attempt to spread tablet replicas " : "among location evenly (equalizing loads of locations throughout " : "the cluster). " : "This setting is applicable to multi-location clusters only."); : : // Renamed from "kudu/tools/tool_action_cluster.cc" : DEFINE_bool(autorebalancing_disable_intra_location_rebalancing, false, : "In case of multi-location cluster, whether to rebalance tablet " : "replica distribution within each location. " : "This setting is applicable to multi-location clusters only."); : : // Renamed and modified from "kudu/tools/tool_action_cluster.cc" : DEFINE_double(autorebalancing_load_imbalance_threshold, 1.0, As a part of refactoring this, rather than exposing _all_ of these as flags, we should probably go through each of these, determine whether or not it's important for each to be present in the autorebalancer. http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@212 PS4, Line 212: wake_up_cv_(&lock_) { This is used in the HMS log listener as a means of "waking up" the log listener thread to tell it there's work to do. AFAICT this isn't the case here. We might want to in the future (e.g. if we want a tool to indicate sooner than within a polling interval to start the next move), but for now, we can probably leave this out. If you want to learn more about condition variables, check out the big comment at the top of util/condition_variable.h http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@269 PS4, Line 269: void AutoRebalancerTask::RunLoop() { It's probably worth looking into how the HMS log listener only does work when it's the leader. We'll probably want to do the same here. Specifically, I think the beginning of HmsNotificationLogListenerTask::Poll() would be interesting to peruse. http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@285 PS4, Line 285: Status cluster_status = PrintStats(LOG(INFO)); This is a decent start; moving forward, we should be mindful of what's still in need of implementation. You shared this with us, but I'll post here for posterity, in this loop, we'll want a means to the following: * stop_flag: boolean flag (persisted) set by a stop trigger--rebalancing should stop. Cleared by whatever set it (eg. when re-replication finishes, or the number of locations changes so that the placement policy is satisfiable). * restart_flag: boolean flag (persisted) set by a start trigger--rebalancing should start, if it hasn’t, and next move should be recalculated, if it has. Cleared by auto-rebalancer when cluster is balanced. * cluster_balanced: boolean value (run-time), whether or not the cluster is balanced * get_next_move(): function that returns the best replica move to reduce skew * execute_next_move(): function that executes the best replica move http://gerrit.cloudera.org:8080/#/c/13891/4/src/kudu/master/autorebalancer.cc@791 PS4, Line 791: : Status AutoRebalancerTask::Rebalancer::GetClusterRawInfo(const boost::optional<std::string>& location, : ClusterRawInfo* raw_info) { : RETURN_NOT_OK(RefreshKsckResults()); : return KsckResultsToClusterRawInfo(location, ksck_->results(), raw_info); : } I think this represents a good chunk of work, either refactoring, or doing some code shuffling, to get cluster info without having a ton of code duplication. -- To view, visit http://gerrit.cloudera.org:8080/13891 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibc829d14dc8636287bf3a4faa4ba76e86ab6885c Gerrit-Change-Number: 13891 Gerrit-PatchSet: 4 Gerrit-Owner: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hannah Nguyen <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Comment-Date: Tue, 23 Jul 2019 21:05:49 +0000 Gerrit-HasComments: Yes
