Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21073 )
Change subject: Remove the replace flag if the move fails in auto rebalancing. ...................................................................... Patch Set 12: (8 comments) http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer-test.cc File src/kudu/master/auto_rebalancer-test.cc: http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer-test.cc@823 PS12, Line 823: rebalance nit: rebalancing http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer-test.cc@855 PS12, Line 855: style nit for here and below: stick the ampersand to the type, not the variable name http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.h File src/kudu/master/auto_rebalancer.h: http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.h@133 PS12, Line 133: // Finds replicas that are specified in 'replica_moves' and make requests : // to have them moved in order to rebalance the cluster. Please update this in-line doc to reflect the semantics of the changes in this patch. http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.cc File src/kudu/master/auto_rebalancer.cc: http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.cc@141 PS12, Line 141: DEFINE_bool(auto_rebalancing_fail_moves_for_test, false, Consider tagging this new flag with the 'unsafe' tag. http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.cc@142 PS12, Line 142: nit: wrong indent http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.cc@698 PS12, Line 698: LOG(WARNING) << Substitute("Could not move replica and the replace flag was $0removed: $1", : is_flag_cancelled ? "" : "not ", s.ToString()); The message is now quite confusing. Please separate two logs: logging about the failed move and success/failure of clearing the 'REPLACE' flag. http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/master/auto_rebalancer.cc@724 PS12, Line 724: Failing the check for test. nit: injected artificial test failure http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/rebalance/rebalancer.h File src/kudu/rebalance/rebalancer.h: http://gerrit.cloudera.org:8080/#/c/21073/12/src/kudu/rebalance/rebalancer.h@158 PS12, Line 158: HostPort leader_hp; : std::string leader_uuid; Why to store this information at all? This smells not quite good: * a tablet leader might change since this structure has been populated * in most cases, this is just a waste because the majority of replicas movements usually succeed Is it possible to get the information on the Raft configuration for a particular tablet on-the-fly, upon trying to remove the REPLACE attribute for a particular replica? -- To view, visit http://gerrit.cloudera.org:8080/21073 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I99dafa654878b9d6d8162d84500913ae0655692b Gerrit-Change-Number: 21073 Gerrit-PatchSet: 12 Gerrit-Owner: Song Jiacheng <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Song Jiacheng <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 27 Mar 2024 03:50:52 +0000 Gerrit-HasComments: Yes
