Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/10336 )
Change subject: WIP [rebalancing] high-level rebalancing algorithm ...................................................................... Patch Set 3: (55 comments) Thank you for the thorough review! http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc File src/kudu/tools/rebalance-algo-test.cc: http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@194 PS3, Line 194: { : { "0", "1", }, : { : { "A", { 1, 1, } }, : { "B", { 1, 2, } }, : }, : }, > Missing a comment on this one. It seems unnecessary given the test case imm The case below has 0 cluster-wise skew, actually. I added a comment. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@221 PS3, Line 221: and > nit: Extra word. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@222 PS3, Line 222: at > nit: on. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@262 PS3, Line 262: > nit: , Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263 PS3, Line 263: achive > nit: achieve. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@263 PS3, Line 263: balanced state > nit: balance. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@264 PS3, Line 264: Tablet > nit: Table. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@272 PS3, Line 272: "B" > It'd be better if it weren't deterministic which table it uses for the move Yeah, that's the next step, I think. The idea is to specify the moves in some order, but during the verification phase make it loose in terms of order if some additional flag is set for TestClusterConfig instance. For first stages I want to be sure I understand how the algorithm works, so I'd like to keep the order checked as well. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@282 PS3, Line 282: { { "D", "1", "0" }, } > Aside: we'll have to think about single-replica tables. Does 3-4-3 (1-2-1?) Good question: AFAIK the 3-4-3 scheme should work for tables of replication factor 1 while moving replicas around. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@321 PS3, Line 321: { : { "0", "1", }, : { : { "A", { 1, 2, } }, : { "B", { 2, 0, } }, : { "C", { 2, 1, } }, : }, : { : { "B", "0", "1" }, : } : }, > 0 has 5 total replicas; 1 has 3, so it doesn't fit in this test. Good catch! Moved it into the appropriate scenario and updated the typo in this one to make it fit in here. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo-test.cc@483 PS3, Line 483: > Coverage is good but we could also use a randomized test. I could add it in For the randomized test it's a bit harder to specify the sequence of moves to be in the necessary order (basically, it's about writing 'reference' balancing algorithm). However, I think it's easy to compute the reference number of move operations and making sure the algorithm converges, outputting with a balanced result. I think I can add that. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h File src/kudu/tools/rebalance-algo.h: http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@33 PS3, Line 33: Information on t > nit: Howabout just "Balance information for a table." Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@34 PS3, Line 34: TableReplicasInfo > nit: I like TableBalanceInfo. It's parallel with ClusterBalanceInfo below. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@35 PS3, Line 35: // Table's unique identifier. > nit: I think this comment isn't needed. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@38 PS3, Line 38: Total number of tablet replicas of the table per tablet server. > nit: It's the reverse of this relationship. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41 PS3, Line 41: table_replica_num_by_server > This name's confusing because it's the reverse of the actual relation, e.g. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@41 PS3, Line 41: size_t > nit: The GSG's rule is to use signed integer types for numbers. See https:/ Yeah, GSG has some strange conventions by my understanding -- it seems it was written by java people. :) But since we stick to it, sure -- let's use those. As a side note, I think that using signed integers here would result in necessity to add checks for the non-negativity of the values, so instead of checks for bad arithmetic you need to add checks for wrong values. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48 PS3, Line 48: replicas_info > The values in this map represent tables, so I think a name like table_balan Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@48 PS3, Line 48: size_t > nit: Same. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@50 PS3, Line 50: Total number of replicas of all tables per tablet server. > nit: Same as comment on L38. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53 PS3, Line 53: replica_num_by_server > nit: Same naming nit as L41. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@53 PS3, Line 53: size_t > nit: Same. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@55 PS3, Line 55: ToString > Would you mind putting a comment here explaining what the string looks like Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58 PS3, Line 58: the > nit: Extra "the". Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@58 PS3, Line 58: information information > Repeated word. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@59 PS3, Line 59: The algorithm does not : // differentiate such entities as tablets > After our conversation yesterday I understand what you mean here, but I thi Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@64 PS3, Line 64: // Unique identifier of the table. > I think this comment is redundant. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@65 PS3, Line 65: Unique identifier > nit: I'd say UUID instead of "unique identifier" because I "UUID" always me Yeah, that's right -- the algorithm can consume any form of unique identifiers. Whether it's UUID or some other type of string identifiers, it does not matter (i.e. stringified numbers will do as well). That said, I'd better keep 'unique identifier' instead of UUID. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@66 PS3, Line 66: Unique identifier > nit: Same as above. same http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79 PS3, Line 79: The method puts the results into the 'moves' output parameter > How would a caller know when to stop calling GetNextMoves? Is there a guara I think the notation on the already-balanced configuration should be the same for all algorithms. I added corresponding comment about that. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@79 PS3, Line 79: into the 'moves' output parameter > nit: Can you make it clear that 'moves' is cleared and filled with the resu Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@81 PS3, Line 81: size_t > nit: Same as L41. I'd like to keep it as size_t to reflect the type of the std::vector<>::size_type parameter. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.h@90 PS3, Line 90: > nit: Insert "an". Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc File src/kudu/tools/rebalance-algo.cc: http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@55 PS3, Line 55: replica_num_by_server > nit: Remember to change this if/when you change the name of the struct fiel Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@57 PS3, Line 57: s << " " << elem.first << ":" << elem.second; > If this is for debug purposes only then it's fine however it's easiest for Yep, I consider this as a draft approach to output information on the replica balance information for a cluster. Sure, that should be updated to be more comprehensive and pretty-printed. Maybe, I will work on that during the next phase (i.e. while implementing the move engine based on ksck info in the kudu CLI tool). http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@60 PS3, Line 60: replicas_info > Likewise. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@77 PS3, Line 77: ClusterBalanceInfo info(cluster_info); > nit: Can you add a comment explaining why we copy? IIUC, it's so we can gen Yep -- I changed the signature to pass the cluster_info parameter by value and added the comment. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@95 PS3, Line 95: { > nit: Add a comment explaining the purpose of this block. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@101 PS3, Line 101: auto it = replica_num_by_server.begin() > Maybe we need to keep both the mapping ts uuid -> count and count -> ts uui Yep, that would be a good alternative as well. I think iterating over the container like this is not a big deal since the number of tablet servers should not be huge. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@115 PS3, Line 115: it = replica_num_by_server.erase(it); > This effectively advances the iterator, right? Yes, this erases the element and updates the iterator to point to the next element. I added comment. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@133 PS3, Line 133: auto it = replicas_info.begin(); it != replicas_info.end() > Likewise I think we could use a "two-way mapping" class to unify much of t Good idea. I think I can come up with an update in the next version :) http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@175 PS3, Line 175: instert > nit: extra 't'. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@181 PS3, Line 181: DCHECK_GE(max_count, min_count) > Since table_info.table_replica_num_by_server is a multimap, isn't this just Yep, that and also testing for typos, if any. It's also important to do so because the skew in that version was of unsigned type. Besides, one day the ordering of the elements might be changed because of update on the comparison operator for the container. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@190 PS3, Line 190: TwoDimensionalGreedyAlgo::GetNextMove > Would you add a short comment on the different major parts of the algorithm Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@204 PS3, Line 204: 2 > nit: I think this could be 1. 1 is typical in tools and this should only fi Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@210 PS3, Line 210: 3 > Could be 1 I think. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@211 PS3, Line 211: 3 > Same. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@213 PS3, Line 213: // The distribution of replicas for this table is already balanced. However, : // it's necessary to check for the opportunity to balance the replicas : // cluster-wise. > I think the code is right here but the comment is wrong/stale. The table is Good catch! Indeed, the comment was stale. http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@224 PS3, Line 224: DCHECK(!max_loaded.empty()); > Maybe we ought to check if cnt_by_server is empty earlier. It's a great idea! http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@227 PS3, Line 227: !max_loaded.empty() > This is unnecessary given the DCHECK. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@236 PS3, Line 236: sort(max_loaded.begin(), max_loaded.end()); : sort(max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end()); : set_intersection( : max_loaded.begin(), max_loaded.end(), : max_loaded_cluster_wise.begin(), max_loaded_cluster_wise.end(), : back_inserter(max_loaded_intersection)); > We can probably separate this out as a function or something since it's re- Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@249 PS3, Line 249: DCHECK(!min_loaded.empty()); > Likewise seems unnecessary if we check cnt_by_server is nonempty. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@252 PS3, Line 252: !min_loaded.empty() > Redundant. Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@277 PS3, Line 277: 3 > 1 Done http://gerrit.cloudera.org:8080/#/c/10336/3/src/kudu/tools/rebalance-algo.cc@280 PS3, Line 280: 3 > 1 Done -- To view, visit http://gerrit.cloudera.org:8080/10336 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a8050ee79117a2ae2f6f88740ed25656946cfb4 Gerrit-Change-Number: 10336 Gerrit-PatchSet: 3 Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-Comment-Date: Thu, 10 May 2018 03:27:28 +0000 Gerrit-HasComments: Yes