Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/11207 )
Change subject: [location_awareness] replica selection honors placement policy ...................................................................... Patch Set 7: (20 comments) http://gerrit.cloudera.org:8080/#/c/11207/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11207/7//COMMIT_MSG@16 PS7, Line 16: avaialable > available Done http://gerrit.cloudera.org:8080/#/c/11207/7//COMMIT_MSG@17 PS7, Line 17: load > replicas Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.h@878 PS7, Line 878: Select required number tablet servers to place replicas for the given newly : // created table > Howabout "Select the tablet servers where the given newly-created tablet's Woops, indeed that was a strange wording. It seems I wrote that while heaving that headache last Friday, sorry. Updated. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.h@880 PS7, Line 880: is > are Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.h@882 PS7, Line 882: The > nit: Remove Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.cc@4336 PS7, Line 4336: RETURN_NOT_OK(policy.PlaceTabletReplicas(nreplicas, &descriptors)); > Can we add the tablet ID and table name to the status returned here? Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/catalog_manager.cc@4345 PS7, Line 4345: // TODO(unknown): this is temporary, we will use only UUIDs > nit: I don't think this is temporary anymore :). If there's no JIRA related Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.h File src/kudu/master/placement_policy.h: http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.h@41 PS7, Line 41: the > Remove Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.h@73 PS7, Line 73: of > Remove Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc File src/kudu/master/placement_policy.cc: http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@55 PS7, Line 55: a > the Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@62 PS7, Line 62: deltas: information location of replicas that not yet placed > Stale? Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@63 PS7, Line 63: ltd > Out of order w.r.t the function parameters. Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@74 PS7, Line 74: auto num_live_replicas = accumulate( : ts_descriptors.begin(), ts_descriptors.end(), 0, : [](int val, const shared_ptr<TSDescriptor>& desc) { : return val + desc->num_live_replicas(); : }); : const auto liit = locations_info.find(location); : if (liit != locations_info.end()) { : num_live_replicas += liit->second; : } > Is this computing R then adding R to it, so we return 2x the load? Nope, that's computing R for already existing replicas, and then adding number of replicas just slated for placement, but not instantiated yet. I'll add a comment to clarify on that. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@107 PS7, Line 107: > > Should we (D?)CHECK that > doesn't happen? That's a good idea; added CHECK_LE(). http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@115 PS7, Line 115: for > Remove Done http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@115 PS7, Line 115: to > of Done. I really need to re-read my writings at least twice. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@125 PS7, Line 125: if (location_per_load.empty()) { : // If placing an additional replica to any location will make it containing : // the majority of replilcas of the same tablet, place the replica into : // one of the less loaded locations (see below). : location_per_load.swap(location_per_load_majority); : } > If this happens and the RF is 2k, then each location must have k replicas, That's right, but there is a case when it's not enough tablet serves to place the requested number of replicas. The way how it's written is a convenient way to avoid placing a majority of replicas at the same location, even if the load-based criterion would favor that placement. It seems some comments were wrong/misleading. I updated the comments and added DCHECKS() to make it clearer. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@147 PS7, Line 147: if (rng) > We don't require rng to be non-null? Elsewhere you required it to outlive t Yep, that's right: no use cases without RNG so far. Done. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@141 PS7, Line 141: if (distance < 2) { : return it->second; : } : : // In case if multiple location candidates, select a random one : // if RNG is available. : if (rng) { : std::advance(it, rng->Uniform(distance)); : } > This can be handled uniformly, right? I tried to avoid calling an extra function where it's not necessary. OK, let's choose better readability instead. http://gerrit.cloudera.org:8080/#/c/11207/7/src/kudu/master/placement_policy.cc@241 PS7, Line 241: //auto loc = SelectLocation(all_locations, {}, ltd_, location_info, rng_); > Leftover bit of code. Done -- To view, visit http://gerrit.cloudera.org:8080/11207 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4169098abf17d5591d4c1675561afc15b5477fcd Gerrit-Change-Number: 11207 Gerrit-PatchSet: 7 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Thu, 06 Sep 2018 00:44:54 +0000 Gerrit-HasComments: Yes
