Marcel Kornacker has posted comments on this change. Change subject: IMPALA-2979: Fix scheduling on remote hosts ......................................................................
Patch Set 25: (20 comments) the header file looks good structurally http://gerrit.cloudera.org:8080/#/c/2200/24/be/src/scheduling/simple-scheduler.h File be/src/scheduling/simple-scheduler.h: Line 380: const TResourceBrokerReservationResponse& reservation, Coordinator* coord); > Done. Being a class member method it will also affect member variables of t the comments needs to describe the externally visible behavior, which includes side effects. as part of that, you should point out which class variables get updated (unless the answer is 'most of them'). http://gerrit.cloudera.org:8080/#/c/2200/25/be/src/scheduling/simple-scheduler.h File be/src/scheduling/simple-scheduler.h: Line 91: /// List of backends. no need to repeat/paraphrase type name Line 112: /// Construct maps from list of backends. config Line 137: /// Internal structure to track scan range assignments for each backend and prioritize "for a backend". remove comment about prioritization, this structure doesn't do the prioritization. Line 141: /// The number of bytes assigned to a backend host. Part of the key of backend or backend host? Line 142: /// AddressableAssignmentHeap. This gets initialized when elements are inserted into don't comment on other data structures here Line 164: /// will make the top() element of the heap be the backend with the least number of "lowest number" Line 174: /// update their key. For each plan node we create a new heap, so they are not shared the last sentence should go into the structure that contains an object of this class Line 199: /// member in AssignmentCtx. move into AssignmentCtx? is it used outside of that? Line 222: /// data_locations: List of candidate replica IP addresses, from which the block redundant w/ earlier part of comment Line 224: /// break_ties_by_rank: Whether to consider the random rank during backend selection. same here Line 229: /// those will be preferred. Otherwise one is selected from the assignment heap. describe the externally observable behavior (what are the characteristics of the selected backend), not the implementation ("from assignment heap") Line 233: /// pointer. This assumes that a returned backend will also be assigned to. The caller "increment the internal pointer" isn't meaningful Line 242: const TScanRangeLocations& scan_range_locations, role of this param? Line 252: /// Used to lookup hostnames to IP addresses and IP addresses to backend. use verb, not noun Line 291: /// The scheduler's backend maps. When receiving changes to the backend configuration replace use of term 'backend maps' throughout Line 292: /// from the statestore we will make a copy of those maps, apply the updates to the copy this is fairly central to how this class functions, and should be pointed out in the class comment Line 424: /// assigned work is picked, thus aiming to distribute the work as even as possible. as evenly Line 426: /// Finally scan ranges are considered, which do not have an impalad backend running on remove comma Line 446: /// amount of assigned work), then the first one will be picked deterministically. only true for disk replicas, no? -- To view, visit http://gerrit.cloudera.org:8080/2200 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I044f83806fcde820fcb38047cf6b8e780d803858 Gerrit-PatchSet: 25 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Lars Volker <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: anujphadke <[email protected]> Gerrit-HasComments: Yes
