Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend behaviour
......................................................................

Patch Set 12:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/7223/12/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

PS12, Line 192: have_async_build_thread_token_
do we still need this? i guess the idea is to acquire the thread resource in Open() as well? Or is there another reason?

http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

PS11, Line 111: or for
Line 119: /// SendBuildInputToSink is called to allocate resources for this ExecNode.
once we do true multithreading, does this go away? since we'll be able to open the build child inside of this Open(), right? Maybe leave a todo about that to clarify why this is needed.

http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS11, Line 87: o
nit extra space

http://gerrit.cloudera.org:8080/#/c/7223/12/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

PS12, Line 395: // Total of minimum memory reservations for a host in bytes. Higher than
              : // per_host_min_reservation if not all reservations are used concurrently.
I can't figure out what this means from just reading this comment. Are there more than one per_host_min_reservation for each host? And why would this be higher if not all reservations are used concurrently (seems like that condition would mean the needed reservation is lower). are one of these values related to all the fragment's instances? or are they really only about the fragment (i.e. what single instance would consume)?

Also, in the comment before, it says "buffer reservation" but here we say "memory reservation" - are these the same?

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

PS12, Line 651: build
I guess here you mean build as a noun rather than a verb. i.e. this is not the memory used when performing the build.
maybe reword to clarify as I had to read the code to understand the comment. Either the next comment is sufficient, or you could reword this to explain the case of when no additional resources are needed.

Alternatively, can't we think of this as the resources of this node (but not its children) during probing? I.e. call this probePhaseProfile and rename probePhaseProfile to execProfile below. (See comment in ExecPhaseResourceProfiles about naming).

PS12, Line 654: !BackendConfig.INSTANCE.isPartitionedHashJoinEnabled()
why is that? the old ones don't consume bufferpool reservations. but also, i think we override this flag for certain build types that aren't supported by the old join.

PS12, Line 667: .
... because of the async thread. ?

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java:

PS12, Line 85: joinIds
where is that used?

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

PS12, Line 108: plan fragment per host
is that for all instances of this fragment running on a host? or something else?

PS12, Line 139: nodes
isn't that always empty? if so, how about just allocating it here rather than taking it as a param?

PS12, Line 222: join build sinks
but what about the resources of the fragment itself?

PS12, Line 233: peakResources
maybe call that fInstanceResources or instanceResources or perFInstanceResources, etc.

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

PS12, Line 637: between Open()
is that inclusive of Open()? if so, the name postOpenProfile seems a bit misleading. Maybe duringOpenProfile should just be called openProfile and postOpenProfile should be execProfile?

PS12, Line 645: computeResourceProfile
computeNodeResourceProfile

PS12, Line 658: child is open
Maybe say "until after the child's Open() returns." It ultimately means the same thing, but it took me a few minutes to follow what this was trying to tell me about the equation below.

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 62: // estimates of zero, even if the contained PlanNodes have estimates of zero.
how is this chosen and why do we need it?

PS12, Line 353: Peak
it seems like we're using peak and sum to mean the same thing in this function. or is there a subtle distinction?

Line 354: // Total of per-host minimum reservations across all plan nodes and sinks.
why is that a meaningful value?

PS12, Line 361: now that we've populated
              : // all the profiles in the execution tree below it.
where did that happen?

Line 391: }
note to self to look at this function again in the next iteration.

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java:

Line 85: public ResourceProfile multiply(int factor) {
comment

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/SubplanNode.java
File fe/src/main/java/org/apache/impala/planner/SubplanNode.java:

Line 105: // therefore the peak resource consumption is simply the sum of all node resources.
do we have good test coverage of this case?

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

PS12, Line 140: Ancestor
what node is this referring to? ancestor of the union? or the union itself?

http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 1024: queryCtx.setDisable_spilling(disableSpilling);
not this change, but do we have good test coverage of this w.r.t. reservations (that it works as well as before)?

--
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
