Dan Hecht has posted comments on this change.

Change subject: IMPALA-4862: make resource profile consistent with backend 
behaviour
......................................................................


Patch Set 12:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/7223/12/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

PS12, Line 192: have_async_build_thread_token_
do we still need this? i guess the idea is to acquire the thread resource in 
Open() as well? Or is there another reason?


http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:

PS11, Line 111: or
for


Line 119:   /// SendBuildInputToSink is called to allocate resources for this 
ExecNode.
once we do true multithreading, does this go away? since we'll be able to open 
the build child inside of this Open(), right? Maybe leave a todo about that to 
clarify why this is needed.


http://gerrit.cloudera.org:8080/#/c/7223/11/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS11, Line 87:  o
nit extra space


http://gerrit.cloudera.org:8080/#/c/7223/12/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

PS12, Line 395:   // Total of minimum memory reservations for a host in bytes. 
Higher than
              :   // per_host_min_reservation if not all reservations are used 
concurrently.
I can't figure out what this means from just reading this comment.  Are there 
more than one per_host_min_reservation for each host? And why would this be 
higher if not all reservations are used concurrently (seems like that condition 
would mean the needed reservation is lower).

are one of these values related to all the fragment's instances? or are they 
really only about the fragment (i.e. what single instance would consume)?

Also, in the comment before, it says "buffer reservation" but here we say 
"memory reservation" - are these the same?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

PS12, Line 651: build
I guess here you mean build as a noun rather than a verb. i.e. this is not the 
memory used when performing the build. maybe reword to clarify as I had to read 
the code to understand the comment.

Either the next comment is sufficient, or you could reword this to explain the 
case of when no additional resources are needed.

Alternatively, can't we think of this as the resources of this node (but not 
its children) during probing? I.e. call this probePhaseProfile and rename 
probePhaseProfile to execProfile below. (See comment in 
ExecPhaseResourceProfiles about naming).


PS12, Line 654: !BackendConfig.INSTANCE.isPartitionedHashJoinEnabled()
why is that? the old ones don't consume bufferpool reservations.
but also, i think we override this flag for certain build types that aren't 
supported by the old join.


PS12, Line 667: .
... because of the async thread. ?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java
File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java:

PS12, Line 85: joinIds
where is that used?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

PS12, Line 108: plan fragment per host
is that for all instances of this fragment running on a host? or something else?


PS12, Line 139: nodes
isn't that always empty? if so, how about just allocating it here rather than 
taking it as a param?


PS12, Line 222: join build sinks
but what about the resources of the fragment itself?


PS12, Line 233: peakResources
maybe call that fInstanceResources or instanceResources or 
perFInstanceResources, etc.


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

PS12, Line 637: between Open()
is that inclusive of Open()? if so, the name postOpenProfile seems a bit 
misleading. Maybe duringOpenProfile should just be called openProfile and 
postOpenProfile should be execProfile?


PS12, Line 645: computeResourceProfile
computeNodeResourceProfile


PS12, Line 658: child is open
Maybe say "until after the child's Open() returns."  It ultimately means the 
same thing, but it took me a few minutes to follow what this was trying to tell 
me about the equation below.


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 62:   // estimates of zero, even if the contained PlanNodes have estimates 
of zero.
how is this chosen and why do we need it?


PS12, Line 353: Peak
it seems like we're using peak and sum to mean the same thing in this function. 
or is there a subtle distinction?


Line 354:     // Total of per-host minimum reservations across all plan nodes 
and sinks.
why is that a meaningful value?


PS12, Line 361: now that we've populated
              :       // all the profiles in the execution tree below it.
where did that happen?


Line 391:   }
note to self to look at this function again in the next iteration.


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java:

Line 85:   public ResourceProfile multiply(int factor) {
comment


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/SubplanNode.java
File fe/src/main/java/org/apache/impala/planner/SubplanNode.java:

Line 105:     // therefore the peak resource consumption is simply the sum of 
all node resources.
do we have good test coverage of this case?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

PS12, Line 140: Ancestor
what node is this referring to? ancestor of the union? or the union itself?


http://gerrit.cloudera.org:8080/#/c/7223/12/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

Line 1024:     queryCtx.setDisable_spilling(disableSpilling);
not this change, but do we have good test coverage of this w.r.t. reservations 
(that it works as well as before)?


-- 
To view, visit http://gerrit.cloudera.org:8080/7223
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to