Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
......................................................................


Patch Set 17:

(10 comments)

Address review comments and compute the best and the worst processing cost and 
use the geo mean of the two as the query processing cost.

http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift@740
PS16, Line 740: select
> nit: selecting
Done


http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift@750
PS16, Line 750: amount of data
> What's the unit? in bytes?
Yes. The comment is updated.


http://gerrit.cloudera.org:8080/#/c/19033/12/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

http://gerrit.cloudera.org:8080/#/c/19033/12/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@323
PS12, Line 323:     return result;
> Still not comfortable with the division (by instances/cores) here. This is
Encapsulated processing cost in a new class ProcessingCost which contains the 
totalCost and numInstance.

With that, the sum of two ProcessingCost objects will take the max of 
numInstances of the two assuming the instance/thread can be reused.

The per instance value is only used to establish an order among two 
ProcessingCost objects. The final processing cost is the geo mean of the best 
and worst processing cost.


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@311
PS16, Line 311: ess
> processing
Done


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java
File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java:

http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java@137
PS16, Line 137: put.a
> Processing cost is already reported. What's the TODO?
Sorry this change is enabled only for debug on my R&D box and should be removed.

Done.


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/SubplanNode.java
File fe/src/main/java/org/apache/impala/planner/SubplanNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/SubplanNode.java@106
PS16, Line 106: blic void computeNodeResourceProfile(TQueryOptions 
queryOptions) {
              :     ProcessingCost processingCost = 
AutoScaleUtil.computeAndLogProcessingCost(
              :         getDisplayLabel(), cardinality_, 0, getAvgRowSize(), 
numInstances_);
              :     // TODO: add an estimate
              :     nodeResourceProfile_ = ResourceProfile.noReservation(0, 
processingCost);
              :   }
              :
> Call AutoScaleUtil.computeAndLogProcessingCost()
Done


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1728
PS16, Line 1728:
> processing cost
Done


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1799
PS16, Line 1799: urr_num_exe
> Why set this value?
This is to setup up a pseudo executor group for testing only.


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1846
PS16, Line 1846: lConfig poolConfig =
> Why set the same value as query memory limit?
It is to be replaced by the new API that provides the processing cost limit.


http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java
File fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java:

http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java@39
PS16, Line 39:
             :   public static ProcessingCost 
computeAndLogProcessingCost(String label, long cardinality,
             :       float exprsCost, float avgRowSize, int numInstances) {
             :     if (LOG.isTraceEnabled()) {
             :       LOG.trace(getProcessingCostComputationDetails(
> This function call computeProcessingCost() twice. Could we reorganize the c
Done



--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 17
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Tue, 04 Oct 2022 18:01:49 +0000
Gerrit-HasComments: Yes

Reply via email to