Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: IMPALA-11604 Planner changes for CPU usage ...................................................................... Patch Set 17: (10 comments) Address review comments and compute the best and the worst processing cost and use the geo mean of the two as the query processing cost. http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift@740 PS16, Line 740: select > nit: selecting Done http://gerrit.cloudera.org:8080/#/c/19033/16/common/thrift/Frontend.thrift@750 PS16, Line 750: amount of data > What's the unit? in bytes? Yes. The comment is updated. http://gerrit.cloudera.org:8080/#/c/19033/12/fe/src/main/java/org/apache/impala/analysis/SortInfo.java File fe/src/main/java/org/apache/impala/analysis/SortInfo.java: http://gerrit.cloudera.org:8080/#/c/19033/12/fe/src/main/java/org/apache/impala/analysis/SortInfo.java@323 PS12, Line 323: return result; > Still not comfortable with the division (by instances/cores) here. This is Encapsulated processing cost in a new class ProcessingCost which contains the totalCost and numInstance. With that, the sum of two ProcessingCost objects will take the max of numInstances of the two assuming the instance/thread can be reused. The per instance value is only used to establish an order among two ProcessingCost objects. The final processing cost is the geo mean of the best and worst processing cost. http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@311 PS16, Line 311: ess > processing Done http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java File fe/src/main/java/org/apache/impala/planner/ResourceProfile.java: http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/ResourceProfile.java@137 PS16, Line 137: put.a > Processing cost is already reported. What's the TODO? Sorry this change is enabled only for debug on my R&D box and should be removed. Done. http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/SubplanNode.java File fe/src/main/java/org/apache/impala/planner/SubplanNode.java: http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/planner/SubplanNode.java@106 PS16, Line 106: blic void computeNodeResourceProfile(TQueryOptions queryOptions) { : ProcessingCost processingCost = AutoScaleUtil.computeAndLogProcessingCost( : getDisplayLabel(), cardinality_, 0, getAvgRowSize(), numInstances_); : // TODO: add an estimate : nodeResourceProfile_ = ResourceProfile.noReservation(0, processingCost); : } : > Call AutoScaleUtil.computeAndLogProcessingCost() Done http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1728 PS16, Line 1728: > processing cost Done http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1799 PS16, Line 1799: urr_num_exe > Why set this value? This is to setup up a pseudo executor group for testing only. http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/service/Frontend.java@1846 PS16, Line 1846: lConfig poolConfig = > Why set the same value as query memory limit? It is to be replaced by the new API that provides the processing cost limit. http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java File fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java: http://gerrit.cloudera.org:8080/#/c/19033/16/fe/src/main/java/org/apache/impala/util/AutoScaleUtil.java@39 PS16, Line 39: : public static ProcessingCost computeAndLogProcessingCost(String label, long cardinality, : float exprsCost, float avgRowSize, int numInstances) { : if (LOG.isTraceEnabled()) { : LOG.trace(getProcessingCostComputationDetails( > This function call computeProcessingCost() twice. Could we reorganize the c Done -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 17 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Tue, 04 Oct 2022 18:01:49 +0000 Gerrit-HasComments: Yes
