Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: [WIP] IMPALA-11604 Planner changes for CPU usage ...................................................................... Patch Set 10: (7 comments) Good start. I think the following changes are necessary and worth spending extra time on: 1) Fragment cost should conceptually be an amount of time required to process the estimated data for the fragment. That way, we can divide by cores available and factor consider SLA times through direct calculations. Any functions and variables should probably naming like CPUCost or ProcessingCost. 2) It's probably worth spending time making a new set of Expr costing functions instead of multiplying the EXPR constants by the row width. Comments in Expr.java state that those constants are not suitable for absolute cost computation. It should be reasonable to add a mechanism to do actual time measurements for Expr evaluation during init and use measurements to compute accurate time costs. http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Frontend.thrift@752 PS10, Line 752: 5: optional i64 max_data_processed_limit It's probably best to stick with vcores here as what the executor group provides. The coordinator can work backwards from the query costing to determine vcore sizing for a query. http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/19033/10/common/thrift/Query.thrift@873 PS10, Line 873: 13: optional i64 data_processed_bytes; Vcores here also. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@727 PS10, Line 727: * getIntermediateTupleDesc().getByteSize()) / Math.max(numInstances, 1); Don't divide by numInstances in these functions. Instead accumulate the fragment cost then divide by a (cost/core) number to arrive at a number of cores required. SLA can also be factored in later for to fragment That way. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@627 PS10, Line 627: .setDataProcessedBytes( Use (abstract) processingCost naming instead of processedBytes. http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@362 PS10, Line 362: public long computeDataProcessedBytes() { rename all of these to computeProcessingCost or similar http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@308 PS10, Line 308: float eualJoinPredicateEvalCost = eqJoinPredicateEvalCost http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java File fe/src/main/java/org/apache/impala/util/ExprUtil.java: http://gerrit.cloudera.org:8080/#/c/19033/10/fe/src/main/java/org/apache/impala/util/ExprUtil.java@109 PS10, Line 109: public static float computeConjuctsTotalCost(List<Expr> conjuncts) { Use List<? extends Expr> to share the same function. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 10 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Wed, 28 Sep 2022 19:17:43 +0000 Gerrit-HasComments: Yes
