Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: IMPALA-11604 Planner changes for CPU usage ...................................................................... Patch Set 43: (10 comments) http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h File be/src/scheduling/scheduler.h: http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h@441 PS42, Line 441: TPlanFragment.effective_ > nit: TPlanFragment.effective_instance_count. Otherwise, it's hard to find c Done http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc File be/src/util/backend-gflag-util.cc: http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@206 PS42, Line 206: 1 > These are relative costs, right? "1" is the minimum cost? Yes, relative cost. 1 is the minimum. http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@207 PS42, Line 207: p > nit: don't break line here Done http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@211 PS42, Line 211: um processing cost that a frag > what's the normal range? why set default value as 1000000? This is probably subject to tuning. The original intent is to assign "minimum work" per thread, that is 1 million rows per thread/fragment instance. This "min_processing_cost_per_thread" is roughly scale linearly towards num rows per thread, but with the weight factored in as well (C & M explained in the commit message). I did tests against tpcds_3000_parquet (10 nodes, mt_dop=12) and this config value seem to work fine for most queries. Reducing this value will cause planner to schedule more fragment instance, while increasing it will reduce fragment instance. http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@767 PS42, Line 767: Allow CPU costing algorithm to schedule fragment instance count higher > The comment is different from the meaning of query option Fixed. http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@771 PS42, Line 771: q > Why the max value is 64? This follow the scale of MT_DOP. http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java: http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47 PS42, Line 47: childProcessingCost_ > Could we move this Precondition.checkState to constructor? I look the code again and think that multiple_ is redundant with numInstanceSupplier_ from the base class. I refactored the class accordingly and move the Precondition to ProcessingCost.setNumInstanceExpected(). http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java File fe/src/main/java/org/apache/impala/planner/DataSink.java: http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java@68 PS42, Line 68: checkState(proc > Should we add Preconditions to check if processingCost_ is valid? Done http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java: http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@265 PS42, Line 265: per row. > per row? Done http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java File fe/src/main/java/org/apache/impala/util/ExprUtil.java: http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java@119 PS42, Line 119: e.getCost() : 1 > What's cost range? Is the value 1 the minimum value? It is between 1 to 10. IMPALA-2805 add these costs at https://github.com/apache/impala/blob/40da36414ff4d46b5cdc53f068b1f0a5b28a0f1d/fe/src/main/java/org/apache/impala/analysis/Expr.java#L79-L94 -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 43 Gerrit-Owner: Qifan Chen <qfc...@hotmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qfc...@hotmail.com> Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com> Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com> Gerrit-Comment-Date: Mon, 06 Feb 2023 22:56:32 +0000 Gerrit-HasComments: Yes