Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: IMPALA-11604 Planner changes for CPU usage ...................................................................... Patch Set 46: (7 comments) Thanks a lot Riza for the update to the commit message, in particular the examples which helps me understanding the step 2, 3 and 4 better. I provided an example to emphasize the importance of integrating the logic in step 2, 3 and 4 into the computation of PC(node). Please let me know if this will fly. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@138 PS46, Line 138: a list of ProcessingCost I think a single processingCost (instead of a list of processingCost) may be sufficient for a fragment, to compute effective DoP. As we walk bottom-up, we compute that for each blocking segment from its children and finally that for the entire fragment. Take exchange(sort(join(sort(scan T1), sort(scan T2))) as example, we can compute 1. PC(sort(scan T1)) = (x, 10) - cost of x, 10 cores 2. PC(sort(scan T2)) = (y, 16) - cost of y, 16 cores 3. PC(join) = (x, 10) + (y, 16) = (x+y, 26), if we want the two sort nodes to work in parallel for a merge join. For hash join, we should take the max on # cores to arrive at (z, 16), since the build side is done first, followed by the probing side. 4. PC(sort(join)) = PC(join) since this top sort can not proceed unless the join below produces all the results. That is, there is no need to allocate extra cores for the this top sort. 5. The # of cores is available: 26 for MJ, 16 for HJ for the entire fragment. Note that in this example, the computation of # of cores takes care of blocking nodes and more importantly the semantics of query nodes. As a bonus, we may be able to get rid of step II, III and IV all together. To make it work efficiently, PC(node) can be extended into (C, N, blocking), where the boolean flag <blocking> = 1. true when the subtree rooted at node contains a blocking node 2. false otherwise. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@149 PS46, Line 149: F03:PLAN FRAGMENT [HASH(i_class)] hosts=3 instances=3 : fragment-costs=[34550429, 2159270, 23752870, 1] : 08:TOP-N [LIMIT=100] : | cost=900 : | : 07:ANALYTIC : | cost=23751970 : | : 06:SORT : | cost=2159270 : | : 12:AGGREGATE [FINALIZE] : | cost=34548320 : | : 11:EXCHANGE [HASH(i_class)] : cost=2109 nit. Indentation can help readability. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@185 PS46, Line 185: the query plan tree visiting : PlanFragment from the leaf and going up to the root nit. ,bottom-up, all PlanFragments in the query plan tree. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@190 PS46, Line 190: its adjacent blocking segments (elements in processingCosts_ : list) See my comment on the removal of the list of processingCost above. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@206 PS46, Line 206: F04:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 : PLAN-ROOT SINK : | : 13:MERGING-EXCHANGE [UNPARTITIONED] : | : F03:PLAN FRAGMENT [HASH(i_class)] hosts=3 instances=3 (adjusted from 12) : 08:TOP-N [LIMIT=100] : | : 07:ANALYTIC : | : 06:SORT : | : 12:AGGREGATE [FINALIZE] : | : 11:EXCHANGE [HASH(i_class)] : | : F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=12 : 05:AGGREGATE [STREAMING] : | : 04:HASH JOIN [INNER JOIN, BROADCAST] : | : |--F05:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 : | JOIN BUILD : | | : | 10:EXCHANGE [BROADCAST] : | | : | F02:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 : | 02:SCAN HDFS [tpcds10_parquet.date_dim, RANDOM] : | : 03:HASH JOIN [INNER JOIN, BROADCAST] : | : |--F06:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 : | JOIN BUILD : | | : | 09:EXCHANGE [BROADCAST] : | | : | F01:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 : | 01:SCAN HDFS [tpcds10_parquet.item, RANDOM] : | : 00:SCAN HDFS [tpcds10_parquet.web_sales, RANDOM] nit indentation. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@247 PS46, Line 247: A blocking fragment is a fragment that has a blocking plan node. The : costing algorithm split the query plan tree into several blocking : subtrees divided by blocking fragments. Each blocking subtree has a : blocking fragment as a root and non-blocking fragments as leaves. All : other fragments in the subtree are non-blocking. This part of the para should be merged into the para at line 142. http://gerrit.cloudera.org:8080/#/c/19033/46//COMMIT_MSG@258 PS46, Line 258: This means that all fragments within a : blocking subtree can run in parallel and should be assigned 1 core per : fragment instance This may overestimate when HJ is involved, since the build side and the probe side of the HJ work in sequential. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 46 Gerrit-Owner: Qifan Chen <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Mon, 13 Feb 2023 16:24:01 +0000 Gerrit-HasComments: Yes
