Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21860 )
Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode cardinality ...................................................................... Patch Set 13: (2 comments) http://gerrit.cloudera.org:8080/#/c/21860/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21860/11//COMMIT_MSG@12 PS11, Line 12: AggregationNode > nit: typo "AggragationNode" -> AggregationNode Done http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java File fe/src/main/java/org/apache/impala/planner/AggregationNode.java: http://gerrit.cloudera.org:8080/#/c/21860/11/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@368 PS11, Line 368: // (in other words, the lowest PlanNode in query plan tree). > Just thinking about a more complex plan where we have nested subqueries or Thank you Aman for the counter example. I craft one planner test under agg-node-max-mem-estimate.test based on your counter example. In DISTRIBUTEDPLAN/PARALLELPLANS, only 03:AGGREGATE and 04:AGGREGATE does post order traversal analysis because they are preaggregation node and has at least 2 expressions that belong to the same tuple (store_sales). The merge-aggregation after them does not redo the num group estimation that already done by the preaggregation node. Indeed, the post order traversal seems at 04:AGGREGATE seems redundant to go all the way to the leaf nodes when 03:AGGREGATE already does that. We can improve this further by tracking tupleIdGenerator_.getNextId() in DescriptorTable.java and take note the associated PlanNode it is assigned to. I choose not to do that immediately in this patch because current way seems reasonable enough for most cases. We can revisit that memoization strategy in the future if other new analysis need them. -- To view, visit http://gerrit.cloudera.org:8080/21860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7 Gerrit-Change-Number: 21860 Gerrit-PatchSet: 13 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Yida Wu <[email protected]> Gerrit-Comment-Date: Thu, 17 Oct 2024 22:14:18 +0000 Gerrit-HasComments: Yes
