Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21860 )
Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode cardinality
......................................................................

Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test:

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test@48
PS4, Line 48: | tuple-ids=5 row-size=214B cardinality=108.00K cost=1050713
> Why are the old and new values so much higher than with cpu cost disabled?
The test with cpu cost disabled reads tpcds_parquet.item with real stats. This test, with cpu cost enabled, reads injected stats taken from 3TB scale TPC-DS.
https://github.com/apache/impala/blob/master/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/stats-3TB.json#L299

The injected 3TB scale stats have 20x more rows for the item table compared to the actual tpcds_partitioned_parquet_snap.date_dim in the minicluster. That is why this test has higher cardinality.

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test:

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test@384
PS4, Line 384: | row-size=40B cardinality=10
> The actual output of this run is pretty weird
Thanks for noticing this. I think this is a bug in my code: it does not notice the NESTED LOOP JOIN [CROSS JOIN]. The post-order traversal probably should not continue when it meets an exploding operator like this.
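
To illustrate the idea in the last comment, here is a minimal sketch of a post-order walk that gives up as soon as it meets an exploding operator. It is not the Impala planner code: the `PlanNode` class, the `exploding` flag, and the `source_scans` helper are hypothetical names made up for this example, and treating "should not continue" as "abandon the analysis for this subtree" is only one possible reading.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not Impala's PlanNode class."""
    label: str
    exploding: bool = False  # assumed marker for operators such as a cross join
    children: List["PlanNode"] = field(default_factory=list)


def source_scans(node: PlanNode) -> Optional[List[str]]:
    """Post-order walk that collects leaf labels below `node`.

    Returns None as soon as an exploding operator is met, signalling that
    the caller should fall back to its default cardinality estimate.
    """
    if node.exploding:
        return None  # stop here: do not look below an exploding operator
    if not node.children:
        return [node.label]
    scans: List[str] = []
    for child in node.children:
        found = source_scans(child)
        if found is None:
            return None  # propagate the bail-out upward
        scans.extend(found)
    return scans


# A plan shaped like the tpch-nested case: aggregation over a cross join.
cross = PlanNode(
    "NESTED LOOP JOIN [CROSS JOIN]",
    exploding=True,
    children=[PlanNode("SCAN a"), PlanNode("SCAN b")],
)
print(source_scans(PlanNode("AGGREGATE", children=[cross])))
```

With this shape, a caller can distinguish "analysis succeeded, here are the inputs" from "analysis abandoned" and keep the original cardinality estimate in the second case.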
--
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 03 Oct 2024 18:48:05 +0000
Gerrit-HasComments: Yes
