Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21860 )

Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode 
cardinality
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test:

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q98.test@48
PS4, Line 48: |  tuple-ids=5 row-size=214B cardinality=108.00K cost=1050713
> Why are the old and new values so much higher than with cpu cost disabled?
The test with CPU cost disabled reads tpcds_parquet.item with real stats.

This test, with CPU cost enabled, reads injected stats taken from 3TB-scale TPC-DS:
https://github.com/apache/impala/blob/master/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/stats-3TB.json#L299
The injected 3TB-scale stats have 20x more rows for the item table than the 
actual tpcds_partitioned_parquet_snap.date_dim in the minicluster. That is why 
this test has higher cardinality.


http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test:

http://gerrit.cloudera.org:8080/#/c/21860/4/testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test@384
PS4, Line 384: |  row-size=40B cardinality=10
> The actual output of this run is pretty weird
Thanks for noticing this. I think this is a bug in my code: it does not 
account for the NESTED LOOP JOIN [CROSS JOIN].
The post-order traversal probably should not continue when it meets an 
exploding operator like this.
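
To illustrate the idea only, here is a minimal sketch of a post-order walk that treats an exploding operator as a boundary. It is not the patch's code: PlanNode, is_exploding, and post_order_until_exploding are hypothetical names, and the sketch assumes that "should not continue" means the nodes below the cross join are left unvisited.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    # Hypothetical stand-in for a plan node; not Impala's PlanNode class.
    label: str
    children: List["PlanNode"] = field(default_factory=list)
    # Assumption: set for operators that can multiply row counts,
    # such as NESTED LOOP JOIN [CROSS JOIN].
    is_exploding: bool = False


def post_order_until_exploding(node: PlanNode) -> List[str]:
    """Return node labels in post-order, without descending below an
    exploding operator. The exploding node itself is still reported."""
    visited: List[str] = []
    if not node.is_exploding:
        for child in node.children:
            visited.extend(post_order_until_exploding(child))
    visited.append(node.label)
    return visited


# Example plan: an aggregation over a cross join of two scans.
cross_join = PlanNode(
    "NESTED LOOP JOIN [CROSS JOIN]",
    [PlanNode("SCAN a"), PlanNode("SCAN b")],
    is_exploding=True,
)
plan = PlanNode("AGGREGATE", [cross_join])
print(post_order_until_exploding(plan))
```

With the flag set, the walk reports only the join and the aggregation; the two scans under the join are skipped, so their cardinalities would not be used to lower the estimate.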



--
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 03 Oct 2024 18:48:05 +0000
Gerrit-HasComments: Yes
