Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/22032 )

Change subject: IMPALA-13086: Lower AggregationNode estimate using stats predicate
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/22032/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/22032/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@417
PS2, Line 417:       // This is done via memo lookup through analyzer.getProducingNode().
It's not clear to me how this reduces estimates in some cases. Is it because we 
can now look deeper in some instances than we could before?


http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test:

http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test@138
PS2, Line 138: |  runtime filters: RF000[bloom] <- customer_id, RF001[min_max] <- customer_id
This has a significant effect on the Q4 plan. How does it affect execution 
performance?


http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test:

http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q11.test@108
PS2, Line 108: |--29:HASH JOIN [INNER JOIN]
Joins are also re-ordered here.


http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test:

http://gerrit.cloudera.org:8080/#/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test@2048
PS2, Line 2048: |  tuple-ids=3 row-size=50B cardinality=1.82K cost=1948896250
Have you done any sanity checks to see whether these new estimates are 
reasonable compared with the actual execution?



--
To view, visit http://gerrit.cloudera.org:8080/22032
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia840d68f1c4f126d4e928461ec5c44545dbf25f8
Gerrit-Change-Number: 22032
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 07 Nov 2024 22:21:30 +0000
Gerrit-HasComments: Yes
