[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

The first pass of condition estimates was derived from a different database. This commit makes the condition estimates a little more similar to those of the original Impala planner. Using calculations similar to the current Impala planner is better for the first pass because it gives a closer "apples to apples" comparison of the Calcite planner with the original planner. These calculations should be re-examined later, especially once other features such as histogram statistics generation are implemented.

The default for unknown estimates is now taken from Expr.DEFAULT_SELECTIVITY (0.1). It is applied to many different expressions, including comparisons such as >= and <=.

The disjunct condition was kept fairly straightforward and matches the logic in CompoundPredicate.computeSelectivity(). However, debugging showed that, to obtain estimates closer to the original planner, the conjunction condition needs to use the code found in PlanNode.computeCombinedSelectivity().

IMPALA-14867 has been created for "between" selectivity, which has not yet been implemented.

An issue with distinct row counts on filters is also fixed by this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column for which we are trying to find distinct rows.

IMPALA-14640 is also fixed by this commit, which now handles the case where no statistics are provided.

Testing: Some TestCalciteStats tests changed due to this commit, as did some tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Reviewed-on: http://gerrit.cloudera.org:8080/23930
Tested-by: Impala Public Jenkins
Reviewed-by: Michael Smith
---
M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
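
For readers following the conjunction change, here is a minimal sketch of how AND selectivities might be combined with exponential backoff. This is my understanding of the approach in PlanNode.computeCombinedSelectivity(), stated as an assumption, and it is not the code in this commit. The class name, method name, and the use of null for "no estimate" are hypothetical; only the 0.1 default comes from the commit message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only; class and method names are hypothetical. */
final class ConjunctSelectivitySketch {
  // Value quoted in the commit message for Expr.DEFAULT_SELECTIVITY.
  static final double DEFAULT_SELECTIVITY = 0.1;

  /**
   * Combines per-conjunct selectivities for an AND.
   * A null entry means the conjunct has no estimate.
   */
  static double combine(List<Double> conjuncts) {
    List<Double> known = new ArrayList<>();
    boolean anyMissing = false;
    for (Double s : conjuncts) {
      if (s == null) {
        anyMissing = true;
      } else {
        known.add(s);
      }
    }
    // Assumption: all conjuncts without an estimate share one default entry.
    if (anyMissing) known.add(DEFAULT_SELECTIVITY);
    // Ascending order, so the most selective conjunct is applied in full.
    Collections.sort(known);
    double result = 1.0;
    for (int i = 0; i < known.size(); i++) {
      // Exponential backoff: the i-th smallest selectivity is damped by 1/(i+1).
      result *= Math.pow(known.get(i), 1.0 / (i + 1));
    }
    // Keep the estimate inside [0, 1].
    return Math.max(0.0, Math.min(1.0, result));
  }
}
```

Under these assumptions, two conjuncts with selectivity 0.25 each combine to 0.25 * 0.25^(1/2) = 0.125 rather than the plain product 0.0625, which softens the independence assumption for correlated predicates.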
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 16 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 16:49:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 16: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 16 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 16:45:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/22079/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 16 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 12:36:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/13300/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 16 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 12:15:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#16). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 15: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 15 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 07:18:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/22078/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 15 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 03:23:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/13299/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 15 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Fri, 03 Apr 2026 03:00:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#15). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
..

Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@248
PS13, Line 248: boolean hasSelectivity = false;
> This variable is not used.
Done

--
To view, visit http://gerrit.cloudera.org:8080/23930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 13
Gerrit-Owner: Steve Carlin
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Fang-Yu Rao
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Michael Smith
Gerrit-Reviewer: Steve Carlin
Gerrit-Comment-Date: Fri, 03 Apr 2026 02:59:36 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Aman Sinha has posted comments on this change. (
http://gerrit.cloudera.org:8080/23930 )
Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more
similar to original planner
..
Patch Set 14: Code-Review+2
(3 comments)
> Patch Set 13:
>
> (2 comments)
http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:
http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@248
PS13, Line 248: boolean hasSelectivity = false;
This variable is not used.
http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255: // A null is returned when an inner conjunct has an operand where
> For better or for worse, this matches Impala's logic here:
I see this code is updated in the latest patch set. Marking it done.
http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257: if (tmpSelectivity == null) {
> Also for better or for worse, this matches Impala's logic:
Ok, let's stick with it; sometime later I would like to see where the second
factor helps.
--
To view, visit http://gerrit.cloudera.org:8080/23930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 14
Gerrit-Owner: Steve Carlin
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Fang-Yu Rao
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Michael Smith
Gerrit-Reviewer: Steve Carlin
Gerrit-Comment-Date: Fri, 03 Apr 2026 01:47:28 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/22074/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 14 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Thu, 02 Apr 2026 21:24:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#14). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
..

Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255: return Expr.DEFAULT_SELECTIVITY;
> I didn't notice this closely in the first pass but here it is returning the
For better or for worse, this matches Impala's logic here:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L195

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257: selectivity = selectivity + tmpSelectivity - selectivity * tmpSelectivity;
> Taking a second look at this by going through an example:
Also for better or for worse, this matches Impala's logic:
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L204

--
To view, visit http://gerrit.cloudera.org:8080/23930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 13
Gerrit-Owner: Steve Carlin
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Fang-Yu Rao
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Michael Smith
Gerrit-Reviewer: Steve Carlin
Gerrit-Comment-Date: Thu, 02 Apr 2026 13:15:55 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255: return Expr.DEFAULT_SELECTIVITY;
I didn't notice this closely in the first pass but here it is returning the DEFAULT_SELECTIVITY for the whole expression even if a single operand (out of many) gets a null from the estimatorSelectivityInternal(). This would negate all the other valid selectivities for other operands. If I compare with how the conjuncts are handled, there the DEFAULT_SELECTIVITY is correctly applied only for the missing estimates.

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257: selectivity = selectivity + tmpSelectivity - selectivity * tmpSelectivity;
Taking a second look at this by going through an example: WHERE qtr = 'Q1' OR qtr = 'Q2'. Suppose we have 1 year's data, and assuming uniform distribution, each of these selectivities is 0.25. So, ideally, I would expect 0.5 for this disjunctive predicate. However, here, it will produce 0.5 - 0.0625 = 0.4375. The formula does not seem right. Is it brought over from Calcite ?
-- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 13 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Thu, 02 Apr 2026 01:05:27 + Gerrit-HasComments: Yes
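For reference, a minimal Java sketch of the disjunction logic discussed in the two comments above, worked through on the qtr example. This is not the code from FilterSelectivityEstimator.java: the class name, the method name, and the use of null for a missing estimate are hypothetical, and DEFAULT_SELECTIVITY is hard-coded to the 0.1 that the commit message gives for Expr.DEFAULT_SELECTIVITY.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch; not the code under review. */
final class DisjunctSelectivitySketch {
  // Assumption: mirrors Expr.DEFAULT_SELECTIVITY (.1 per the commit message).
  static final double DEFAULT_SELECTIVITY = 0.1;

  /**
   * Combines operand selectivities of an OR.
   * A null entry stands for an operand with no estimate (an assumption of this sketch).
   */
  static double estimateOr(List<Double> operandSelectivities) {
    // A disjunction always has operands, so an empty list is a caller error.
    if (operandSelectivities.isEmpty()) {
      throw new IllegalArgumentException("a disjunction needs at least one operand");
    }
    double selectivity = 0.0;
    for (Double tmpSelectivity : operandSelectivities) {
      // One unknown operand makes the whole disjunction fall back to the default.
      if (tmpSelectivity == null) return DEFAULT_SELECTIVITY;
      // Union of two events under an independence assumption.
      selectivity = selectivity + tmpSelectivity - selectivity * tmpSelectivity;
    }
    // Keep the result inside [0, 1].
    return Math.max(0.0, Math.min(1.0, selectivity));
  }

  public static void main(String[] args) {
    // qtr = 'Q1' OR qtr = 'Q2', each operand estimated at 0.25.
    System.out.println(estimateOr(Arrays.asList(0.25, 0.25)));       // prints 0.4375
    // One operand without an estimate.
    System.out.println(estimateOr(Arrays.asList(0.25, null, 0.5)));  // prints 0.1
  }
}
```

The formula s1 + s2 - s1 * s2 is the probability of a union when the two predicates are treated as independent, so two 0.25 operands give 0.25 + 0.25 - 0.0625 = 0.4375. The predicates qtr = 'Q1' and qtr = 'Q2' are mutually exclusive, which is why the estimate falls short of the 0.5 one would expect.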
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/22057/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 13 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Tue, 31 Mar 2026 21:39:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

Patch Set 12:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@10
PS12, Line 10: little bit more similar to what
: is being done with the original Impala planner.
> The comment here seems incomplete without a reason for doing this. Can you
Done

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@13
PS12, Line 13: conditions
> nit: did you mean unknown 'estimates' ? Not sure what unknown 'conditions'
Fixed

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@14
PS12, Line 14: functions
> nit: expressions rather than functions
Done

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@23
PS12, Line 23: X will be filled in after review
> nit: Assuming this is created, pls add the jira id.
Done

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@206
PS11, Line 206: IMPALA-X
> I think you already filed a jira for this ? Best to add the number here.
Done

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@138
PS12, Line 138: it will be ignored.
> It's not ignored from what I can tell .. on line #73 it assigns the Expr.DE
Changed the comment for clarity

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@254
PS12, Line 254: return Math.max(0.0, Math.min(1.0, selectivity));
> This would return 0 if the loop is not executed. The default should be 1.0
It's a disjunction, so the loop should always be entered. But I did add a preconditions check to ensure that the logic makes sense.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@273
PS12, Line 273: Collections.sort(selectivities);
> nit: pls add a comment about why sorting the selectivities is needed as it
Done

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@280
PS12, Line 280: return selectivity;
> The corresponding Impala estimator applies a final bounds check between [0,
Done

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
File java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@195
PS12, Line 195: //XXX: FIX THIS
> Unclear why we need the hardcoded value. What can be done in the above calc
Got rid of the hardcoded value and provided a description

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@334
PS12, Line 334: 10.0
> Where did this constant 10 come from ? if it's the NDV for this column, can
Static constant was declared, just didn't use it :( Fixed now.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@337
PS12, Line 337: distinctRows
> 'distinctRows' is not used anymore ? Also, I am not sure why we care about
I just have the general "distinctRows" test in all the unit tests just to test the right value is coming out. Should be 2 here since there are 2 values in the "in" clause.

-- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 12 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Steve Carlin Gerrit-Comment-Date: Tue, 31 Mar 2026 21
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#13). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner The first pass of condition estimates were derived from a different database. This commit makes the condition estimates a little bit more similar to what is being done with the original Impala planner. Using calculations similar to the current Impala planner is better for the first pass so we have slightly more of an "apples to apples" comparison of the Calcite planner versus the original planner. These calculations should be re-examined later, especially when other features such as generating histogram statistics are implemented. The default for unknown estimates is now taken from Expr.DEFAULT_SELECTIVITY (.1) This gets applied to many different expressions, including things like >=, <=, etc... The disjunct condition was kept fairly straightforward and match the logic in CompoundPredicate.computeSelectivity() However, upon debugging, to obtain closer estimates to the original planner, the conjunction condition uses code found in PlanNode.computeCombinedSelectivity(). IMPALA-14867 has been created for "between" selectivity which has not yet been implemented. An issue with distinct row counts on filters is also fixed with this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column with which we are trying to find distinct rows. IMPALA-14640 is also fixed by this commit, which now handles the case where there are no statistics provided. Testing: Some TestCalciteStats changed due to this commit as well as some tpcds query plans. 
Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 --- M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

Patch Set 12:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@10
PS12, Line 10: little bit more similar to what
: is being done with the original Impala planner.
The comment here seems incomplete without a reason for doing this. Can you add the rationale here ? I presume it is to avoid regressions. In some selected cases, we should hopefully improve on the original estimates.

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@13
PS12, Line 13: conditions
nit: did you mean unknown 'estimates' ? Not sure what unknown 'conditions' means here since the predicates are known patterns.

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@14
PS12, Line 14: functions
nit: expressions rather than functions

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@23
PS12, Line 23: X will be filled in after review
nit: Assuming this is created, pls add the jira id.

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@206
PS11, Line 206: IMPALA-X
I think you already filed a jira for this ? Best to add the number here.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@138
PS12, Line 138: it will be ignored.
It's not ignored from what I can tell .. on line #73 it assigns the Expr.DEFAULT_SELECTIVITY for such cases.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@254
PS12, Line 254: return Math.max(0.0, Math.min(1.0, selectivity));
This would return 0 if the loop is not executed. The default should be 1.0, not 0.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@273
PS12, Line 273: Collections.sort(selectivities);
nit: pls add a comment about why sorting the selectivities is needed as it is not obvious unless one is aware of the fact that we want to apply most selective predicates first.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@280
PS12, Line 280: return selectivity;
The corresponding Impala estimator applies a final bounds check between [0, 1]: Math.max(0.0, Math.min(1.0, result)); Best to do that here too.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
File java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@195
PS12, Line 195: //XXX: FIX THIS
Unclear why we need the hardcoded value. What can be done in the above calculation to provide the expected value?

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@334
PS12, Line 334: 10.0
Where did this constant 10 come from ? if it's the NDV for this column, can you declare it as a static constant in the class so it can be referenced elsewhere too.

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@337
PS12, Line 337: distinctRows
'distinctRows' is not used anymore ? Also, I am not sure why we care about distinct row count. We just want the row count for the IN predicate. It's the same row count we would expect for an OR predicate.

-- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 12 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewe
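For readers following the comments on lines 273 and 280, here is a rough Java sketch of a conjunction estimate that sorts the operand selectivities, applies the default only for missing estimates, and bounds the result to [0, 1]. It is an illustration only, not the patch's code. The exponential backoff step (raising the i-th sorted selectivity to the power 1/(i+1)) is an assumption about how PlanNode.computeCombinedSelectivity() combines values and is not stated in this thread. The class name, the method name, the null convention for a missing estimate, and the single shared default entry are likewise hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-alone sketch; not the code under review. */
final class ConjunctSelectivitySketch {
  // Assumption: mirrors Expr.DEFAULT_SELECTIVITY (.1 per the commit message).
  static final double DEFAULT_SELECTIVITY = 0.1;

  /** A null entry stands for an operand with no estimate (an assumption of this sketch). */
  static double estimateAnd(List<Double> operandSelectivities) {
    List<Double> selectivities = new ArrayList<>();
    boolean sawUnknown = false;
    for (Double s : operandSelectivities) {
      if (s == null) {
        sawUnknown = true;
      } else {
        selectivities.add(s);
      }
    }
    // Assumption: one default entry represents all operands without an estimate.
    if (sawUnknown) selectivities.add(DEFAULT_SELECTIVITY);

    // Sort ascending so the most selective predicate is applied first and in full.
    // This also makes the estimate independent of the original operand order.
    Collections.sort(selectivities);

    double result = 1.0;
    for (int i = 0; i < selectivities.size(); ++i) {
      // Assumed exponential backoff: later (less selective) operands count less.
      result *= Math.pow(selectivities.get(i), 1.0 / (i + 1));
    }
    // Final bounds check, as requested in the review.
    return Math.max(0.0, Math.min(1.0, result));
  }

  public static void main(String[] args) {
    // Same operands in two different orders give the same estimate.
    System.out.println(estimateAnd(Arrays.asList(0.25, 0.01)));
    System.out.println(estimateAnd(Arrays.asList(0.01, 0.25)));
    // Known operands keep contributing even when another operand has no estimate.
    System.out.println(estimateAnd(Arrays.asList(0.5, null)));
  }
}
```

With operands 0.01 and 0.25 the sketch computes 0.01 * 0.25^(1/2), roughly 0.005, whichever order they arrive in. Unlike the disjunction case, a missing estimate here only adds one default entry instead of replacing the whole result.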
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/22050/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 12 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Mon, 30 Mar 2026 18:47:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#12). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner The first pass of condition estimates were derived from a different database. This commit makes the condition estimates a little bit more similar to what is being done with the original Impala planner. The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY (.1) This gets applied to many different functions, including things like >=, <=, etc... The disjunct condition was kept fairly straightforward and match the logic in CompoundPredicate.computeSelectivity() However, upon debugging, to obtain closer estimates to the original planner, the conjunction condition uses code found in PlanNode.computeCombinedSelectivity(). An Impala Jira (X will be filled in after review) will be filed to make a better match for "between" selectivity. An issue with distinct row counts on filters is also fixed with this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column with which we are trying to find distinct rows. IMPALA-14640 is also fixed by this commit, which now handles the case where there are no statistics provided. Testing: Some TestCalciteStats changed due to this commit as well as some tpcds query plans. 
Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 --- M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test M testdata/workloads/functional-planner/
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

Patch Set 11: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test
File testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test:

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test@43
PS11, Line 43: | fk/pk conjuncts: none
> What happened to these?
Discussed offline that this is fine - they didn't need to be fk/pk conjuncts here, just hash predicates - and the mem-estimate is now more in line with https://github.com/apache/impala/blob/master/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test.

-- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 11 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Mon, 09 Mar 2026 23:14:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test
File testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test:

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test@43
PS11, Line 43: | fk/pk conjuncts: none
What happened to these?

-- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 11 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Tue, 03 Mar 2026 21:20:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/21803/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 11 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Thu, 26 Feb 2026 15:29:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#11). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner The first pass of condition estimates were derived from a different database. This commit makes the condition estimates a little bit more similar to what is being done with the original Impala planner. The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY (.1) This gets applied to many different functions, including things like >=, <=, etc... The disjunct condition was kept fairly straightforward and match the logic in CompoundPredicate.computeSelectivity() However, upon debugging, to obtain closer estimates to the original planner, the conjunction condition uses code found in PlanNode.computeCombinedSelectivity(). An Impala Jira (X will be filled in after review) will be filed to make a better match for "between" selectivity. An issue with distinct row counts on filters is also fixed with this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column with which we are trying to find distinct rows. IMPALA-14640 is also fixed by this commit, which now handles the case where there are no statistics provided. Testing: Some TestCalciteStats changed due to this commit as well as some tpcds query plans. 
Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 --- M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test M test
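The selectivity rules named in the commit message above can be illustrated with a small standalone sketch. This is not the code from FilterSelectivityEstimator.java. It assumes that the disjunction rule is the usual independence formula (a + b - a*b) and that the conjunction rule sorts the selectivities and applies an exponential backoff, which is how CompoundPredicate.computeSelectivity() and PlanNode.computeCombinedSelectivity() are understood to behave in the original planner. The class name SelectivitySketch and its method names are hypothetical, and substituting the default for each unknown input is a simplification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; names here are hypothetical, not Impala's API. */
public class SelectivitySketch {
  // Mirrors the value the commit message cites for Expr.DEFAULT_SELECTIVITY.
  public static final double DEFAULT_SELECTIVITY = 0.1;

  /** Returns the given selectivity, or the default when it is unknown (null). */
  public static double orDefault(Double selectivity) {
    return selectivity == null ? DEFAULT_SELECTIVITY : selectivity;
  }

  /** OR of two predicates, assuming independence: a + b - a*b. */
  public static double disjunction(double left, double right) {
    return clamp(left + right - left * right);
  }

  /**
   * AND of several predicates with exponential backoff (assumed behavior).
   * Selectivities are sorted ascending so the most selective conjunct is
   * applied in full, and the i-th one (0-based) is raised to 1/(i+1).
   */
  public static double conjunction(List<Double> selectivities) {
    List<Double> sorted = new ArrayList<>();
    for (Double s : selectivities) {
      sorted.add(orDefault(s));
    }
    Collections.sort(sorted);
    double result = 1.0;
    for (int i = 0; i < sorted.size(); i++) {
      result *= Math.pow(sorted.get(i), 1.0 / (i + 1));
    }
    return clamp(result);
  }

  /** Bounds a value to the range [0, 1]. */
  private static double clamp(double value) {
    return Math.max(0.0, Math.min(1.0, value));
  }

  public static void main(String[] args) {
    // Two conditions with unknown selectivity, e.g. "a >= 1" and "b <= 5".
    double a = orDefault(null);
    double b = orDefault(null);
    System.out.println("OR:  " + disjunction(a, b));
    System.out.println("AND: " + conjunction(List.of(a, b)));
  }
}
```

Under these assumptions, a conjunction of two default-selectivity conditions yields an estimate noticeably larger than the plain product 0.1 * 0.1, because the backoff dampens every conjunct after the most selective one. Sorting first makes the estimate independent of the order in which the conjuncts are written.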
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/21586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 10 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 06 Feb 2026 03:58:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/23930 to look at the new patch set (#10). Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner The first pass of condition estimates was derived from a different database. This commit brings the condition estimates closer to what the original Impala planner does. The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY (0.1). This is applied to many different functions, including >=, <=, etc. The disjunct condition was kept fairly straightforward and matches the logic in CompoundPredicate.computeSelectivity(). However, after debugging, the conjunction condition uses code found in PlanNode.computeCombinedSelectivity() to obtain estimates closer to the original planner. An Impala Jira (X will be filled in after review) will be filed to make a better match for "between" selectivity. An issue with distinct row counts on filters is also fixed with this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column for which we are trying to find distinct rows. IMPALA-14640 is also fixed by this commit, which now handles the case where no statistics are provided. Testing: Some TestCalciteStats tests and some tpcds query plans changed due to this commit. 
Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 --- M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q59.test M tes
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/21550/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/23930 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 Gerrit-Change-Number: 23930 Gerrit-PatchSet: 8 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 02 Feb 2026 23:57:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner
Steve Carlin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/23930 ) Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner .. IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner The first pass of condition estimates was derived from a different database. This commit brings the condition estimates closer to what the original Impala planner does. The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY (0.1). This is applied to many different functions, including >=, <=, etc. The disjunct condition was kept fairly straightforward and matches the logic in CompoundPredicate.computeSelectivity(). However, after debugging, the conjunction condition uses code found in PlanNode.computeCombinedSelectivity() to obtain estimates closer to the original planner. An Impala Jira (X will be filled in after review) will be filed to make a better match for "between" selectivity. An issue with distinct row counts on filters is also fixed with this commit. The distinct row count on a filter only changes if the filter condition contains an input reference that matches the column for which we are trying to find distinct rows. IMPALA-14640 is also fixed by this commit, which now handles the case where no statistics are provided. Testing: Some TestCalciteStats tests and some tpcds query plans changed due to this commit. 
Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718 --- M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java M java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q59.test M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-
