[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Michael Smith (Code Review)
Michael Smith has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit brings the condition estimates closer to those of the original
Impala planner. Using similar calculations is better for the first pass
because it gives more of an "apples to apples" comparison of the Calcite
planner versus the original planner. These calculations should be re-examined
later, especially once other features, such as generating histogram
statistics, are implemented.

The default for unknown estimates is now taken from Expr.DEFAULT_SELECTIVITY
(0.1). This default applies to many different expressions, including
comparisons such as >= and <=.
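
As a rough illustration of the default described above, the sketch below
applies 0.1 to comparison operators that have no better estimate. The class,
method names and operator set are hypothetical (only >= and <= are named in
the commit message; > and < are an assumption), and this is not the code from
FilterSelectivityEstimator.

```java
import java.util.Set;

// Illustrative sketch only; not the code from FilterSelectivityEstimator.
final class DefaultSelectivitySketch {
  // Value the commit message gives for Expr.DEFAULT_SELECTIVITY.
  static final double DEFAULT_SELECTIVITY = 0.1;

  // Operators treated as "unknown" here. ">=" and "<=" come from the commit
  // message; ">" and "<" are an assumption made for this example.
  static final Set<String> UNKNOWN_OPS = Set.of(">=", "<=", ">", "<");

  // Returns the default for an unknown operator, else the supplied estimate.
  static double selectivity(String op, double knownEstimate) {
    return UNKNOWN_OPS.contains(op) ? DEFAULT_SELECTIVITY : knownEstimate;
  }

  // Estimated output rows for a filter with the given selectivity.
  static long estimatedRows(long inputRows, double selectivity) {
    return Math.round(inputRows * selectivity);
  }

  public static void main(String[] args) {
    // A ">=" filter over 1000 input rows falls back to the default.
    double sel = selectivity(">=", 0.5);
    System.out.println(estimatedRows(1000, sel));
  }
}
```

With this default, a range filter keeps one row in ten regardless of the
literal it compares against, which is why the commit message suggests
revisiting the numbers once histogram statistics exist.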

The disjunct condition was kept fairly straightforward and matches the logic
in CompoundPredicate.computeSelectivity().
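
For reference, a minimal sketch of the usual way to combine two disjunct
selectivities under an independence assumption, which is what
CompoundPredicate.computeSelectivity() is understood to do for OR. The class
and method names are hypothetical, and handling of children with unknown
selectivity is deliberately left out.

```java
// Illustrative sketch only; names are hypothetical and both inputs are
// assumed to be known selectivities in [0, 1].
final class DisjunctSelectivitySketch {
  // P(A or B) = P(A) + P(B) - P(A) * P(B), assuming A and B are independent.
  static double orSelectivity(double left, double right) {
    double combined = left + right - left * right;
    // Clamp to [0, 1] to guard against floating-point drift.
    return Math.max(0.0, Math.min(1.0, combined));
  }

  public static void main(String[] args) {
    System.out.println(orSelectivity(0.1, 0.1));
  }
}
```

The subtracted product is the overlap between the two predicates; without it,
rows matching both sides would be counted twice.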

For the conjunction condition, debugging showed that estimates closer to the
original planner are obtained by using the code found in
PlanNode.computeCombinedSelectivity().
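
As a sketch of the idea, and assuming PlanNode.computeCombinedSelectivity()
uses exponential backoff over the sorted conjunct selectivities (this is a
recollection of the Impala frontend, not code copied from this patch), the
combination looks roughly like this. The class and method names are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; assumes every conjunct already has a selectivity.
final class ConjunctSelectivitySketch {
  // Computes s1^(1/1) * s2^(1/2) * ... * sn^(1/n) with s1 <= s2 <= ... <= sn,
  // so the most selective conjunct is applied in full and each further
  // conjunct contributes progressively less.
  static double combinedSelectivity(List<Double> selectivities) {
    List<Double> sorted = new ArrayList<>(selectivities);
    // Sorting makes the estimate independent of the conjunct order.
    Collections.sort(sorted);
    double result = 1.0;
    for (int i = 0; i < sorted.size(); i++) {
      result *= Math.pow(sorted.get(i), 1.0 / (i + 1));
    }
    return Math.max(0.0, Math.min(1.0, result));
  }

  public static void main(String[] args) {
    System.out.println(combinedSelectivity(List.of(0.25, 0.01)));
  }
}
```

Compared with a plain product, the backoff avoids collapsing the estimate
toward zero when many correlated conjuncts are stacked on the same scan.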

IMPALA-14867 has been created for "between" selectivity, which has not yet
been implemented.

This commit also fixes an issue with distinct row counts on filters. The
distinct row count on a filter now changes only if the filter condition
contains an input reference that matches the column whose distinct rows are
being estimated. This commit also fixes IMPALA-14640 by handling the case
where no statistics are provided.

Testing: Some TestCalciteStats tests changed due to this commit, as did some
TPC-DS query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Reviewed-on: http://gerrit.cloudera.org:8080/23930
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Smith 
---
M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 16: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/23930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 16
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Fri, 03 Apr 2026 16:49:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 16: Verified+1


Gerrit-Comment-Date: Fri, 03 Apr 2026 16:45:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/22079/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


Gerrit-Comment-Date: Fri, 03 Apr 2026 12:36:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 16:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/13300/ 
DRY_RUN=true


Gerrit-Comment-Date: Fri, 03 Apr 2026 12:15:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Steve Carlin (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#16).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 15: Verified+1


Gerrit-Comment-Date: Fri, 03 Apr 2026 07:18:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/22078/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


Gerrit-Comment-Date: Fri, 03 Apr 2026 03:23:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/13299/ 
DRY_RUN=true


Gerrit-Comment-Date: Fri, 03 Apr 2026 03:00:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Steve Carlin (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#15).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@248
PS13, Line 248: boolean hasSelectivity = false;
> This variable is not used.
Done



Gerrit-Comment-Date: Fri, 03 Apr 2026 02:59:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 14: Code-Review+2

(3 comments)

> Patch Set 13:
>
> (2 comments)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@248
PS13, Line 248: boolean hasSelectivity = false;
This variable is not used.


http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255:   // A null is returned when an inner conjunct has an operand where
> For better or for worse, this matches Impala's logic here:
I see this code is updated in the latest patch set.  Marking it done.


http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257:   if (tmpSelectivity == null) {
> Also for better or for worse, this matches Impala's logic:
Ok, let's stick with it; sometime later I would like to see where the second 
factor helps.



Gerrit-Comment-Date: Fri, 03 Apr 2026 01:47:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/22074/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


Gerrit-Comment-Date: Thu, 02 Apr 2026 21:24:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Steve Carlin (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#14).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-02 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255: return Expr.DEFAULT_SELECTIVITY;
> I didn't notice this closely in the first pass but here it is returning the
For better or for worse, this matches Impala's logic here:

https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L195


http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257:   selectivity = selectivity + tmpSelectivity - selectivity * tmpSelectivity;
> Taking a second look at this by going through an example:
Also for better or for worse, this matches Impala's logic:

https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java#L204



--
To view, visit http://gerrit.cloudera.org:8080/23930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 13
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Thu, 02 Apr 2026 13:15:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-04-01 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 13:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@255
PS13, Line 255: return Expr.DEFAULT_SELECTIVITY;
I didn't notice this closely in the first pass but here it is returning the 
DEFAULT_SELECTIVITY for the whole expression even if a single operand (out of 
many) gets a null from the estimatorSelectivityInternal().  This would negate 
all the other valid selectivities for other operands.  If I compare with how 
the conjuncts are handled, there the DEFAULT_SELECTIVITY is correctly applied 
only for the missing estimates.
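
The difference between the two behaviours described in this comment can be sketched as follows. This is a minimal illustration, not the code under review: the class and method names are hypothetical, a null entry stands in for an operand with no estimate, and DEFAULT_SELECTIVITY = 0.1 mirrors the value the commit message quotes for Expr.DEFAULT_SELECTIVITY.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the FilterSelectivityEstimator code under review. */
final class DisjunctDefaultSketch {
  // Value the commit message quotes for Expr.DEFAULT_SELECTIVITY.
  static final double DEFAULT_SELECTIVITY = 0.1;

  /** Behaviour described in the comment: one missing estimate discards the rest. */
  static double wholeExpressionDefault(List<Double> operands) {
    double selectivity = 0.0;
    for (Double s : operands) {
      if (s == null) return DEFAULT_SELECTIVITY;
      selectivity = selectivity + s - selectivity * s;
    }
    return Math.max(0.0, Math.min(1.0, selectivity));
  }

  /** Alternative: substitute the default only for the operand lacking an estimate. */
  static double perOperandDefault(List<Double> operands) {
    double selectivity = 0.0;
    for (Double s : operands) {
      double tmp = (s == null) ? DEFAULT_SELECTIVITY : s;
      selectivity = selectivity + tmp - selectivity * tmp;
    }
    return Math.max(0.0, Math.min(1.0, selectivity));
  }

  public static void main(String[] args) {
    List<Double> operands = Arrays.asList(0.5, null, 0.25);
    System.out.println(wholeExpressionDefault(operands));
    System.out.println(perOperandDefault(operands));
  }
}
```

With operands 0.5, unknown, and 0.25, the first variant collapses to 0.1 while the second stays at roughly 0.66, which is the concern raised here.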


http://gerrit.cloudera.org:8080/#/c/23930/13/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@257
PS13, Line 257:   selectivity = selectivity + tmpSelectivity - selectivity * tmpSelectivity;
Taking a second look at this by going through an example:
WHERE qtr = 'Q1' OR qtr = 'Q2'.   Suppose we have 1 year's data, and assuming 
uniform distribution, each of these selectivities is 0.25.  So, ideally, I would 
expect 0.5 for this disjunctive predicate.
However, here, it will produce 0.5 - 0.0625 = 0.4375.  The formula does not seem right.  
Is it brought over from Calcite ?
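
To make the arithmetic in this example concrete, here is a small sketch. The expression on line 257 is the inclusion-exclusion estimate for two predicates that are assumed to be independent; for mutually exclusive predicates, such as two equality tests on the same column, the overlap term is zero and the plain sum applies. The class and method names below are hypothetical.

```java
/** Hypothetical sketch comparing two ways to estimate P(A OR B). */
final class DisjunctFormulaSketch {
  /** Estimate when A and B are assumed independent: a + b - a*b. */
  static double independentOr(double a, double b) {
    return a + b - a * b;
  }

  /** Estimate when A and B cannot both hold: a + b, capped at 1. */
  static double exclusiveOr(double a, double b) {
    return Math.min(1.0, a + b);
  }

  public static void main(String[] args) {
    System.out.println(independentOr(0.25, 0.25)); // prints 0.4375
    System.out.println(exclusiveOr(0.25, 0.25));   // prints 0.5
  }
}
```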



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 13
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Thu, 02 Apr 2026 01:05:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/22057/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 13
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 31 Mar 2026 21:39:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-31 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 12:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@10
PS12, Line 10: little bit more similar to what
 : is being done with the original Impala planner.
> The comment here seems incomplete without a reason for doing this. Can you
Done


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@13
PS12, Line 13: conditions
> nit: did you mean unknown 'estimates' ?  Not sure what unknown 'conditions'
Fixed


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@14
PS12, Line 14: functions
> nit: expressions rather than functions
Done


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@23
PS12, Line 23: X will be filled in after review
> nit: Assuming this is created, pls add the jira id.
Done


http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@206
PS11, Line 206: IMPALA-X
> I think you already filed a jira for this ? Best to add the number here.
Done


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@138
PS12, Line 138: it will be ignored.
> It's not ignored from what I can tell .. on line #73 it assigns the Expr.DE
Changed the comment for clarity


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@254
PS12, Line 254: return Math.max(0.0, Math.min(1.0, selectivity));
> This would return 0 if the loop is not executed.  The default should be 1.0
It's a disjunction, so the loop should always be entered.  But I did add a 
preconditions check to ensure that the logic makes sense.
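
A guard of the kind described in this reply might look like the sketch below. It is only an illustration: the names are hypothetical, and a plain IllegalStateException stands in for whatever preconditions utility the patch actually uses.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a disjunction estimate that rejects an empty operand list. */
final class DisjunctGuardSketch {
  static double disjunction(List<Double> operands) {
    // An OR with no operands is not expected; fail loudly instead of returning 0.0.
    if (operands.isEmpty()) {
      throw new IllegalStateException("disjunction must have at least one operand");
    }
    double selectivity = 0.0;
    for (double s : operands) {
      selectivity = selectivity + s - selectivity * s;
    }
    return Math.max(0.0, Math.min(1.0, selectivity));
  }

  public static void main(String[] args) {
    System.out.println(disjunction(Arrays.asList(0.25))); // prints 0.25
  }
}
```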


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@273
PS12, Line 273: Collections.sort(selectivities);
> nit: pls add a comment about why sorting the selectivities is needed as it
Done


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@280
PS12, Line 280: return selectivity;
> The corresponding Impala estimator applies a final bounds check between [0,
Done


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
File 
java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@195
PS12, Line 195: //XXX:  FIX THIS
> Unclear why we need the hardcoded value. What can be done in the above calc
Got rid of the hardcode and provided a description


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@334
PS12, Line 334: 10.0
> Where did this constant 10 come from ? if it's the NDV for this column, can
Static constant was declared, just didn't use it :(

Fixed now.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@337
PS12, Line 337: distinctRows
> 'distinctRows' is not used anymore ?  Also, I am not sure why we care about
I have the general "distinctRows" check in all the unit tests just to verify 
that the right value is coming out.  Should be 2 here since there are 2 values in 
the "in" clause.
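
As a rough illustration of the numbers in this exchange: under a uniform-distribution assumption, an IN list with k distinct literals over a column with a known NDV selects about k/NDV of the rows and leaves at most k distinct values of that column. The sketch below assumes an NDV of 10 purely for illustration; the class and method names are hypothetical, and the formulas are the textbook estimate, not necessarily what the patch computes.

```java
/** Hypothetical sketch of a uniform-distribution estimate for an IN predicate. */
final class InPredicateSketch {
  /** Fraction of rows kept by an IN list of k distinct literals. */
  static double selectivity(int k, double ndv) {
    return Math.min(1.0, k / ndv);
  }

  /** Distinct values of the filtered column that can survive the IN list. */
  static double distinctValues(int k, double ndv) {
    return Math.min(k, ndv);
  }

  public static void main(String[] args) {
    System.out.println(selectivity(2, 10.0));    // prints 0.2
    System.out.println(distinctValues(2, 10.0)); // prints 2.0
  }
}
```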



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 12
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 31 Mar 2026 21

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-31 Thread Steve Carlin (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#13).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit makes the condition estimates a little bit more similar to what
is being done with the original Impala planner. Using calculations similar to
the current Impala planner is better for the first pass so we have slightly
more of an "apples to apples" comparison of the Calcite planner versus the
original planner.  These calculations should be re-examined later, especially
when other features such as generating histogram statistics are implemented.

The default for unknown estimates is now taken from Expr.DEFAULT_SELECTIVITY 
(.1)
This gets applied to many different expressions, including things like >=, <=,
etc...

The disjunct condition was kept fairly straightforward and matches the logic in
CompoundPredicate.computeSelectivity()

However, upon debugging, to obtain closer estimates to the original planner,
the conjunction condition uses code found in 
PlanNode.computeCombinedSelectivity().

IMPALA-14867 has been created for "between" selectivity which has not yet
been implemented.

An issue with distinct row counts on filters is also fixed with this commit.
The distinct row count on a filter only changes if the filter condition
contains an input reference that matches the column with which we are trying
to find distinct rows. IMPALA-14640 is also fixed by this commit, which now
handles the case where there are no statistics provided.

Testing: Some TestCalciteStats changed due to this commit as well as some
tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
---
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M 
java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-30 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 12:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@10
PS12, Line 10: little bit more similar to what
 : is being done with the original Impala planner.
The comment here seems incomplete without a reason for doing this. Can you add 
the rationale here ?  I presume it is to avoid regressions.  In some selected 
cases, we should hopefully improve on the original estimates.


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@13
PS12, Line 13: conditions
nit: did you mean unknown 'estimates' ?  Not sure what unknown 'conditions' 
means here since the predicates are known patterns.


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@14
PS12, Line 14: functions
nit: expressions rather than functions


http://gerrit.cloudera.org:8080/#/c/23930/12//COMMIT_MSG@23
PS12, Line 23: X will be filled in after review
nit: Assuming this is created, pls add the jira id.


http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/11/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@206
PS11, Line 206: IMPALA-X
I think you already filed a jira for this ? Best to add the number here.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
File 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@138
PS12, Line 138: it will be ignored.
It's not ignored from what I can tell .. on line #73 it assigns the 
Expr.DEFAULT_SELECTIVITY for such cases.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@254
PS12, Line 254: return Math.max(0.0, Math.min(1.0, selectivity));
This would return 0 if the loop is not executed.  The default should be 1.0, 
not 0.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@273
PS12, Line 273: Collections.sort(selectivities);
nit: pls add a comment about why sorting the selectivities is needed as it is 
not obvious unless one is aware of the fact that we want to apply most 
selective predicates first.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@280
PS12, Line 280: return selectivity;
The corresponding Impala estimator applies a final bounds check between [0, 1]:
 Math.max(0.0, Math.min(1.0, result));
Best to do that here too.
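
The sort and the final clamp mentioned in the last two comments fit together roughly as in the sketch below. It assumes the exponential-backoff scheme that, as far as I understand, PlanNode.computeCombinedSelectivity() uses (the i-th smallest selectivity is raised to the power 1/(i+1)); that exponent is not stated in this thread and should be checked against the Impala source. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of combining conjunct selectivities with backoff. */
final class ConjunctBackoffSketch {
  static double combine(List<Double> input) {
    List<Double> selectivities = new ArrayList<>(input);
    // Ascending order: the most selective predicate is applied in full,
    // and the result no longer depends on the order of the conjuncts.
    Collections.sort(selectivities);
    double result = 1.0;
    for (int i = 0; i < selectivities.size(); ++i) {
      // Assumed backoff: later (less selective) terms are dampened.
      result *= Math.pow(selectivities.get(i), 1.0 / (i + 1));
    }
    // Final bounds check to keep the estimate within [0, 1].
    return Math.max(0.0, Math.min(1.0, result));
  }

  public static void main(String[] args) {
    System.out.println(combine(Arrays.asList(0.25, 0.25)));
    System.out.println(combine(Arrays.asList(0.5, 0.1)));
  }
}
```

For two conjuncts of 0.25 each this gives about 0.25 * 0.5 = 0.125, rather than the 0.0625 that a plain product would give.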


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
File 
java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java:

http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@195
PS12, Line 195: //XXX:  FIX THIS
Unclear why we need the hardcoded value. What can be done in the above 
calculation to provide the expected value?


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@334
PS12, Line 334: 10.0
Where did this constant 10 come from ? if it's the NDV for this column, can you 
declare it as a static constant in the class so it can be referenced elsewhere 
too.


http://gerrit.cloudera.org:8080/#/c/23930/12/java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java@337
PS12, Line 337: distinctRows
'distinctRows' is not used anymore ?  Also, I am not sure why we care about 
distinct row count.  We just want the row count for the IN predicate.  It's the 
same row count we would expect for an OR predicate.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 12
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewe

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/22050/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 12
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Mon, 30 Mar 2026 18:47:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-30 Thread Steve Carlin (Code Review)
Hello Aman Sinha, Fang-Yu Rao, Joe McDonnell, Michael Smith, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#12).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit makes the condition estimates a little bit more similar to what
is being done with the original Impala planner.

The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY 
(.1)
This gets applied to many different functions, including things like >=, <=,
etc...

The disjunct condition was kept fairly straightforward and matches the logic in
CompoundPredicate.computeSelectivity()

However, upon debugging, to obtain closer estimates to the original planner,
the conjunction condition uses code found in 
PlanNode.computeCombinedSelectivity().

An Impala Jira (X will be filled in after review) will be filed to make
a better match for "between" selectivity.

An issue with distinct row counts on filters is also fixed with this commit.
The distinct row count on a filter only changes if the filter condition
contains an input reference that matches the column with which we are trying
to find distinct rows. IMPALA-14640 is also fixed by this commit, which now
handles the case where there are no statistics provided.

Testing: Some TestCalciteStats changed due to this commit as well as some
tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
---
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M 
java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M 
java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q41.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test
M 
testdata/workloads/functional-planner/

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-09 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 11: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test:

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test@43
PS11, Line 43: |  fk/pk conjuncts: none
> What happened to these?
We discussed offline that this is fine - they didn't need to be fk/pk conjuncts 
here, just hash predicates - and the mem-estimate is now more in line with 
https://github.com/apache/impala/blob/master/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q82.test.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 11
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Mon, 09 Mar 2026 23:14:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-03-03 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test:

http://gerrit.cloudera.org:8080/#/c/23930/11/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q82.test@43
PS11, Line 43: |  fk/pk conjuncts: none
What happened to these?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 11
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Tue, 03 Mar 2026 21:20:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/21803/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 11
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Thu, 26 Feb 2026 15:29:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-26 Thread Steve Carlin (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#11).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit makes the condition estimates a little bit more similar to what
is being done with the original Impala planner.

The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY
(0.1). This default gets applied to many different functions, including
operators such as >= and <=.

The disjunct condition was kept fairly straightforward and matches the logic in
CompoundPredicate.computeSelectivity().

However, debugging showed that, to obtain estimates closer to the original
planner, the conjunction condition should use the code found in
PlanNode.computeCombinedSelectivity().
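
For readers unfamiliar with these estimates, the following is a minimal
sketch of how such rules are commonly combined. It is not the code from this
change. It assumes the disjunction uses the inclusion-exclusion formula and
the conjunction applies an exponential backoff over the sorted selectivities;
both formulas are assumptions and should be checked against
CompoundPredicate.computeSelectivity() and
PlanNode.computeCombinedSelectivity(). The class name SelectivitySketch and
its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SelectivitySketch {
  // Mirrors the 0.1 value the commit message gives for Expr.DEFAULT_SELECTIVITY.
  static final double DEFAULT_SELECTIVITY = 0.1;

  // Keep an estimate inside the valid probability range [0, 1].
  static double clamp(double v) {
    return Math.max(0.0, Math.min(1.0, v));
  }

  // Assumed OR rule: P(A or B) = P(A) + P(B) - P(A) * P(B).
  static double disjunction(double left, double right) {
    return clamp(left + right - left * right);
  }

  // Assumed AND rule: sort ascending so the most selective conjunct is applied
  // in full, then dampen each later conjunct by taking its (i + 1)-th root.
  static double conjunction(List<Double> selectivities) {
    List<Double> sorted = new ArrayList<>(selectivities);
    Collections.sort(sorted);
    double result = 1.0;
    for (int i = 0; i < sorted.size(); i++) {
      result *= Math.pow(sorted.get(i), 1.0 / (i + 1));
    }
    return clamp(result);
  }

  public static void main(String[] args) {
    // Two unknown conditions joined by OR: roughly 0.19.
    System.out.println(disjunction(DEFAULT_SELECTIVITY, DEFAULT_SELECTIVITY));
    // 0.01 applied in full, 0.25 dampened to sqrt(0.25): roughly 0.005.
    System.out.println(conjunction(Arrays.asList(0.25, 0.01)));
  }
}
```

Under these assumptions, the backoff keeps a long chain of conjuncts from
driving the estimate toward zero as quickly as a plain product would.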

An Impala Jira (X will be filled in after review) will be filed to make
a better match for "between" selectivity.

This commit also fixes an issue with distinct row counts on filters. The
distinct row count on a filter now changes only if the filter condition
contains an input reference to the column whose distinct rows are being
counted. This commit also fixes IMPALA-14640 by handling the case where no
statistics are provided.
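
The distinct row count rule can be illustrated with a small sketch. This is
not the code in ImpalaRelMdDistinctRowCount. The class and method names are
hypothetical, and scaling the count by the filter selectivity is an assumption
made only for illustration; the commit message states only that the count
changes when the filter references the column.

```java
import java.util.Set;

public class DistinctRowCountSketch {
  // filterInputRefs: indexes of the input columns the filter condition reads.
  // targetColumn: index of the column whose distinct rows are being counted.
  static double filterDistinctRowCount(double inputNdv,
      Set<Integer> filterInputRefs, int targetColumn, double selectivity) {
    // The filter does not touch the target column: leave the count unchanged.
    if (!filterInputRefs.contains(targetColumn)) {
      return inputNdv;
    }
    // Assumption for illustration: scale the count by the filter selectivity.
    return inputNdv * selectivity;
  }
}
```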

Testing: Some TestCalciteStats tests changed due to this commit, as did some
tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
---
M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test
M 
test

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/21586/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 10
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 06 Feb 2026 03:58:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-05 Thread Steve Carlin (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/23930

to look at the new patch set (#10).

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit makes the condition estimates a little bit more similar to what
is being done with the original Impala planner.

The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY
(0.1). This default gets applied to many different functions, including
operators such as >= and <=.

The disjunct condition was kept fairly straightforward and matches the logic in
CompoundPredicate.computeSelectivity().

However, debugging showed that, to obtain estimates closer to the original
planner, the conjunction condition should use the code found in
PlanNode.computeCombinedSelectivity().

An Impala Jira (X will be filled in after review) will be filed to make
a better match for "between" selectivity.

This commit also fixes an issue with distinct row counts on filters. The
distinct row count on a filter now changes only if the filter condition
contains an input reference to the column whose distinct rows are being
counted. This commit also fixes IMPALA-14640 by handling the case where no
statistics are provided.

Testing: Some TestCalciteStats tests changed due to this commit, as did some
tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
---
M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M java/calcite-planner/src/test/java/org/apache/impala/calcite/planner/TestCalciteStats.java
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q59.test
M 
tes

[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23930 )

Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/21550/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
Gerrit-Change-Number: 23930
Gerrit-PatchSet: 8
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 02 Feb 2026 23:57:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-14716: Calcite Planner: Make condition estimates more similar to original planner

2026-02-02 Thread Steve Carlin (Code Review)
Steve Carlin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/23930


Change subject: IMPALA-14716: Calcite Planner: Make condition estimates more 
similar to original planner
..

IMPALA-14716: Calcite Planner: Make condition estimates more similar to 
original planner

The first pass of condition estimates was derived from a different database.
This commit makes the condition estimates a little bit more similar to what
is being done with the original Impala planner.

The default for unknown conditions is now taken from Expr.DEFAULT_SELECTIVITY
(0.1). This default gets applied to many different functions, including
operators such as >= and <=.

The disjunct condition was kept fairly straightforward and matches the logic in
CompoundPredicate.computeSelectivity().

However, debugging showed that, to obtain estimates closer to the original
planner, the conjunction condition should use the code found in
PlanNode.computeCombinedSelectivity().

An Impala Jira (X will be filled in after review) will be filed to make
a better match for "between" selectivity.

This commit also fixes an issue with distinct row counts on filters. The
distinct row count on a filter now changes only if the filter condition
contains an input reference to the column whose distinct rows are being
counted. This commit also fixes IMPALA-14640 by handling the case where no
statistics are provided.

Testing: Some TestCalciteStats tests changed due to this commit, as did some
tpcds query plans.

Change-Id: I3b9a25259916504296dbd9a9cb9466be8fac8718
---
M java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaRexExecutor.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java
M java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/ImpalaRelMdDistinctRowCount.java
M java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q30.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q45.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q58.test
M testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q59.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-