Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23122 )
Change subject: IMPALA-14094: Calcite planner: Use table and column statistics for optimization ...................................................................... Patch Set 7: (8 comments) http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java: http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java@91 PS7, Line 91: HdfsEstimatedTableStats > In the other related commit, I added a review comment about the class namin Done http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/CalciteTable.java@237 PS7, Line 237: estimatedStats_ > Based on line 101, the estimatedStats_ can be null. This should do a prec Done http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java: http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@56 PS7, Line 56: COMPARISON_SELECTIVITY > Snce this is specific to range comparison, not equality comparison, can we Done http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@215 PS7, Line 215: * Disjunction Selectivity -> (1 D(1-m1/n)(1-m2/n)) where n is the total > Is this similar to what Calcite does for disjunctions ? Would be useful to I actually grabbed this code (and comment) from Hive, so we wouldn't be able to use the common code there. http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/schema/FilterSelectivityEstimator.java@262 PS7, Line 262: Currently it picks the : * col with max no of nulls. > Not clear what 'picks the col with max no of nulls' means here .. since the Made this more specific and added some preconditions http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java File java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java: http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java@270 PS7, Line 270: (1.0 / 3.0) > See prior comment about referencing the default RANGE_COMPARISON_SELECTIVIT Done http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java@273 PS7, Line 273: 10.0 > Shouldn't this reference BIGINT_NDV instead of hardcoding the value here ? Done http://gerrit.cloudera.org:8080/#/c/23122/7/java/calcite-planner/src/test/java/org/apache/impala/planner/TestCalciteStats.java@288 PS7, Line 288: 1.0 / 9.0) > Same as above. Done -- To view, visit http://gerrit.cloudera.org:8080/23122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d5bb50eb562c28e4b7c7a6529d140f98e77295c Gerrit-Change-Number: 23122 Gerrit-PatchSet: 7 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Anonymous Coward (816) Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Mon, 21 Jul 2025 22:09:07 +0000 Gerrit-HasComments: Yes
