Steve Carlin has posted comments on this change. ( http://gerrit.cloudera.org:8080/23679 )
Change subject: IMPALA-14525: Calcite planner: Add support for RexSimplify ...................................................................... Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/23679/6/fe/src/main/java/org/apache/impala/analysis/InPredicate.java File fe/src/main/java/org/apache/impala/analysis/InPredicate.java: http://gerrit.cloudera.org:8080/#/c/23679/6/fe/src/main/java/org/apache/impala/analysis/InPredicate.java@169 PS6, Line 169: boolean useSetLookup = isAllConstant && (children_.size() - 1 >= setLookupThreshold); > This inverts the previous logic Not intentional. Great catch, thanks! http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/coercenodes/CoerceOperandShuttle.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/coercenodes/CoerceOperandShuttle.java: http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/coercenodes/CoerceOperandShuttle.java@362 PS6, Line 362: * Returns true if RexNode is nullable. Literals are only nullable if they contain > nit: double space inconsistent with the other sentences My 80's-ness is showing. http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexLiteralConverter.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexLiteralConverter.java: http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/functions/RexLiteralConverter.java@98 PS6, Line 98: // Always treat all string literals as type STRING > This change isn't mentioned in the commit message. What prompted it? This is just a bug that showed up. String literals should always be type string, and should never be varchar. Will include in commit message. http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaInOperator.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaInOperator.java: http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/operators/ImpalaInOperator.java@30 PS6, Line 30: private ImpalaInOperator() { > Should this be public? You seem to create a canonical version in OP. Done http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaCoreRules.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaCoreRules.java: http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaCoreRules.java@50 PS6, Line 50: * can differ from Impala's constant folding. > nit: "differ from Impala's constant folding." Done http://gerrit.cloudera.org:8080/#/c/23679/6/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/ImpalaCoreRules.java@56 PS6, Line 56: private static final RelBuilder.Config CONFIG_NO_SIMPLIFY = > This seems like it could be annoying to maintain. I don't see any way aroun Sigh, ack, but I think the one saving grace here is that I put all the rules in here. If someone were to add a new rule, they would see that all the rules are defined in ImpalaCoreRules and hopefully notice to use the NO_SIMPLIFY Another helpful thing is that not all rules call simplify either. I decided to use the NO_SIMPLIFY RelBuilderFactory for all rules for safety purposes in case someone changes a rule within Calcite which adds simplify. Lastly, the worst thing that really happens if it's not maintained properly isn't really a huge disaster and is prolly something no one will notice. The Calcite simplify changes the way NaN is used which doesn't map the way Impala uses it. Some unit tests will break without this change. If a bug creeps out into the real world, it would prolly be extremely rare since and perhaps not even noticed since NaN behavior is kinda undefined across databases, though theoretically consistent within ours. But yeah, I admit, it's not pretty. http://gerrit.cloudera.org:8080/#/c/23679/6/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q01.test File testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q01.test: http://gerrit.cloudera.org:8080/#/c/23679/6/testdata/workloads/functional-planner/queries/PlannerTest/calcite_tpcds/tpcds-q01.test@29 PS6, Line 29: Per-Host Resource Estimates: Memory=6.13GB > This is a large change. It might be related to https://gerrit.cloudera.org/ This was actually mentioned in the commit message already, but I only mentioned it as a cardinality change. It prolly changed the memory estimate too. If you look at 01: SCAN HDFS, you will see that the cardinality estimate changed from 16.08M to 147M. Previous to this change, the predicate used the function "is_not_null_pred". The filter selectivity estimator treated this as a normal function and assumed 90% of the rows would be filtered. But since it is really a "IS NOT NULL" predicate, we can be more precise since there are statistics that handle this. And most rows are not null here, so reducing by 90% was a bad estimate before this commit. -- To view, visit http://gerrit.cloudera.org:8080/23679 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44792688f361bf15affa565e5de5709f64dcf18c Gerrit-Change-Number: 23679 Gerrit-PatchSet: 7 Gerrit-Owner: Steve Carlin <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Fang-Yu Rao <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Pranav Lodha <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Steve Carlin <[email protected]> Gerrit-Comment-Date: Tue, 30 Dec 2025 17:18:13 +0000 Gerrit-HasComments: Yes
