Alex Behm has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan nodes ......................................................................
Patch Set 15: (14 comments) Code is looking good, will take another pass over the tests http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 923: * Propagates constant expressions of the form <slot ref> = <constant> to Explain candidate param Line 924: * other uses of slot ref in all expressions in conjuncts; returns a BitSet with in all expressions in conjuncts -> in the given conjnucts Line 942: // Don't rewrite with our own substitution! We may have already processed Explanation seems more complicated than necessary. Clearly it would be wrong to propagate a constant to the originating binary predicate. Your function comment is quite accurate "to other uses of slot ref" :) Line 960: * Returns true if the conjuncts may be true and false if a contradiction has Shorter: Returns false if a contradiction has been implied, true otherwise. Line 968: BitSet candidates = new BitSet(conjuncts.size()); > Alex, thanks for the BitSet suggest - this worked out far cleaner than any Yea, this looks pretty good to me! Line 980: // applied by the rewriter. We must also re-analyze result exprs to I think we can remove everything after "We must also re-analyze ..." to keep the comment short. Also, I think constant propagation should fall under the "enable_expr_rewrites" query option, i.e., if false we should only remove duplicates in here. Otherwise, it's hard to say what that query option means at a high level. Line 985: if (index < 0) continue; > This can be Preconditions.checkState(index >= 0) now. In the prior form of Agree. Makes sense. Line 988: // Reset Expr to force updating cost Re-analyze expr to add implicit casts, resolve function signatures, and compute cost Line 998: // because they can't be evaluated if expr rewriting is turned off. > I can't see a good way to test this with the fe tests. Should we write a c You should be able to do this with a PlannerTest. For example, look at PlannerTest.testParquetFiltering() which sets a query option and then runs a .test file. But the easier solution, imo, is to not propagate constants when expr rewriting is disabled. http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1266: > Optimization should still be done in the form of eliminating duplicates, ev Looks like the constant propagation is applied even when expr rewrites are disabled. Imo, it would be preferable not to propagate constants in that case. Otherwise, it's kind of hard to reason about the meaning of that query option. http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1273: // Allow constant propagation within conjuncts. This actually has two purposes: 1. constant propagation 2. expr optimization on fully-resolved conjuncts http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: Line 101: # XXX - this seems correct, but is this a legal transformation? > I'm fairly convinced at this point that considering slotrefs through implic I agree that implicit casts on both sides seem ok. It does not seem fine if there is an explicit cast on the slotref. In your example, the 'cast(10000 as tinyint)' is constant and will be folded (truncated), so this case seems fine. http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/joins.test File testdata/workloads/functional-planner/queries/PlannerTest/joins.test: Line 381: # hbase-hdfs join with scan filtering (bogus) Can you be more specific than "(bogus)"? What behavior are we testing here? Line 394: 02:HASH JOIN [INNER JOIN] > This is a very sad plan. Agree that this can be improved. -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-HasComments: Yes
