Alex Behm has posted comments on this change.

Change subject: PREVIEW: IMPALA-1788: Fold constant expressions during plan 
generation.
......................................................................


Patch Set 1:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-context.cc
File be/src/exprs/expr-context.cc:

Line 155:   void* value = GetValue(NULL);
> how does this handle/bail out of cases where exprctx is accidentally not a 
It the non-constant expr has a SlotRef it will hit a SIGSEGV  because we are 
passing a NULL TupleRow. We can make this more defensive if you prefer, but it 
might hide bugs. We should really never call this on non-constant exprs.


http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 1541:     // The rewrite rule for extracting conjuncts simplifies these to 
NULL.
> rather than special-casing it here, it would make more sense to special-cas
Not necessary anymore. Types are preserved.


http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/literal.h
File be/src/exprs/literal.h:

Line 26: #include "runtime/string-value.h"
> duplicated
Done


http://gerrit.cloudera.org:8080/#/c/5109/1/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

Line 41:   TIMESTAMP_LITERAL
> please group all the literals together, so we can easily see what's still m
Done. Also got rid of LITERAL_PRED which isn't used.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

Line 242:       fnName = path.get(path.size() - 1);
> why is this necessary?
To make isConstant() work even when a FunctionCallExpr has not yet been 
analyzed.

Added comment.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java:

Line 204:           if (Double.isNaN(val.double_val) || 
Double.isInfinite(val.double_val)) return null;
> nice catch!
Done. These changes went in as a separate JIRA/commit. Rebased this patch on it.


Line 226:           for (byte b: bytes) {
> single line
Done


Line 233:             result = new StringLiteral(strVal, constExpr.getType(), 
false);
> did you add a new test that blows up with the previous default unescaping?
These cases are covered by expr-test.cc.
I also added a few interesting cases to ExprRewriteRulesTest for convenience, 
but they don't give any new coverage in addition to expr-test.cc


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
File fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java:

Line 34:  * produce, so it better to defer to a single source of truth (the BE 
implementation).
> it -> it's
Done


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

Line 164:     if (analyzer.getQueryOptions().enable_expr_rewrites && rewriter 
!= null) {
> pass in null for the rewriter if it's disabled, so you don't have to check 
Not relevant anymore in the new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

Line 523:       // constant into nested-loop joins.
> refer to cross products, to avoid confusion with future index nlj.
Not relevant anymore in new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

Line 373
> do we still need this function for anything?
Added a constant folder to Analyzer.GlobalState and removed 
Expr.foldConstantChildren()


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

Line 434:   public void initNoRewrite(Analyzer analyzer) throws ImpalaException 
{
> why not just init(Analyzer)? that would also make it clear that you're not 
Not relevant anymore in new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 577:       emptySetNode.init(ctx_.getRewriter(), analyzer);
> instead of passing this in, maybe each PlanNode should have access to the c
Not relevant anymore in new approach (but we should probably still do that in 
the future)


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java:

Line 93:   public void rewriteList(List<Expr> exprs, Analyzer analyzer) throws 
AnalysisException {
> why rewrite as opposed to apply here?
For consistency with rewrite(). I think it makes esne to have different APIs 
for the rules vs. the rewriter (which applies the rules).


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java:

Line 204:    * Only contains very basic tests for a few interesting cases. A 
more thorough
> ". More thorough"
Done


http://gerrit.cloudera.org:8080/#/c/5109/1/testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test:

Line 92:    predicates: a.int_col = 0, a.timestamp_col > TIMESTAMP '2000-01-01 
00:00:00'
> cool, that'll make timestamp operations a lot faster.
I can work with Mostafa to add one to the perf primitives.


http://gerrit.cloudera.org:8080/#/c/5109/1/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test:

Line 121: |  output: sum(1 + 1 + id), count:merge(*)
> hm, a bit confusing that it prints the original "column names'. it would be
Agree. It's not super easy to fix because the agg info is computed during 
analysis. This is fixed in the new approach.

I'm not sure we even need these planner tests now. Should I get rid of them?


http://gerrit.cloudera.org:8080/#/c/5109/1/testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
File testdata/workloads/functional-planner/queries/PlannerTest/join-order.test:

Line 1502: |     predicates: timestamp_col = TIMESTAMP '2016-11-15 
21:38:54.000000618'
> that's not good, this means every time someone runs the test it'll break. m
Agree. Rephrased tests to not use now().


http://gerrit.cloudera.org:8080/#/c/5109/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 205:    predicates: CAST(o_orderdate AS TIMESTAMP) <= TIMESTAMP 
'1995-01-01 00:00:00'
> aren't there tests for the specialized constant folding we have for kudu?
Those tests are in this file (L117 and following)


-- 
To view, visit http://gerrit.cloudera.org:8080/5109
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-HasComments: Yes

Reply via email to