Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15371 )

Change subject: IMPALA-9429: Unioned partition columns break partition pruning
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15371/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
File fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java:

http://gerrit.cloudera.org:8080/#/c/15371/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@105
PS3, Line 105:    * Conjuncts used for filtering will be removed from the list 
'conjuncts' and
             :    * returned as the second item in the returned Pair.
Now we should mention that the returned conjuncts are rewritten ones.


http://gerrit.cloudera.org:8080/#/c/15371/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@134
PS3, Line 134:         Expr clonedConjunct = 
exprRewriter_.rewrite(conjunct.clone(), analyzer);
Should we catch the AnalysisException thrown by FoldConstantsRule (just like 
the try-catch clauses in canEvalUsingPartitionMd)? But I'm not sure how BE can 
fail in evaluating constant expressions here but succeed in later code path. 
Looks like it's ok to terminate query planning if constant folding fails here.


http://gerrit.cloudera.org:8080/#/c/15371/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@199
PS3, Line 199:         expr = analyzer.getConstantFolder().rewrite(expr, 
analyzer);
Given that we have done constant folding on 'expr' before calling this method, 
I think we don't need this now.


http://gerrit.cloudera.org:8080/#/c/15371/3/fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java@225
PS3, Line 225:         analyzer.getConstantFolder().rewrite(expr, analyzer);
Same here. I think we don't need this now.


http://gerrit.cloudera.org:8080/#/c/15371/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/15371/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1734
PS1, Line 1734:         op.getAnalyzer().registerConjuncts(opConjuncts);
> Adding as a rule works. Also changed to add clonedConjunct to partitionConj
Sorry, I realize that if we don't do constant folding here, we won't have 
chance anymore. Because the normal code path for expr rewrite is in analyze 
phase but now we are in planner phase. If the nonoptimal predicate is on 
non-partition columns, BE will evaluates it for each row, which can introduce 
perf issues. E.g. the first query runs 1x slower than the second one (2.05s vs 
1.04s):

 select count(*) from (
   select "aaa" s, l_orderkey from tpch.lineitem
   union all
   select "abc" s, l_orderkey from tpch.lineitem
  ) t
  where s like "a%" or l_orderkey = 1;

 select count(*) from (
   select "aaa" s, l_orderkey from tpch.lineitem
   where "aaa" like "a%" or l_orderkey = 1
   union all
   select "abc" s, l_orderkey from tpch.lineitem
   where "abc" like "a%" or l_orderkey = 1) t;

(Predicates in the second query can be rewritten in the analyze phase so they 
are optimized to "true")

I suggest we do constant folding in both HdfsPartitionPruner and here. Also do 
constant folding after we substitute predicates in inlineView: 
https://github.com/apache/impala/blob/030c12ab2c995a8618dcf601296e310203f70f8c/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java#L1308-L1309


http://gerrit.cloudera.org:8080/#/c/15371/3/testdata/workloads/functional-planner/queries/PlannerTest/union.test
File testdata/workloads/functional-planner/queries/PlannerTest/union.test:

http://gerrit.cloudera.org:8080/#/c/15371/3/testdata/workloads/functional-planner/queries/PlannerTest/union.test@4207
PS3, Line 4207: # IMPALA-9429: Test partition elimination merging inputs with 
different
These tests are more related to partition pruning than UNION statements, 
especially the last two don't have UNION op. It'd be better to move them to 
partition-pruning.test. At there we don't need to explicitly set 
explain_level=2:
https://github.com/apache/impala/blob/030c12ab2c995a8618dcf601296e310203f70f8c/fe/src/test/java/org/apache/impala/planner/PlannerTest.java#L794



--
To view, visit http://gerrit.cloudera.org:8080/15371
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c1384c2cd1ad5f7024449196f9a348ecdccb60b
Gerrit-Change-Number: 15371
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Fredy Wijaya <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Comment-Date: Fri, 20 Mar 2020 12:44:06 +0000
Gerrit-HasComments: Yes

Reply via email to