[ 
https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459075#comment-16459075
 ] 

Jesus Camacho Rodriguez commented on HIVE-19358:
------------------------------------------------

[~vgarg], would you mind taking a look at this issue?

The latest patch updates some q files whose plans improve, including 
TPC-DS queries. The reason is that the {{ReduceExpressions}} rule matches on 
{{HiveProject}}; it therefore did not match the {{LogicalProject}} operators 
generated by decorrelation, and we were missing optimization opportunities, 
including predicate propagation.
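
To illustrate why the operator class matters, here is a minimal sketch of class-based rule matching. All names in it ({{RuleMatchSketch}}, {{Project}}, {{matches}}) are hypothetical stand-ins, not the real Calcite or Hive types; it only assumes that a rule fires on a node when the node is an instance of the rule's operand class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the planner's operator classes; not the real Calcite/Hive types.
final class RuleMatchSketch {
    static class Project {
        final String name;
        Project(String name) { this.name = name; }
    }

    static final class LogicalProject extends Project {
        LogicalProject(String name) { super(name); }
    }

    static final class HiveProject extends Project {
        HiveProject(String name) { super(name); }
    }

    // Returns the names of the plan nodes that a rule registered on operandClass would fire on.
    static List<String> matches(Class<? extends Project> operandClass, List<Project> plan) {
        List<String> fired = new ArrayList<>();
        for (Project node : plan) {
            if (operandClass.isInstance(node)) {
                fired.add(node.name);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        List<Project> plan = Arrays.asList(
                new LogicalProject("fromDecorrelation"),
                new HiveProject("fromHive"));
        // A rule registered on HiveProject skips the LogicalProject node.
        System.out.println(matches(HiveProject.class, plan));
        // A rule registered on the common base class would see both nodes.
        System.out.println(matches(Project.class, plan));
    }
}
```

In this sketch, the node created as a {{LogicalProject}} is invisible to the rule registered on {{HiveProject}}, which mirrors the missed optimizations described above.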

In turn, some plans change quite a bit (e.g., new semijoin operators 
appear). Results are still the same, but we should probably double-check 
that everything is correct.

Finally, some tests are now failing. For instance, 
{{subquery_unqualcolumnrefs.q}} fails on the query below; the relevant log 
output and stack trace follow.

{code:sql}
explain
select *
from src b
where b.key in
        (select distinct key
         from src
         where b.value = value and key > '9'
        );
{code}

{code}
2018-04-30T13:27:41,704 DEBUG [1c1ab7bd-d3ae-4957-b15c-6e960c853230 main] calcite.sql2rel: Plan after trimming unused fields
HiveFilter(condition=[IN($0, {
HiveProject(key=[$0])
  HiveAggregate(group=[{0}])
    HiveProject($f0=[$0])
      HiveFilter(condition=[AND(=($cor0.value, $1), >($0, _UTF-16LE'9'))])
        HiveTableScan(table=[[default.src]], table:alias=[src])
})])
  HiveProject(key=[$0], value=[$1])
    HiveTableScan(table=[[default.src]], table:alias=[b])

2018-04-30T13:27:41,704 DEBUG [1c1ab7bd-d3ae-4957-b15c-6e960c853230 main] parse.CalcitePlanner: Plan before removing subquery:
HiveProject(key=[$0], value=[$1])
  HiveFilter(condition=[IN($0, {
HiveProject(key=[$0])
  HiveAggregate(group=[{0}])
    HiveProject($f0=[$0])
      HiveFilter(condition=[AND(=($cor0.value, $1), >($0, _UTF-16LE'9'))])
        HiveTableScan(table=[[default.src]], table:alias=[src])
})])
    HiveTableScan(table=[[default.src]], table:alias=[b])

2018-04-30T13:27:41,716 DEBUG [1c1ab7bd-d3ae-4957-b15c-6e960c853230 main] parse.CalcitePlanner: Plan just after removing subquery:
HiveProject(key=[$0], value=[$1])
  HiveFilter(condition=[=($0, $5)])
    LogicalCorrelate(correlation=[$cor0], joinType=[semi], requiredColumns=[{1}])
      HiveTableScan(table=[[default.src]], table:alias=[b])
      HiveProject(key=[$0])
        HiveAggregate(group=[{0}])
          HiveProject($f0=[$0])
            HiveFilter(condition=[AND(=($cor0.value, $1), >($0, _UTF-16LE'9'))])
              HiveTableScan(table=[[default.src]], table:alias=[src])

2018-04-30T13:27:41,718 ERROR [1c1ab7bd-d3ae-4957-b15c-6e960c853230 main] parse.CalcitePlanner: CBO failed, skipping CBO.
java.lang.IllegalStateException: Unable to convert SEMI to JoinRelType
        at org.apache.calcite.sql.SemiJoinType.toJoinType(SemiJoinType.java:83) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$RemoveCorrelationForScalarAggregateRule.onMatch(HiveRelDecorrelator.java:2470) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:252) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.removeCorrelationViaRule(HiveRelDecorrelator.java:327) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateQuery(HiveRelDecorrelator.java:220) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1613) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111) ~[calcite-core-1.16.0.jar:1.16.0]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1418) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1434) ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:454) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12077) [hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
...
{code}
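
The failure in the trace above can be modeled with a small sketch. The enums and method names below ({{SemiKind}}, {{JoinKind}}, {{toJoinKind}}, {{tryConvert}}) are hypothetical stand-ins written for illustration, not the Calcite API; the only detail taken from the log is the exception type and message. It shows how a conversion that has no target for SEMI ends up throwing once a correlate with a semi join type reaches a rule that converts unconditionally.

```java
// Hypothetical model of the conversion seen in the stack trace; not the real Calcite classes.
final class SemiJoinConversionSketch {
    // Stand-in for the plain join types a rule can build a join from (subset, for illustration).
    enum JoinKind { INNER, LEFT }

    // Stand-in for the correlate join types; SEMI has no plain-join counterpart here.
    enum SemiKind {
        INNER, LEFT, SEMI;

        JoinKind toJoinKind() {
            switch (this) {
                case INNER:
                    return JoinKind.INNER;
                case LEFT:
                    return JoinKind.LEFT;
                default:
                    // Same exception type and message shape as in the log above.
                    throw new IllegalStateException(
                            "Unable to convert " + this + " to JoinRelType");
            }
        }
    }

    // Converts and reports the outcome as text instead of letting the exception escape.
    static String tryConvert(SemiKind kind) {
        try {
            return kind.toJoinKind().name();
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        for (SemiKind kind : SemiKind.values()) {
            System.out.println(kind + " -> " + tryConvert(kind));
        }
    }
}
```

Whether the right fix is for the rule to skip semi correlates or to handle them separately is a design question this sketch does not answer.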

Cc [~ashutoshc]

> CBO decorrelation logic should generate Hive operators
> ------------------------------------------------------
>
>                 Key: HIVE-19358
>                 URL: https://issues.apache.org/jira/browse/HIVE-19358
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-19358.01.patch, HIVE-19358.patch
>
>
> Decorrelation logic may generate logical instances of the operators in the 
> plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while 
> costing the tree in the Volcano planner (used in MV rewriting), since logical 
> operators do not have a cost associated with them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
