Artem Velykorodnyi created HIVE-22031:
-----------------------------------------

             Summary: HiveRelDecorrelator fails with IndexOutOfBoundsException 
if the query contains several "constant" columns
                 Key: HIVE-22031
                 URL: https://issues.apache.org/jira/browse/HIVE-22031
             Project: Hive
          Issue Type: Bug
          Components: CBO
    Affects Versions: 2.3.5
            Reporter: Artem Velykorodnyi
            Assignee: Artem Velykorodnyi


Steps for reproducing:
{code}
-- 1. Create table orders
create table orders (ORD_NUM INT, CUST_CODE STRING);
-- 2. Create table customers
create table customers (CUST_CODE STRING);
-- 3. Run a select with constants and a correlated subquery:
select DISTINCT(CUST_CODE), '777' as ANY, ORD_NUM, '888' as CONSTANT
from orders
WHERE not exists
(select 1
from customers
WHERE CUST_CODE=orders.CUST_CODE
);
{code}
The query fails with an IndexOutOfBoundsException:
{code}
Exception in thread "main" java.lang.AssertionError: Internal error: While 
invoking method 'public 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject)
 throws org.apache.hadoop.hive.ql.parse.SemanticException'
        at org.apache.calcite.util.Util.newInternal(Util.java:792)
        at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelate(HiveRelDecorrelator.java:252)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateQuery(HiveRelDecorrelator.java:218)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1347)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1261)
        at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
        at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
        at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
        at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1069)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1085)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:364)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
        at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
        ... 32 more
Caused by: java.lang.AssertionError: Internal error: While invoking method 
'public 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate)
 throws org.apache.hadoop.hive.ql.parse.SemanticException'
        at org.apache.calcite.util.Util.newInternal(Util.java:792)
        at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:854)
        ... 37 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
        ... 39 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 2
        at java.util.ArrayList.rangeCheckForAdd(ArrayList.java:665)
        at java.util.ArrayList.add(ArrayList.java:477)
        at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:833)
        ... 44 more
{code}

HiveRelDecorrelator looks for omitted constants and puts them into a TreeMap
whose keys are the column positions in the top-level select query.
For the query from the example, the TreeMap contains:
{code}
0 = {TreeMap$Entry@8389} "1" -> "_UTF-16LE'777'"
1 = {TreeMap$Entry@8390} "3" -> "_UTF-16LE'888'"
{code}

After that, there is a step where the List of fields is combined with the
constants from the TreeMap:
{code}
if (!omittedConstants.isEmpty()) {
  final List<RexNode> postProjects = new ArrayList<>(relBuilder.fields());
  for (Map.Entry<Integer, RexLiteral> entry
      : omittedConstants.descendingMap().entrySet()) {
    postProjects.add(entry.getKey() + frame.corDefOutputs.size(),
        entry.getValue());
  }
  relBuilder.project(postProjects);
}
{code}

However, the TreeMap is iterated in descending order, so the constant columns
with the highest position numbers are inserted first, at indexes greater than
the current size of the target List.
(For the query from the example, there is an attempt to add an element to the
List at index 3, but the size of the List is only 2.)
If we iterate over the TreeMap in ascending order, everything works as
expected. Also, the choice between a descending and an ascending map does not
change the resulting column order, because elements are placed into the List
by index rather than appended sequentially.
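
The difference between the two iteration orders can be reproduced outside of Hive with plain Java collections. The following is a minimal sketch, not the actual HiveRelDecorrelator code: the class name OmittedConstantsDemo and the helper insertConstants are hypothetical, strings stand in for RexNode/RexLiteral, and the offset (frame.corDefOutputs.size() in the snippet above) is assumed to be 0, which matches the "Index: 3, Size: 2" message in the stack trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OmittedConstantsDemo {

    // Hypothetical helper: re-inserts constants at their original column
    // positions, visiting the map entries in the map's own iteration order.
    static List<String> insertConstants(List<String> fields,
                                        Map<Integer, String> constants,
                                        int offset) {
        List<String> result = new ArrayList<>(fields);
        for (Map.Entry<Integer, String> entry : constants.entrySet()) {
            // ArrayList.add(index, element) requires index <= size().
            result.add(entry.getKey() + offset, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Non-constant columns that remain after the constants were omitted.
        List<String> fields = Arrays.asList("CUST_CODE", "ORD_NUM");

        // Omitted constants keyed by their position in the select list.
        TreeMap<Integer, String> omitted = new TreeMap<>();
        omitted.put(1, "'777'");
        omitted.put(3, "'888'");

        // Ascending order: each insert position is within the current size.
        System.out.println(insertConstants(fields, omitted, 0));
        // prints [CUST_CODE, '777', ORD_NUM, '888']

        // Descending order: the first insert targets index 3 of a 2-element list.
        try {
            insertConstants(fields, omitted.descendingMap(), 0);
        } catch (IndexOutOfBoundsException ex) {
            System.out.println("descending order failed: " + ex.getMessage());
        }
    }
}
```

In ascending order every insertion index is at most the current list size, because all lower-positioned columns are already in place when a constant is inserted.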

A "Q file" test with the query from the example works fine, but the query fails on Hive 2.3.5.




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
