Artem Velykorodnyi created HIVE-22031:
-----------------------------------------
Summary: HiveRelDecorrelator fails with IndexOutOfBoundsException if the query contains several "constant" columns
Key: HIVE-22031
URL: https://issues.apache.org/jira/browse/HIVE-22031
Project: Hive
Issue Type: Bug
Components: CBO
Affects Versions: 2.3.5
Reporter: Artem Velykorodnyi
Assignee: Artem Velykorodnyi

Steps for reproducing:
{code}
1. Create table orders
create table orders (ORD_NUM INT, CUST_CODE STRING);
2. Create table customers
create table customers (CUST_CODE STRING);
3. Run a select with constants and with a subquery:
select DISTINCT(CUST_CODE), '777' as ANY, ORD_NUM, '888' as CONSTANT from orders WHERE not exists (select 1 from customers WHERE CUST_CODE=orders.CUST_CODE );
{code}

Query fails with IndexOutOfBoundsException
{code}
Exception in thread "main" java.lang.AssertionError: Internal error: While invoking method 'public org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject) throws org.apache.hadoop.hive.ql.parse.SemanticException'
    at org.apache.calcite.util.Util.newInternal(Util.java:792)
    at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelate(HiveRelDecorrelator.java:252)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateQuery(HiveRelDecorrelator.java:218)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1347)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1261)
    at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
    at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:997)
    at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
    at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1069)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1085)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:364)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
    ... 32 more
Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator$Frame org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveAggregate) throws org.apache.hadoop.hive.ql.parse.SemanticException'
    at org.apache.calcite.util.Util.newInternal(Util.java:792)
    at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:534)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.getInvoke(HiveRelDecorrelator.java:660)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:854)
    ... 37 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
    ... 39 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 2
    at java.util.ArrayList.rangeCheckForAdd(ArrayList.java:665)
    at java.util.ArrayList.add(ArrayList.java:477)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(HiveRelDecorrelator.java:833)
    ... 44 more
{code}

HiveRelDecorrelator looks for omitted constants and puts them into a TreeMap whose keys are the column positions in the top-level select query.
For the query from the example, the TreeMap contains:
{code}
0 = {TreeMap$Entry@8389} "1" -> "_UTF-16LE'777'"
1 = {TreeMap$Entry@8390} "3" -> "_UTF-16LE'888'"
{code}

After that, there is a step where the List of fields is combined with the constants from the TreeMap:
{code}
if (!omittedConstants.isEmpty()) {
  final List<RexNode> postProjects = new ArrayList<>(relBuilder.fields());
  for (Map.Entry<Integer, RexLiteral> entry
      : omittedConstants.descendingMap().entrySet()) {
    postProjects.add(entry.getKey() + frame.corDefOutputs.size(),
        entry.getValue());
  }
  relBuilder.project(postProjects);
}
{code}

But the TreeMap is iterated in descending order, so the constant columns with the highest position numbers come first, and those positions can be greater than the current size of the target List. (For the query from the example, there is an attempt to add an element to the List at index 3, but the size of the List is only 2.)

If we iterate over the TreeMap without descending, everything works as expected. Switching from the descending to the ascending map also makes no difference to the resulting column order, because the List is filled by explicit index rather than by sequential position.

A "Q file" with the query from the example works fine, but fails on Hive 2.3.5.
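The index arithmetic described above can be reproduced without Hive or Calcite. The following is a minimal sketch, not the actual patch: the class `OmittedConstantsDemo` and the method `insertConstants` are hypothetical names, plain strings stand in for `RexNode`/`RexLiteral`, and the `offset` parameter stands in for `frame.corDefOutputs.size()`, assumed here to be 0 (which matches the "Index: 3, Size: 2" in the trace).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone model of the post-projection step.
class OmittedConstantsDemo {

    // Inserts each constant into a copy of `fields` at index (key + offset).
    // `descending` selects the iteration order used by the failing code path.
    static List<String> insertConstants(List<String> fields,
                                        TreeMap<Integer, String> omittedConstants,
                                        int offset,
                                        boolean descending) {
        List<String> postProjects = new ArrayList<>(fields);
        Map<Integer, String> ordered =
            descending ? omittedConstants.descendingMap() : omittedConstants;
        for (Map.Entry<Integer, String> entry : ordered.entrySet()) {
            // ArrayList.add(index, e) throws if index > current size.
            postProjects.add(entry.getKey() + offset, entry.getValue());
        }
        return postProjects;
    }

    public static void main(String[] args) {
        // Two non-constant fields remain after the constants were omitted.
        List<String> fields = Arrays.asList("CUST_CODE", "ORD_NUM");
        TreeMap<Integer, String> constants = new TreeMap<>();
        constants.put(1, "'777'");
        constants.put(3, "'888'");

        // Ascending order: index 1 is filled first, so index 3 is valid later.
        // Prints [CUST_CODE, '777', ORD_NUM, '888']
        System.out.println(insertConstants(fields, constants, 0, false));

        // Descending order: index 3 is attempted while the list has 2 elements.
        try {
            insertConstants(fields, constants, 0, true);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("descending order failed: " + e.getMessage());
        }
    }
}
```

With ascending iteration each insert grows the list before the next, larger index is used, which is why the constants land at the positions recorded in the TreeMap keys.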