Nicholas Brenwald created HIVE-11603:
----------------------------------------

             Summary: IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
                 Key: HIVE-11603
                 URL: https://issues.apache.org/jira/browse/HIVE-11603
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.3.0
         Environment: Hadoop 2.6
            Reporter: Nicholas Brenwald
            Priority: Minor
             Fix For: 2.0.0


Create two empty tables, t1 and t2:
{code}
CREATE TABLE t1(c1 STRING);
CREATE TABLE t2(c1 STRING, c2 INT);
{code}

Create a view over these two tables:
{code}
CREATE VIEW v1 AS
SELECT c1, c2
FROM (
    SELECT c1, CAST(NULL AS INT) AS c2 FROM t1
    UNION ALL
    SELECT c1, c2 FROM t2
) x;
{code}

Then run:
{code}
SELECT COUNT(*) FROM v1
WHERE c2 = 0;
{code}

We expect a result of zero, but instead the query fails with the following stack trace:
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:119)
    ... 22 more
{code}

Workarounds include disabling predicate pushdown (ppd):
{code}
set hive.optimize.ppd=false;
{code}

Or changing the view so that the NULL placeholder for column c2 is cast to DOUBLE instead of INT:
{code}
CREATE VIEW v1_workaround AS
SELECT c1, c2
FROM (
    SELECT c1, CAST(NULL AS DOUBLE) AS c2 FROM t1
    UNION ALL
    SELECT c1, c2 FROM t2
) x;
{code}

The problem appears to occur in branch-1.1, branch-1.2, and branch-1, but seems to be resolved in master (2.0.0).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)