Per Ullberg created HIVE-10867:
----------------------------------

Summary: ArrayIndexOutOfBoundsException LazyBinaryUtils.byteArrayToLong with Hive on Tez
Key: HIVE-10867
URL: https://issues.apache.org/jira/browse/HIVE-10867
Project: Hive
Issue Type: Bug
Components: Hive, Tez
Affects Versions: 0.14.0
Environment: Hortonworks distribution 2.2.4-2
             Hive 0.14.0
             Tez 0.5.2.2.2.4.2-2 on cluster
             Tez 0.7.0 in local setup
Reporter: Per Ullberg

Hi,

The following query runs fine on the map reduce engine, but when hive.execution.engine is set to tez it produces an ArrayIndexOutOfBoundsException.

Query
{code}
create external table table_1 (id string, date string, amount bigint);
insert into table table_1 values (305,'2013-03-02',3790);

create external table table_2 (id string);
insert into table table_2 VALUES (305);

create external table table_3 (id string, date_3 string, amount_3 bigint);
insert into table table_3 values (305,'2013-03-01',-1600);

create external table table_4 (id bigint, str_4 string, amount_4 bigint);

create table table_5 as
SELECT c.diff
FROM (
  SELECT id AS id, date AS create_date, -amount AS diff
  FROM table_1
  UNION ALL
  SELECT p.id AS id, p.str_4 AS create_date, -p.amount_4 AS diff
  FROM table_4 p
  UNION ALL
  SELECT id, create_date, diff
  FROM (
    SELECT i.id AS id, tp.date_3 AS create_date, cast(amount_3 as double) AS diff
    FROM table_3 tp
    INNER JOIN table_2 i ON cast(tp.id as string) = cast(i.id as string)
  ) fees
) c
INNER JOIN table_2 i ON cast(c.id as string) = cast(i.id as string);
{code}

Results with map reduce engine:
{code}
hive> select * from table_5;
OK
-1600.0
-3790.0
Time taken: 0.061 seconds, Fetched: 2 row(s)
{code}

Exception with tez engine:
{code}
Status: Failed
Vertex failed, vertexName=Reducer 4, vertexId=vertex_1432809678493_0891_4_06, diagnostics=[Task failed, taskId=task_1432809678493_0891_4_06_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"305"},"value":{"_col1":-1600.0}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:337)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:218)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
    ... 13 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 6
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.byteArrayToLong(LazyBinaryUtils.java:84)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryDouble.init(LazyBinaryDouble.java:43)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
    at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
    at org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:98)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
    at org.apache.hadoop.hive.ql.exec.JoinUtil.computeValues(JoinUtil.java:193)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilteredValue(CommonJoinOperator.java:408)
    at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processOp(CommonMergeJoinOperator.java:162)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:328)
    ... 16 more
{code}

Secondly, selecting an additional column into table_5 gets rid of the exception, but the result set is then corrupted when using the tez engine. This is even scarier!

Query
{code}
create table table_5 as
SELECT c.create_date, c.diff
FROM (
  SELECT id AS id, date AS create_date, -amount AS diff
  FROM table_1
  UNION ALL
  SELECT p.id AS id, p.str_4 AS create_date, -p.amount_4 AS diff
  FROM table_4 p
  UNION ALL
  SELECT id, create_date, diff
  FROM (
    SELECT i.id AS id, tp.date_3 AS create_date, cast(amount_3 as double) AS diff
    FROM table_3 tp
    INNER JOIN table_2 i ON cast(tp.id as string) = cast(i.id as string)
  ) fees
) c
INNER JOIN table_2 i ON cast(c.id as string) = cast(i.id as string);
{code}

Result:
{code}
hive> select * from with_mr.table_5;
OK
2013-03-02	-3790.0
2013-03-01	-1600.0
Time taken: 8.107 seconds, Fetched: 2 row(s)

hive> select * from with_tez.table_5;
OK
2013-03-01	-1600.0
2013-03-02	-1.6968199793927886E-279
Time taken: 0.047 seconds, Fetched: 2 row(s)
{code}

This ticket could possibly be related to https://issues.apache.org/jira/browse/HIVE-9517?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
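
Both symptoms reported above (an ArrayIndexOutOfBoundsException in one query, a garbage double in the other) are what one would expect if a fixed-width 8-byte double is decoded from bytes that were not written as a double. The Java sketch below only illustrates that general failure mode. It is not Hive's code: `readLongBigEndian`, `readDouble` and `writeDouble` are hypothetical helpers, and the idea that the Tez reducer receives a short or misaligned buffer is an assumption that the stack trace alone does not confirm.

```java
import java.util.Arrays;

// Illustration only: hypothetical helpers, not Hive's LazyBinaryUtils.
final class FixedWidthReadSketch {

    // Reads 8 bytes starting at offset as a big-endian long.
    // There is no bounds check, so a short buffer fails on the first missing byte.
    static long readLongBigEndian(byte[] buf, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFFL);
        }
        return value;
    }

    // Interprets the 8 bytes at offset as an IEEE 754 double.
    static double readDouble(byte[] buf, int offset) {
        return Double.longBitsToDouble(readLongBigEndian(buf, offset));
    }

    // Encodes a double as 8 big-endian bytes.
    static byte[] writeDouble(double d) {
        long bits = Double.doubleToLongBits(d);
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (bits >>> (56 - 8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] encoded = writeDouble(-1600.0);

        // Case 1: bytes and reader agree, so the value round-trips.
        System.out.println("aligned read: " + readDouble(encoded, 0));

        // Case 2: fewer than 8 bytes are available, so the read throws.
        byte[] tooShort = Arrays.copyOf(encoded, 6);
        try {
            readDouble(tooShort, 0);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("short buffer: " + e);
        }

        // Case 3: enough bytes, but the read starts one byte early.
        // No exception is thrown; the result is simply a wrong number.
        byte[] shifted = new byte[9];
        shifted[0] = 0x01;
        System.arraycopy(encoded, 0, shifted, 1, 8);
        System.out.println("misaligned read: " + readDouble(shifted, 0));
    }
}
```

If this picture applies, it would be worth checking whether the three UNION ALL branches agree on the type of `diff` (the first two produce a negated bigint, the third a double) in the plan that Tez generates.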