[
https://issues.apache.org/jira/browse/HIVE-14564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
zhihai xu updated HIVE-14564:
-----------------------------
Description:
Column Pruning generates out of order columns in SelectOperator which cause
ArrayIndexOutOfBoundsException.
{code}
2016-07-26 21:49:24,390 FATAL [main]
org.apache.hadoop.hive.ql.exec.mr.ExecMapper:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row
at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ArrayIndexOutOfBoundsException
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
at
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
at
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
... 13 more
{code}
The exception is because the serialization and deserialization doesn't match.
The serialization by LazyBinarySerDe from previous MapReduce job used different
order of columns. When the current MapReduce job deserialized the intermediate
sequence file generated by previous MapReduce job, it will get corrupted data
from the deserialization using wrong order of columns by LazyBinaryStruct. The
unmatched columns between serialization and deserialization caused by
SelectOperator's Column Pruning {{ColumnPrunerSelectProc}}.
was:
Column Pruning generates out of order columns in SelectOperator which cause
ArrayIndexOutOfBoundsException.
{code}
2016-07-26 21:49:24,390 FATAL [main]
org.apache.hadoop.hive.ql.exec.mr.ExecMapper:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row
{"_col0":null,"_col1":0,"_col2":36,"_col3":"499ec44-6dd2-4709-a019-33d6d484ed90�\u0001U5�\u001c��\t\u001b�\u0000","_col4":"5264db53-d650-4678-9261-cdd51efab8bb","_col5":"cb5233dd-214a-4b0b-b43e-0f41befb5c5c","_col6":"","_col8":48,"_col9":null,"_col10":"1befb5c5c�\u00192016-06-09T15:31:15+00:00\u0002\u0005Rider\u0011svc-dash","_col11":64,"_col12":null,"_col13":null,"_col14":"ber.com�\u0001U5ߨP�\u0001U5ᷨider)
-
1000\u0005Rider\[email protected]�\u0001U4�;x�\u0001U5\u0004��\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000","_col15":"","_col16":null}
at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ArrayIndexOutOfBoundsException
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at org.apache.hadoop.io.Text.set(Text.java:225)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
at
org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
at
org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
at
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
at
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
at
org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
... 13 more
{code}
The exception is because the serialization and deserialization doesn't match.
The serialization by LazyBinarySerDe from previous MapReduce job used different
order of columns. When the current MapReduce job deserialized the intermediate
sequence file generated by previous MapReduce job, it will get corrupted data
from the deserialization using wrong order of columns by LazyBinaryStruct. The
unmatched columns between serialization and deserialization caused by
SelectOperator's Column Pruning {{ColumnPrunerSelectProc}}.
> Column Pruning generates out of order columns in SelectOperator which cause
> ArrayIndexOutOfBoundsException.
> -----------------------------------------------------------------------------------------------------------
>
> Key: HIVE-14564
> URL: https://issues.apache.org/jira/browse/HIVE-14564
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.1.0
> Reporter: zhihai xu
> Assignee: zhihai xu
> Priority: Critical
>
> Column Pruning generates out of order columns in SelectOperator which cause
> ArrayIndexOutOfBoundsException.
> {code}
> 2016-07-26 21:49:24,390 FATAL [main]
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
> processing row
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ArrayIndexOutOfBoundsException
> at
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> at
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at org.apache.hadoop.io.Text.set(Text.java:225)
> at
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryString.init(LazyBinaryString.java:48)
> at
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.uncheckedGetField(LazyBinaryStruct.java:264)
> at
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getField(LazyBinaryStruct.java:201)
> at
> org.apache.hadoop.hive.serde2.lazybinary.objectinspector.LazyBinaryStructObjectInspector.getStructFieldData(LazyBinaryStructObjectInspector.java:64)
> at
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator._evaluate(ExprNodeColumnEvaluator.java:94)
> at
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
> at
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
> at
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.makeValueWritable(ReduceSinkOperator.java:550)
> at
> org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:377)
> ... 13 more
> {code}
> The exception is because the serialization and deserialization doesn't match.
> The serialization by LazyBinarySerDe from previous MapReduce job used
> different order of columns. When the current MapReduce job deserialized the
> intermediate sequence file generated by previous MapReduce job, it will get
> corrupted data from the deserialization using wrong order of columns by
> LazyBinaryStruct. The unmatched columns between serialization and
> deserialization caused by SelectOperator's Column Pruning
> {{ColumnPrunerSelectProc}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)