[
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Prasanth Jayachandran updated HIVE-17085:
-----------------------------------------
Resolution: Fixed
Fix Version/s: 2.4.0, 3.0.0
Status: Resolved (was: Patch Available)
Test failures are unrelated to this patch.
Committed to branch-2 and master. Thanks Zoltan for the review!
> ORC file merge/concatenation should do full schema check
> --------------------------------------------------------
>
> Key: HIVE-17085
> URL: https://issues.apache.org/jira/browse/HIVE-17085
> Project: Hive
> Issue Type: Bug
> Components: ORC
> Affects Versions: 2.2.0, 2.3.0, 3.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17085.1.patch, HIVE-17085.2.patch
>
>
> The ORC merge/concatenation compatibility check only verifies that the
> column count matches at the outer level. ORC schema evolution now supports
> inner structs as well, so the outer-level column count can match while the
> inner-level columns do not. The compatibility check should do a full schema
> match before merging/concatenation. This issue does not cause data loss, but
> it does cause task failures with an exception like the one below:
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
>     at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>     at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>     at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>     at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>     ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
>     at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>     at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>     at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>     at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>     at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>     at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>     ... 19 more
> {code}
> Concatenation should also verify that the writer version matches (it
> currently checks only that the file version matches).
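
For illustration only, the difference between the outer-level column count check and a full schema check can be sketched as below. This is not the code from the attached patches: SchemaNode, FileInfo, and every method name here are hypothetical stand-ins rather than Hive or ORC classes, and the version values are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only: none of these types are Hive or ORC classes.
final class MergeCompatibilitySketch {

    /** Hypothetical stand-in for one node of an ORC type tree. */
    static final class SchemaNode {
        final String category;           // e.g. "struct", "int", "string"
        final List<String> fieldNames;   // non-empty only for structs
        final List<SchemaNode> children; // non-empty only for compound types

        private SchemaNode(String category, List<String> fieldNames, List<SchemaNode> children) {
            this.category = category;
            this.fieldNames = fieldNames;
            this.children = children;
        }

        static SchemaNode primitive(String category) {
            return new SchemaNode(category,
                    Collections.<String>emptyList(),
                    Collections.<SchemaNode>emptyList());
        }

        static SchemaNode struct(List<String> fieldNames, List<SchemaNode> children) {
            return new SchemaNode("struct", fieldNames, children);
        }
    }

    /** Hypothetical per-file metadata; versions are plain strings in this sketch. */
    static final class FileInfo {
        final SchemaNode schema;
        final String fileVersion;
        final String writerVersion;

        FileInfo(SchemaNode schema, String fileVersion, String writerVersion) {
            this.schema = schema;
            this.fileVersion = fileVersion;
            this.writerVersion = writerVersion;
        }
    }

    /** The shallow check described in the issue: only the top-level column count. */
    static boolean outerColumnCountMatches(SchemaNode a, SchemaNode b) {
        return a.children.size() == b.children.size();
    }

    /** A full check: category, field names, and every child must match recursively. */
    static boolean fullSchemaMatches(SchemaNode a, SchemaNode b) {
        if (!a.category.equals(b.category)) {
            return false;
        }
        if (!a.fieldNames.equals(b.fieldNames)) {
            return false;
        }
        if (a.children.size() != b.children.size()) {
            return false;
        }
        for (int i = 0; i < a.children.size(); i++) {
            if (!fullSchemaMatches(a.children.get(i), b.children.get(i))) {
                return false;
            }
        }
        return true;
    }

    /** Two files are merge candidates only if file version, writer version, and schema all match. */
    static boolean canMerge(FileInfo a, FileInfo b) {
        return a.fileVersion.equals(b.fileVersion)
                && a.writerVersion.equals(b.writerVersion)
                && fullSchemaMatches(a.schema, b.schema);
    }
}
```

With two files whose top-level schema is struct<id:int,s:struct<...>>, where only the inner struct gained a field, outerColumnCountMatches returns true while fullSchemaMatches returns false, which is the situation the description says ends in a task failure instead of the files being skipped.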