[ 
https://issues.apache.org/jira/browse/HIVE-17085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094937#comment-16094937
 ] 

Prasanth Jayachandran commented on HIVE-17085:
----------------------------------------------

[~gopalv] Can you please review this patch?

> ORC file merge/concatenation should do full schema check
> --------------------------------------------------------
>
>                 Key: HIVE-17085
>                 URL: https://issues.apache.org/jira/browse/HIVE-17085
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.2.0, 2.3.0, 3.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-17085.1.patch, HIVE-17085.2.patch
>
>
> The ORC merge/concatenation compatibility check only verifies that the 
> column count matches at the outer level. ORC schema evolution now supports 
> inner structs as well, so the outer-level column count can match while the 
> inner column layout does not. The compatibility check should do a full 
> schema match before merging/concatenating. This issue will not cause data 
> loss, but it will cause task failures with an exception like the one below:
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to close OrcFileMergeOperator
>       at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:247)
>       at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.processKeyValuePairs(OrcFileMergeOperator.java:172)
>       at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.process(OrcFileMergeOperator.java:72)
>       at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.processRow(MergeFileRecordProcessor.java:212)
>       ... 16 more
> Caused by: java.lang.IllegalArgumentException: Column has wrong number of index entries found: 0 expected: 1
>       at org.apache.orc.impl.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:695)
>       at org.apache.orc.impl.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:2147)
>       at org.apache.orc.impl.WriterImpl.flushStripe(WriterImpl.java:2661)
>       at org.apache.orc.impl.WriterImpl.close(WriterImpl.java:2834)
>       at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:321)
>       at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:243)
>       ... 19 more
> {code}
> Concatenation should also make sure the writer version matches (it currently 
> checks only that the file version matches).
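
For illustration only, the kind of check the description asks for might look roughly like the sketch below. It is not taken from the attached patches. `SchemaNode` and `MergeCompatibility` are hypothetical stand-ins so the example is self-contained; the real code would work on ORC's own schema and version types. The sketch assumes that a "full schema match" means the same type category, field names, and children at every nesting level, and that the file version and writer version are both compared as plain strings.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for one node of an ORC type tree (not an ORC class). */
final class SchemaNode {
    final String category;            // e.g. "struct", "int", "string"
    final List<String> fieldNames;    // non-empty only for structs
    final List<SchemaNode> children;  // empty for primitive types

    SchemaNode(String category, List<String> fieldNames, List<SchemaNode> children) {
        this.category = category;
        this.fieldNames = fieldNames;
        this.children = children;
    }

    static SchemaNode primitive(String category) {
        return new SchemaNode(category,
            Collections.<String>emptyList(), Collections.<SchemaNode>emptyList());
    }

    static SchemaNode struct(List<String> names, List<SchemaNode> children) {
        return new SchemaNode("struct", names, children);
    }
}

/** Hypothetical helper: decides whether two files may be concatenated. */
final class MergeCompatibility {
    /** Recursive comparison: every nesting level must match, not just the outer one. */
    static boolean sameSchema(SchemaNode a, SchemaNode b) {
        if (!a.category.equals(b.category)) return false;
        if (!a.fieldNames.equals(b.fieldNames)) return false;
        if (a.children.size() != b.children.size()) return false;
        for (int i = 0; i < a.children.size(); i++) {
            if (!sameSchema(a.children.get(i), b.children.get(i))) return false;
        }
        return true;
    }

    /** Requires matching file version, matching writer version, and a full schema match. */
    static boolean canMerge(SchemaNode firstSchema, String firstFileVersion, String firstWriterVersion,
                            SchemaNode nextSchema, String nextFileVersion, String nextWriterVersion) {
        return firstFileVersion.equals(nextFileVersion)
            && firstWriterVersion.equals(nextWriterVersion)
            && sameSchema(firstSchema, nextSchema);
    }
}
```

With a check of this shape, two files whose outer structs both have two columns but whose inner structs differ would be rejected up front instead of failing later when the writer is closed.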



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
