[ https://issues.apache.org/jira/browse/HIVE-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhichun Wu updated HIVE-7847:
-----------------------------

    Attachment: HIVE-7847.1.patch

> Querying an ORC partitioned table fails when a table column type changes
> ------------------------------------------------------------------------
>
>                 Key: HIVE-7847
>                 URL: https://issues.apache.org/jira/browse/HIVE-7847
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.11.0, 0.12.0, 0.13.0
>            Reporter: Zhichun Wu
>            Assignee: Zhichun Wu
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7847.1.patch
>
>
> I used the following script to test an ORC column type change with a
> partitioned table on branch-0.13:
> {code}
> use test;
> DROP TABLE if exists orc_change_type_staging;
> DROP TABLE if exists orc_change_type;
> CREATE TABLE orc_change_type_staging (
>     id int
> );
> CREATE TABLE orc_change_type (
>     id int
> ) PARTITIONED BY (`dt` string)
> stored as orc;
> -- load staging table
> LOAD DATA LOCAL INPATH '../hive/examples/files/int.txt' OVERWRITE INTO TABLE orc_change_type_staging;
> -- populate orc hive table
> INSERT OVERWRITE TABLE orc_change_type PARTITION (dt='20140718') SELECT * FROM orc_change_type_staging LIMIT 1;
> -- change column id from int to bigint
> ALTER TABLE orc_change_type CHANGE id id bigint;
> INSERT OVERWRITE TABLE orc_change_type PARTITION (dt='20140719') SELECT * FROM orc_change_type_staging LIMIT 1;
> SELECT id FROM orc_change_type WHERE dt BETWEEN '20140718' AND '20140719';
> {code}
> It fails in the last query, "SELECT id FROM orc_change_type WHERE dt BETWEEN
> '20140718' AND '20140719';", with the exception:
> {code}
> Error: java.io.IOException: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:197)
>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:183)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
>         at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>         at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
>         ... 11 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.LongWritable
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:717)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1788)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2997)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:153)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:127)
>         at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
>         ... 15 more
> {code}
> The value object is reused each time we deserialize a row, so the read fails
> when we start processing the next path with a different schema. Resetting the
> value object each time we finish reading one path solves the problem, as
> sketched below.
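>
> For illustration, here is a minimal Java sketch of that reuse pattern and the
> proposed reset. It is not the attached patch: PathReader, MultiPathValueReuse,
> nextRow and onPathFinished are hypothetical names introduced here; only the
> org.apache.hadoop.io types are real.
> {code}
> import java.io.IOException;
> import org.apache.hadoop.io.Writable;
>
> // Minimal sketch (not the attached patch) of the value-reuse problem.
> public class MultiPathValueReuse {
>
>     // Per-path reader: each path (partition file) may have been written
>     // with a different schema, so createValue() can return an IntWritable
>     // for one path and a LongWritable for the next.
>     interface PathReader {
>         Writable createValue();
>         boolean next(Writable value) throws IOException; // fills caller's object
>     }
>
>     private Writable reusedValue; // created once, then reused for every row
>
>     boolean nextRow(PathReader reader) throws IOException {
>         if (reusedValue == null) {
>             reusedValue = reader.createValue();
>         }
>         // Bug: after switching paths, reusedValue may still be the IntWritable
>         // created for the old int schema; the new path's LongTreeReader casts
>         // it to LongWritable and throws the ClassCastException above.
>         return reader.next(reusedValue);
>     }
>
>     // Fix: reset the cached value when one path is exhausted, so the next
>     // path recreates it against its own schema.
>     void onPathFinished() {
>         reusedValue = null;
>     }
> }
> {code}
> The attached HIVE-7847.1.patch presumably applies this reset inside Hive's
> record-reader path handling rather than in a standalone class like this one.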



--
This message was sent by Atlassian JIRA
(v6.2#6252)
