[jira] [Comment Edited] (ORC-539) Exception in double to timestamp schema evolution
[ https://issues.apache.org/jira/browse/ORC-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928308#comment-16928308 ]

Laszlo Bodor edited comment on ORC-539 at 9/12/19 10:57 AM:

It seems that the internal branch needs some changes, after which this exception is gone. However, I found another issue, which concerns floats, and it reproduces on apache/master as well after applying https://issues.apache.org/jira/secure/attachment/12980169/ORC-539.repro.patch

{code}
org.junit.ComparisonFailure: row 0 expected:<1960-01-27 12:3[4:56.1] Australia/Sydney> but was:<1960-01-27 12:3[5:12.0] Australia/Sydney>
    at org.junit.Assert.assertEquals(Assert.java:115)
    at org.apache.orc.impl.TestSchemaEvolution.testEvolutionToTimestamp(TestSchemaEvolution.java:2224)
    ...
{code}

I filed ORC-554 for this, as it is a separate issue.

> Exception in double to timestamp schema evolution
> -------------------------------------------------
>
> Key: ORC-539
> URL: https://issues.apache.org/jira/browse/ORC-539
> Project: ORC
> Issue Type: Bug
> Affects Versions: 1.6.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Laszlo Bodor
> Priority: Major
> Attachments: ORC-539.repro.patch
>
>
> I backported ORC-189 to my own branch and ran tests in Hive. I am getting the
> following exception in a test related to schema evolution from double to
> timestamp after applying ORC-189:
> {noformat}
> Caused by: java.io.IOException: Error reading file: file:/Users/jcamachorodriguez/src/workspaces/hive/itests/qtest/target/localfs/warehouse/part_change_various_various_timestamp_n6/part=1/00_0
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1289)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:87)
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:103)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:252)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:227)
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
>     ... 23 more
> Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 7 kind DATA position: 15 length: 15 range: 0 offset: 122 limit: 122 range 0 = 0 to 15 uncompressed: 12 to 12
>     at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:125)
>     at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:108)
>     at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:104)
>     at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:783)
>     at org.apache.orc.impl.ConvertTreeReaderFactory$TimestampFromDoubleTreeReader.nextVector(ConvertTreeReaderFactory.java:1883)
>     at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:2012)
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1282)
>     ... 28 more
> {noformat}

--
This message was sent by Atlassian Jira (v8.3.2#803003)
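The ComparisonFailure quoted in the comment above is off by roughly 16 seconds (12:34:56.1 expected, 12:35:12.0 actual). The thread does not state the cause, and ORC-554 tracks it, but one plausible reading is plain 32-bit float precision: at a magnitude of about 3.1e8 seconds since the epoch, adjacent float values are 32 seconds apart. The sketch below is only an illustration of that arithmetic using java.time; the class and method names are hypothetical and nothing in it is ORC code.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical demo class, not part of ORC.
class FloatTimestampPrecisionDemo {
  static final ZoneId SYDNEY = ZoneId.of("Australia/Sydney");

  // Seconds since the epoch, including the fractional part, as a double.
  static double toEpochSeconds(ZonedDateTime t) {
    return t.toEpochSecond() + t.getNano() / 1e9;
  }

  // Store the value in a 32-bit float (as a FLOAT column would), then
  // convert the stored value back into a timestamp.
  static ZonedDateTime throughFloat(ZonedDateTime t) {
    float stored = (float) toEpochSeconds(t);
    double seconds = stored;
    long whole = (long) Math.floor(seconds);
    long nanos = Math.round((seconds - whole) * 1e9);
    return Instant.ofEpochSecond(whole, nanos).atZone(SYDNEY);
  }

  public static void main(String[] args) {
    // The "expected" value from the failing assertion.
    ZonedDateTime expected =
        ZonedDateTime.of(1960, 1, 27, 12, 34, 56, 100_000_000, SYDNEY);
    System.out.println(Math.ulp((float) toEpochSeconds(expected))); // 32.0
    System.out.println(throughFloat(expected).toLocalTime());       // 12:35:12
  }
}
```

If this reading is right, the original value cannot be recovered from the float at all, so a test for float evolution would presumably need either values that are exactly representable in a float or a tolerance on the comparison.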
[jira] [Comment Edited] (ORC-539) Exception in double to timestamp schema evolution
[ https://issues.apache.org/jira/browse/ORC-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927546#comment-16927546 ]

Laszlo Bodor edited comment on ORC-539 at 9/11/19 3:00 PM:

A small repro without partitions and with a single column:

{code}
CREATE TABLE schema_evolution_data_n41(
  insert_num int, boolean1 boolean, tinyint1 tinyint, smallint1 smallint,
  int1 int, bigint1 bigint, decimal1 decimal(38,18), float1 float,
  double1 double, string1 string, string2 string, date1 date,
  timestamp1 timestamp, boolean_str string, tinyint_str string,
  smallint_str string, int_str string, bigint_str string, decimal_str string,
  float_str string, double_str string, date_str string, timestamp_str string,
  filler string)
row format delimited fields terminated by '|'
stored as textfile;

load data local inpath '../../data/files/schema_evolution/schema_evolution_data.txt'
overwrite into table schema_evolution_data_n41;

CREATE TABLE part_change_various_various_timestamp_n6(c6 FLOAT);

insert into table part_change_various_various_timestamp_n6
SELECT float1 FROM schema_evolution_data_n41;

alter table part_change_various_various_timestamp_n6 replace columns (c6 TIMESTAMP);

select c6 from part_change_various_various_timestamp_n6;
{code}

The problem is that the internal branch does not contain ORC-531, which is responsible for handling float / double types in the convert tree reader:
https://github.com/apache/orc/blame/master/java/core/src/java/org/apache/orc/impl/ConvertTreeReaderFactory.java#L1397-L1399
so it probably tries to read the float as if it were a double, hence the error.

With this check the issue disappears (however, I got a result mismatch and am still checking). I still think TestSchemaEvolution#testEvolutionToTimestamp needs to be improved to test float evolution, because it can reproduce the error even without an "external" Hive test:
https://github.com/apache/orc/commit/a7255f3669146e7697215e75720c74ca831b374c#diff-a6311862d24b863a3d394b89ed9d0495R2158-R2159
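The comment above suspects that a FLOAT column is being decoded as if it held doubles, and the reported stack trace does show `SerializationUtils.readDouble` being reached from `TimestampFromDoubleTreeReader`. The following is a minimal standalone sketch of why that ends in an EOFException: floats occupy 4 bytes each and doubles 8, so a reader that consumes 8 bytes per row exhausts the stream after half the rows. The class, the helper names, and the three sample values are assumptions made for illustration, and no ORC classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of ORC.
class FloatReadAsDoubleDemo {
  // Encode each value as a 4-byte little-endian IEEE 754 float.
  static byte[] encodeFloats(float... values) {
    ByteBuffer buf =
        ByteBuffer.allocate(4 * values.length).order(ByteOrder.LITTLE_ENDIAN);
    for (float v : values) {
      buf.putFloat(v);
    }
    return buf.array();
  }

  // Try to read `rows` 8-byte values from the stream and report how many
  // complete reads succeed before the data runs out.
  static int doublesReadBeforeEof(byte[] data, int rows) {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    byte[] scratch = new byte[8];
    int read = 0;
    try {
      for (int i = 0; i < rows; i++) {
        in.readFully(scratch); // throws EOFException if fewer than 8 bytes remain
        read++;
      }
    } catch (EOFException e) {
      return read; // ran past the end, as in the reported trace
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return read;
  }

  public static void main(String[] args) {
    byte[] data = encodeFloats(1.5f, 2.5f, 3.5f); // 3 rows, 12 bytes
    // Only one 8-byte read fits; the second one hits the end of the stream.
    System.out.println(doublesReadBeforeEof(data, 3)); // 1
  }
}
```

Even the reads that do succeed would combine two adjacent floats into one meaningless double, so choosing the reader by the file's actual type matters for correctness as well as for avoiding the exception.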
[jira] [Comment Edited] (ORC-539) Exception in double to timestamp schema evolution
[ https://issues.apache.org/jira/browse/ORC-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927546#comment-16927546 ]

Laszlo Bodor edited comment on ORC-539 at 9/11/19 1:15 PM:

It fails for float and double too. A simple repro that can be used with a double or float source (1 column, no partitions):

{code}
CREATE TABLE schema_evolution_data_n41(
  insert_num int, boolean1 boolean, tinyint1 tinyint, smallint1 smallint,
  int1 int, bigint1 bigint, decimal1 decimal(38,18), float1 float,
  double1 double, string1 string, string2 string, date1 date,
  timestamp1 timestamp, boolean_str string, tinyint_str string,
  smallint_str string, int_str string, bigint_str string, decimal_str string,
  float_str string, double_str string, date_str string, timestamp_str string,
  filler string)
row format delimited fields terminated by '|'
stored as textfile;

load data local inpath '../../data/files/schema_evolution/schema_evolution_data.txt'
overwrite into table schema_evolution_data_n41;

CREATE TABLE part_change_various_various_timestamp_n6(c6 FLOAT);

insert into table part_change_various_various_timestamp_n6
SELECT float1 FROM schema_evolution_data_n41;

alter table part_change_various_various_timestamp_n6 replace columns (c6 TIMESTAMP);

select c6 from part_change_various_various_timestamp_n6;
{code}