[
https://issues.apache.org/jira/browse/HIVE-16480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jesus Camacho Rodriguez updated HIVE-16480:
-------------------------------------------
Resolution: Fixed
Fix Version/s: 2.2.1
2.1.2
Status: Resolved (was: Patch Available)
bq. This patch applies to branch-2.1 and branch-2.2. In branch-2.3 and above
Hive uses the ORC project artifacts, so we'll need to release from ORC. Once
the patch goes in, we should start that process.
The patch has been pushed to branch-2.1 and branch-2.2. A new issue can be
created for consuming the new ORC release from branch-2.3 and fixing this
issue in 2.3.x. Closing this issue. Thanks [~owen.omalley]
> ORC file with empty array<double> and array<float> fails to read
> ----------------------------------------------------------------
>
> Key: HIVE-16480
> URL: https://issues.apache.org/jira/browse/HIVE-16480
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.1.1, 2.2.0
> Reporter: David Capwell
> Assignee: Owen O'Malley
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.1.2, 2.2.1
>
>
> We have a schema that contains an array<double>. We were unable to read this
> file, and digging into ORC it seems the issue occurs when the array is empty.
> Here is the stack trace:
> {code:title=EmptyList.log|borderStyle=solid}
> ERROR 2017-04-19 09:29:17,075 [main] [EmptyList] [line 56] Failed to work with type float
> java.io.IOException: Error reading file: /var/folders/t8/t5x1031d7mn17f6xpwnkkv_40000gn/T/1492619355819-0/file-float.orc
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1052) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:135) ~[hive-exec-2.1.1.jar:2.1.1]
>     at EmptyList.emptyList(EmptyList.java:49) ~[test-classes/:na]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) [junit-4.12.jar:4.12]
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.12.jar:4.12]
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) [junit-4.12.jar:4.12]
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) [junit-4.12.jar:4.12]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) [junit-4.12.jar:4.12]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363) [junit-4.12.jar:4.12]
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) [junit-rt.jar:na]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) [idea_rt.jar:na]
> Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 1 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0
>     at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:118) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.SerializationUtils.readFloat(SerializationUtils.java:78) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$FloatTreeReader.nextVector(TreeReaderFactory.java:619) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$TreeReader.nextBatch(TreeReaderFactory.java:154) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045) ~[hive-orc-2.1.1.jar:2.1.1]
>     ... 29 common frames omitted
> INFO 2017-04-19 09:29:17,091 [main] [WriterImpl] [line 205] ORC writer created for path: /var/folders/t8/t5x1031d7mn17f6xpwnkkv_40000gn/T/1492619355819-0/file-double.orc with stripeSize: 67108864 blockSize: 268435456 compression: ZLIB bufferSize: 262144
> INFO 2017-04-19 09:29:17,100 [main] [ReaderImpl] [line 357] Reading ORC rows from /var/folders/t8/t5x1031d7mn17f6xpwnkkv_40000gn/T/1492619355819-0/file-double.orc with {include: null, offset: 0, length: 9223372036854775807}
> INFO 2017-04-19 09:29:17,101 [main] [RecordReaderImpl] [line 142] Schema on read not provided -- using file schema array<double>
> ERROR 2017-04-19 09:29:17,104 [main] [EmptyList] [line 56] Failed to work with type double
> java.io.IOException: Error reading file: /var/folders/t8/t5x1031d7mn17f6xpwnkkv_40000gn/T/1492619355819-0/file-double.orc
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1052) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.nextBatch(RecordReaderImpl.java:135) ~[hive-exec-2.1.1.jar:2.1.1]
>     at EmptyList.emptyList(EmptyList.java:49) ~[test-classes/:na]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) [junit-4.12.jar:4.12]
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.12.jar:4.12]
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) [junit-4.12.jar:4.12]
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) [junit-4.12.jar:4.12]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) [junit-4.12.jar:4.12]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) [junit-4.12.jar:4.12]
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363) [junit-4.12.jar:4.12]
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137) [junit-4.12.jar:4.12]
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) [junit-rt.jar:na]
>     at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) [junit-rt.jar:na]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) [idea_rt.jar:na]
> Caused by: java.io.EOFException: Read past EOF for compressed stream Stream for column 1 kind DATA position: 0 length: 0 range: 0 offset: 0 limit: 0
>     at org.apache.orc.impl.SerializationUtils.readFully(SerializationUtils.java:118) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.SerializationUtils.readLongLE(SerializationUtils.java:101) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.SerializationUtils.readDouble(SerializationUtils.java:97) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$DoubleTreeReader.nextVector(TreeReaderFactory.java:713) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.TreeReaderFactory$TreeReader.nextBatch(TreeReaderFactory.java:154) ~[hive-orc-2.1.1.jar:2.1.1]
>     at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045) ~[hive-orc-2.1.1.jar:2.1.1]
>     ... 29 common frames omitted
> {code}
> If you create an ORC file containing a single row with an empty list
> {code}
> orc.addRow(Lists.newArrayList());
> {code}
> and then try to read it back
> {code}
> VectorizedRowBatch batch = reader.getSchema().createRowBatch();
> while (rows.nextBatch(batch)) { }
> {code}
> you will reproduce the above stack trace.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)