Hi!

I've seen a similar issue reported for Spark:
https://stackoverflow.com/questions/63792314/spark-arrayindexoutofboundsexception-with-timestamp-columns
Does your case match the one described in that link? If so, it may be a bug in ORC itself.

If it's convenient, could you share the ORC file that triggers the error?
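In the meantime, here is a plain-Python illustration (not the actual ORC reader code) of why the failing index is exactly 1024: ORC's VectorizedRowBatch defaults to 1024 rows, so a reader that miscounts the remaining rows writes one slot past the end of the column vector. The function name `fill_batch` is just for this sketch.

```python
# Illustration only, not ORC code: the default VectorizedRowBatch size is
# 1024, so writing one row too many fails at index 1024, matching the
# ArrayIndexOutOfBoundsException: 1024 in the stack trace below.
BATCH_SIZE = 1024  # VectorizedRowBatch.DEFAULT_SIZE in orc-core

batch = [None] * BATCH_SIZE  # stands in for a fixed-size column vector

def fill_batch(values):
    """Copy values into the batch; overflows raise IndexError."""
    n = 0
    for i, v in enumerate(values):
        batch[i] = v  # i == 1024 is one past the end of the vector
        n += 1
    return n

assert fill_batch(range(1024)) == 1024  # exactly one full batch is fine
try:
    fill_batch(range(1025))  # one row too many, as a buggy reader might emit
except IndexError:
    pass  # the Python analogue of ArrayIndexOutOfBoundsException: 1024
```

This only shows the failure mode; confirming whether the bug is in the ORC reader or in the file itself still requires the file.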

Asahi Lee <[email protected]> wrote on Sat, Sep 4, 2021 at 4:28 PM:

> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
>         at org.apache.orc.impl.TreeReaderFactory$TreeReader.nextVector(TreeReaderFactory.java:269) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.orc.impl.TreeReaderFactory$TimestampTreeReader.nextVector(TreeReaderFactory.java:1007) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.orc.impl.ConvertTreeReaderFactory$DateFromTimestampTreeReader.nextVector(ConvertTreeReaderFactory.java:2115) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:2012) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1300) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.orc.nohive.shim.OrcNoHiveShim.nextBatch(OrcNoHiveShim.java:94) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.orc.nohive.shim.OrcNoHiveShim.nextBatch(OrcNoHiveShim.java:41) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.orc.AbstractOrcFileInputFormat$OrcVectorizedReader.readBatch(AbstractOrcFileInputFormat.java:260) ~[flink-sql-connector-hive-1.2.2_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:67) ~[flink-table-blink_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56) ~[flink-table-blink_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138) ~[flink-table-blink_2.11-1.13.1.jar:1.13.1]
>         at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:101) ~[flink-table-blink_2.11-1.13.1.jar:1.13.1]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_141]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_141]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_141]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_141]
