[
https://issues.apache.org/jira/browse/SPARK-27913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085852#comment-17085852
]
Giri Dandu commented on SPARK-27913:
------------------------------------
[~viirya] Sorry for the late reply.
I re-ran the same test in Spark 2.4.5 and it *is NOT working*, although it
works in 2.3.0. This is the error I get in Spark 2.4.5:
{code:java}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
	at org.apache.orc.mapred.OrcStruct.getFieldValue(OrcStruct.java:49)
	at org.apache.spark.sql.execution.datasources.orc.OrcDeserializer$$anonfun$org$apache$spark$sql$execution$datasources$orc$OrcDeserializer$$newWriter$14.apply(OrcDeserializer.scala:133)
	at org.apache.spark.sql.execution.datasources.orc.OrcDeserializer$$anonfun$org$apache$spark$sql$execution$datasources$orc$OrcDeserializer$$newWriter$14.apply(OrcDeserializer.scala:123)
	at org.apache.spark.sql.execution.datasources.orc.OrcDeserializer$$anonfun$2$$anonfun$apply$1.apply(OrcDeserializer.scala:51)
	at org.apache.spark.sql.execution.datasources.orc.OrcDeserializer$$anonfun$2$$anonfun$apply$1.apply(OrcDeserializer.scala:51)
	at org.apache.spark.sql.execution.datasources.orc.OrcDeserializer.deserialize(OrcDeserializer.scala:64)
	at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2$$anonfun$apply$8.apply(OrcFileFormat.scala:234)
	at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat$$anonfun$buildReaderWithPartitionValues$2$$anonfun$apply$8.apply(OrcFileFormat.scala:233)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:410)
	at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.next(FileScanRDD.scala:104)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:636)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:255)
{code}
> Spark SQL's native ORC reader implements its own schema evolution
> -----------------------------------------------------------------
>
> Key: SPARK-27913
> URL: https://issues.apache.org/jira/browse/SPARK-27913
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.3
> Reporter: Owen O'Malley
> Priority: Major
>
> ORC's reader handles a wide range of schema evolution, but the Spark SQL
> native ORC bindings do not provide the desired schema to the ORC reader. This
> causes a regression when moving spark.sql.orc.impl from 'hive' to 'native'.
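
For readers unfamiliar with this class of failure, the following is a toy model in plain Java. It is not Spark's or ORC's actual code. It assumes, as one plausible reading of the trace above, that a row written with an older, narrower schema is read by position using a newer, wider schema. The class `PositionalReadSketch`, the nested `StoredRow`, and the methods `readNaive` and `readByName` are all hypothetical names made up for illustration, and the null-filling rule for missing columns is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.List;

public class PositionalReadSketch {

    /** Hypothetical stand-in for a stored row: values addressed only by position. */
    static final class StoredRow {
        private final Object[] fields;

        StoredRow(Object... fields) {
            this.fields = fields;
        }

        /** Throws ArrayIndexOutOfBoundsException when index >= number of stored fields. */
        Object getFieldValue(int index) {
            return fields[index];
        }
    }

    /** Naive read: fetches every requested column by position, ignoring the file's own schema. */
    static Object[] readNaive(StoredRow row, int requestedColumnCount) {
        Object[] out = new Object[requestedColumnCount];
        for (int i = 0; i < requestedColumnCount; i++) {
            out[i] = row.getFieldValue(i);
        }
        return out;
    }

    /** Evolution-aware read: maps requested names onto file columns; missing columns become null. */
    static Object[] readByName(StoredRow row, List<String> fileColumns, List<String> requestedColumns) {
        Object[] out = new Object[requestedColumns.size()];
        for (int i = 0; i < requestedColumns.size(); i++) {
            int position = fileColumns.indexOf(requestedColumns.get(i));
            out[i] = position >= 0 ? row.getFieldValue(position) : null;
        }
        return out;
    }

    public static void main(String[] args) {
        StoredRow oldRow = new StoredRow("a");                   // written with one column
        List<String> fileColumns = Arrays.asList("col1");        // schema stored with the file
        List<String> requested = Arrays.asList("col1", "col2");  // newer, wider read schema

        System.out.println(Arrays.toString(readByName(oldRow, fileColumns, requested)));

        try {
            readNaive(oldRow, requested.size());
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("naive positional read failed: " + e);
        }
    }
}
```

In this sketch the name-based read succeeds because it reconciles the two schemas before touching the row, while the positional read fails as soon as it asks for a field index the stored row does not have.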