[ https://issues.apache.org/jira/browse/SPARK-30267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arseniy Tashoyan reopened SPARK-30267:
--------------------------------------

With Spark 3.0.0 preview 2, I get the following failure:
{code:java}
java.lang.ClassCastException: scala.collection.convert.Wrappers$SeqWrapper cannot be cast to org.apache.avro.generic.GenericData$Array
    at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$newWriter$19(AvroDeserializer.scala:170)
{code}
This means that the fix in [https://github.com/apache/spark/pull/26907] is not actually a fix, because a Scala Seq cannot be cast to java.util.Collection[Any].

I have a Scala Seq because my Avro GenericRecord is generated from a case class by Avro4s. We can expect that everybody using Avro4s (or another generator written in Scala, such as Avrohugger) will hit the same ClassCastException.

> avro deserializer: ArrayList cannot be cast to GenericData$Array
> ----------------------------------------------------------------
>
>                 Key: SPARK-30267
>                 URL: https://issues.apache.org/jira/browse/SPARK-30267
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Steven Aerts
>            Assignee: Steven Aerts
>            Priority: Major
>             Fix For: 3.0.0
>
>
> On some more complex Avro objects, the Avro deserializer fails with the
> following stack trace:
> {code}
> java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.avro.generic.GenericData$Array
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$newWriter$19(AvroDeserializer.scala:170)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$newWriter$19$adapted(AvroDeserializer.scala:169)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$getRecordWriter$1(AvroDeserializer.scala:314)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$getRecordWriter$1$adapted(AvroDeserializer.scala:310)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$getRecordWriter$2(AvroDeserializer.scala:332)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$getRecordWriter$2$adapted(AvroDeserializer.scala:329)
>     at org.apache.spark.sql.avro.AvroDeserializer.$anonfun$converter$3(AvroDeserializer.scala:56)
>     at org.apache.spark.sql.avro.AvroDeserializer.deserialize(AvroDeserializer.scala:70)
> {code}
> This is because the deserializer assumes that an array is always of the very
> specific type {{org.apache.avro.generic.GenericData$Array}}, which is not
> always the case.
> Treating it as a normal list works.
> A GitHub PR to fix this is coming up.
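
The failure mode described in this issue (casting an array value to one concrete list class instead of a general collection interface) can be illustrated with a small standalone sketch. This is not Spark's or Avro's code: `SpecificArray` is a hypothetical stand-in for a concrete type such as `GenericData$Array`, and both helper methods are invented for illustration. It only shows why a cast to a concrete class rejects other list implementations while a cast to `java.util.Collection` accepts any value that implements that interface.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class ArrayCastSketch {

    // Hypothetical stand-in for one concrete list class (not the real Avro type).
    static final class SpecificArray<T> extends AbstractList<T> {
        private final List<T> items;

        SpecificArray(List<T> items) {
            this.items = new ArrayList<>(items);
        }

        @Override
        public T get(int index) {
            return items.get(index);
        }

        @Override
        public int size() {
            return items.size();
        }
    }

    // Brittle: throws ClassCastException unless the value is a SpecificArray.
    public static int sizeViaConcreteCast(Object value) {
        return ((SpecificArray<?>) value).size();
    }

    // Tolerant: accepts any implementation of java.util.Collection.
    public static int sizeViaCollection(Object value) {
        return ((Collection<?>) value).size();
    }

    public static void main(String[] args) {
        Object plainList = new ArrayList<>(Arrays.asList(1, 2, 3));
        Object specific = new SpecificArray<>(Arrays.asList(1, 2, 3));

        // Both values are collections, so the interface cast succeeds for each.
        System.out.println(sizeViaCollection(plainList));
        System.out.println(sizeViaCollection(specific));

        // The concrete cast fails for the plain ArrayList.
        try {
            sizeViaConcreteCast(plainList);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException for java.util.ArrayList");
        }
    }
}
```

Whether a given value passes the tolerant cast depends on its runtime class actually implementing `java.util.Collection`, so the class named in the exception message is the first thing to check when diagnosing a report like the one above.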