HyukjinKwon commented on a change in pull request #23735: [SPARK-26801][SQL] Read avro types other than record
URL: https://github.com/apache/spark/pull/23735#discussion_r253813710
##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroFileFormat.scala
##########
@@ -67,13 +67,18 @@ private[avro] class AvroFileFormat extends FileFormat
           spark.sessionState.conf.ignoreCorruptFiles)
       }
 
-    SchemaConverters.toSqlType(avroSchema).dataType match {
+    val schemaType = SchemaConverters.toSqlType(avroSchema)
+
+    schemaType.dataType match {
       case t: StructType => Some(t)
-      case _ => throw new RuntimeException(
-        s"""Avro schema cannot be converted to a Spark SQL StructType:
-           |
-           |${avroSchema.toString(true)}
-           |""".stripMargin)
+      case _ => Some(StructType(Seq(StructField("value", schemaType.dataType,
+        nullable = false))))
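
   For reference, a minimal sketch of what the new branch does for a non-record
   top-level schema. The bare string schema below is made up for illustration,
   and it assumes SchemaConverters.toSqlType is callable from this context (it
   lives in the same package as the file under review):

    import org.apache.avro.{Schema, SchemaBuilder}
    import org.apache.spark.sql.avro.SchemaConverters
    import org.apache.spark.sql.types.{StructField, StructType}

    // Top-level Avro schema that is not a record, e.g. a bare string.
    val avroSchema: Schema = SchemaBuilder.builder().stringType()

    // toSqlType maps the Avro schema to a Catalyst type (StringType here).
    val schemaType = SchemaConverters.toSqlType(avroSchema)

    // The proposed change wraps any non-struct result in a single-column
    // struct named "value" so it can still be exposed as rows.
    val tableSchema = schemaType.dataType match {
      case t: StructType => t
      case other => StructType(Seq(StructField("value", other, nullable = false)))
    }
    // tableSchema == StructType(Seq(StructField("value", StringType, false)))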
Review comment:
Wait, is this PR meant to support reading plain values rather than Avro records?
Since Spark assumes rows, I think it's odd to allow bare values; the JSON data
source doesn't support this either. Out of curiosity, how did you generate the
Avro files?
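
   For illustration, one way such a file could have been produced with the plain
   Avro Java API; the path and values below are hypothetical, not taken from the
   PR:

    import java.io.File
    import org.apache.avro.{Schema, SchemaBuilder}
    import org.apache.avro.file.DataFileWriter
    import org.apache.avro.generic.GenericDatumWriter

    // Write an Avro container file whose top-level schema is a plain string,
    // i.e. there is no record wrapping the values.
    val schema: Schema = SchemaBuilder.builder().stringType()
    val writer = new DataFileWriter[AnyRef](new GenericDatumWriter[AnyRef](schema))
    writer.create(schema, new File("/tmp/plain-strings.avro"))  // hypothetical path
    writer.append("a")
    writer.append("b")
    writer.close()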