[ https://issues.apache.org/jira/browse/SPARK-34378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415783#comment-17415783 ]
Apache Spark commented on SPARK-34378:
--------------------------------------

User 'xkrogen' has created a pull request for this issue:
https://github.com/apache/spark/pull/34009

> Support extra optional Avro fields in AvroSerializer
> ----------------------------------------------------
>
>                 Key: SPARK-34378
>                 URL: https://issues.apache.org/jira/browse/SPARK-34378
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Erik Krogen
>            Priority: Major
>
> Currently, when writing out Avro data using a custom schema ({{avroSchema}}),
> if there are any extra Avro fields which do not have a matching field in the
> Catalyst schema, the serialization will fail. This is much more strict than
> on the deserialization path, where Avro fields not present in the Catalyst
> schema are ignored, and Catalyst fields not present in the Avro schema are
> allowed as long as they are nullable. I believe it will be more user-friendly
> if extra Avro fields are allowed, as long as they are optional. This makes it
> easier for users to write out data with Avro schemas which may be outside of
> their control.
> If there is concern about the safety of this approach (i.e. there are use
> cases where users want strict matching), we can make it configurable.
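
To make the proposed rule concrete, here is a minimal sketch in plain Python. It is not Spark code and is not taken from the pull request. The function names (`is_optional`, `extra_avro_fields`) and the sample schema are hypothetical, and it assumes that "optional" means an Avro union type that includes "null"; the actual change may define optionality differently. The sketch accepts extra Avro fields that are optional and rejects extra fields that are required.

```python
import json

# Hypothetical Avro record schema, used only for illustration.
AVRO_SCHEMA = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        # Extra field with no Catalyst counterpart, but nullable.
        {"name": "note", "type": ["null", "string"], "default": None},
    ],
})


def is_optional(avro_type):
    """Assumption: a field is 'optional' if its type is a union containing "null"."""
    return isinstance(avro_type, list) and "null" in avro_type


def extra_avro_fields(avro_schema_json, catalyst_field_names):
    """Return the names of Avro fields that have no matching Catalyst field.

    Raises ValueError if any such extra field is not optional, mirroring
    the relaxed check proposed in the issue (not the current strict one).
    """
    schema = json.loads(avro_schema_json)
    catalyst = set(catalyst_field_names)
    extra = [f for f in schema["fields"] if f["name"] not in catalyst]
    required = [f["name"] for f in extra if not is_optional(f["type"])]
    if required:
        raise ValueError(
            "Avro fields missing from the Catalyst schema are not optional: "
            + ", ".join(required)
        )
    return [f["name"] for f in extra]
```

With Catalyst fields `id` and `name`, the extra `note` field is tolerated because it is nullable. A non-nullable extra field would still raise an error, which preserves a safety check for fields the writer cannot populate.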