[ https://issues.apache.org/jira/browse/SPARK-27506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fokko Driesprong updated SPARK-27506:
-------------------------------------
    Fix Version/s: 3.0.0

> Function `from_avro` doesn't allow deserialization of data using other compatible schemas
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-27506
>                 URL: https://issues.apache.org/jira/browse/SPARK-27506
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Gianluca Amori
>            Assignee: Fokko Driesprong
>            Priority: Major
>             Fix For: 3.0.0
>
> SPARK-24768 and its subtasks introduced support for reading and writing Avro data by parsing a binary column in Avro format and converting it into the corresponding Catalyst value (and vice versa).
>
> The current implementation has the limitation of requiring that an event be deserialized with the exact schema with which it was serialized. This breaks one of Avro's most important features, schema evolution [https://docs.confluent.io/current/schema-registry/avro.html] - most importantly, the ability to read old data with a newer (compatible) schema without breaking the consumer.
>
> The GenericDatumReader in the Avro library already supports passing an optional *writer's schema* (the schema with which the record was serialized) alongside the mandatory *reader's schema* (the schema with which the record is going to be deserialized). The proposed change is to do the same in the from_avro function, allowing an optional writer's schema to be passed and used during deserialization.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
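To make the schema-evolution point concrete, here is a toy sketch of the reader's/writer's schema resolution that GenericDatumReader performs. This is plain Python, not Spark or Avro library code, and the field names and defaults are hypothetical; it only illustrates the Avro rule that fields present in both schemas are read as written, reader-only fields fall back to their declared default, and writer-only fields are dropped.

```python
# Toy illustration of Avro-style schema resolution (NOT Spark or the Avro
# library). writer_fields describes what the producer actually wrote;
# reader_fields (name -> default, None = no default) is the consumer's
# newer, compatible schema.

def resolve(record, writer_fields, reader_fields):
    """Read a record written under writer_fields using reader_fields."""
    out = {}
    for name, default in reader_fields.items():
        if name in writer_fields:
            out[name] = record[name]      # field present in both schemas
        elif default is not None:
            out[name] = default           # evolved field: use its default
        else:
            # Incompatible evolution: a new field with no default cannot
            # be resolved against old data.
            raise ValueError(f"no default for new field {name!r}")
    return out

# An old producer wrote two fields; the consumer's schema later added an
# optional "country" field with a default (hypothetical example).
writer_fields = {"id", "name"}
reader_fields = {"id": None, "name": None, "country": "unknown"}

old_record = {"id": 1, "name": "alice"}
print(resolve(old_record, writer_fields, reader_fields))
# -> {'id': 1, 'name': 'alice', 'country': 'unknown'}
```

Without access to the writer's schema, from_avro can only attempt the exact-match case; passing both schemas is what lets old data be read under the newer schema, as sketched above.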