[
https://issues.apache.org/jira/browse/HUDI-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083712#comment-17083712
]
Vinoth Chandar commented on HUDI-741:
-------------------------------------
[~pwason] Once we merge this, could you also write a short blog post based on
the content posted in this JIRA? It is worth upleveling and sharing broadly,
since schema evolution comes up often.
> Fix Hoodie's schema evolution checks
> ------------------------------------
>
> Key: HUDI-741
> URL: https://issues.apache.org/jira/browse/HUDI-741
> Project: Apache Hudi (incubating)
> Issue Type: Bug
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Minor
> Labels: pull-request-available
> Original Estimate: 120h
> Time Spent: 10m
> Remaining Estimate: 119h 50m
>
> HUDI requires a schema to be specified in HoodieWriteConfig; the
> HoodieWriteClient uses it to create the records. The schema is also saved in
> the data files (Parquet format) and log files (Avro format).
> Since a schema is required each time new data is ingested into a HUDI
> dataset, the schema can evolve over time. HUDI should, however, ensure that
> the evolved schema is compatible with the older schema.
> HUDI-specific validation of schema evolution should ensure that a newer
> schema can be used for the dataset by checking that the data written using
> the old schema can be read using the new schema.
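
The check described above (data written with the old schema must be readable with the new schema) can be sketched roughly as follows. This is a simplified illustration only, not Hudi's or Avro's actual implementation: `Field`, `PROMOTIONS`, `incompatibilities` and `can_read_old_data` are hypothetical names, a schema is modelled as a flat mapping of field name to type, and only Avro's numeric promotions are considered. Hudi's real check may well be stricter, for example about dropped fields.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical, simplified model of a record field (not an Avro or Hudi class).
@dataclass(frozen=True)
class Field:
    type: str
    has_default: bool = False


Schema = Dict[str, Field]

# Subset of Avro's promotion rules: data written as the key type
# may be read as any of the listed types.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
}


def incompatibilities(old: Schema, new: Schema) -> List[str]:
    """List the reasons why data written with `old` cannot be read with `new`."""
    problems = []
    for name, reader_field in new.items():
        writer_field = old.get(name)
        if writer_field is None:
            # Old data has no value for this field, so the reader needs a default.
            if not reader_field.has_default:
                problems.append(f"{name}: added without a default")
        elif (writer_field.type != reader_field.type
              and reader_field.type not in PROMOTIONS.get(writer_field.type, set())):
            problems.append(
                f"{name}: {writer_field.type} cannot be read as {reader_field.type}")
    # Fields present only in `old` are simply ignored by the reader in this model.
    return problems


def can_read_old_data(old: Schema, new: Schema) -> bool:
    return not incompatibilities(old, new)


old = {"id": Field("int"), "name": Field("string")}

# Widening int -> long and adding a field with a default keeps old data readable.
new_ok = {"id": Field("long"), "name": Field("string"),
          "email": Field("string", has_default=True)}

# Changing string -> int and adding a field without a default does not.
new_bad = {"id": Field("int"), "name": Field("int"), "age": Field("int")}

print(can_read_old_data(old, new_ok))
print(incompatibilities(old, new_bad))
```

In this model the new schema plays the role of the reader and the old schema the role of the writer, so the check is deliberately one-directional: it says nothing about whether data written with the new schema could be read with the old one.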
--
This message was sent by Atlassian Jira
(v8.3.4#803005)