[ 
https://issues.apache.org/jira/browse/HUDI-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17081014#comment-17081014
 ] 

Prashant Wason commented on HUDI-741:
-------------------------------------

Update: [~varadarb] informed me that the schema is also available in the 
Hoodie commit, stored as extraMetadata. This makes it simpler to retrieve the 
last used schema for the compatibility checks.

> Fix Hoodie's schema evolution checks
> ------------------------------------
>
>                 Key: HUDI-741
>                 URL: https://issues.apache.org/jira/browse/HUDI-741
>             Project: Apache Hudi (incubating)
>          Issue Type: Bug
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Minor
>              Labels: pull-request-available
>   Original Estimate: 120h
>          Time Spent: 10m
>  Remaining Estimate: 119h 50m
>
> HUDI requires a schema to be specified in HoodieWriteConfig; the 
> HoodieWriteClient uses it to create the records. The schema is also saved in 
> the data files (Parquet format) and log files (Avro format).
> Because a schema must be supplied each time new data is ingested into a HUDI 
> dataset, the schema can evolve over time. HUDI should ensure that the 
> evolved schema is compatible with the older schema.
> HUDI-specific validation of schema evolution should ensure that a newer 
> schema can be used for the dataset by checking that data written using 
> the old schema can be read using the new schema.
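
To illustrate the check described above, here is a minimal sketch of "data 
written with the old schema can be read with the new schema". It is not 
Hudi's implementation. It models only flat Avro record schemas whose field 
types are primitive type names given as strings, and it follows the Avro 
schema resolution rules: a field added in the new schema needs a default, a 
field dropped from the new schema is ignored, and a changed type must be an 
allowed promotion. The function name can_read and the sample schemas are 
hypothetical.

```python
# Simplified model of Avro schema resolution for flat record schemas.
# Assumption: every field "type" is a primitive type name (a plain string).

# Writer (old) type -> reader (new) types it may be promoted to.
PROMOTIONS = {
    "int": {"long", "float", "double"},
    "long": {"float", "double"},
    "float": {"double"},
    "string": {"bytes"},
    "bytes": {"string"},
}


def can_read(new_schema, old_schema):
    """Return a list of problems; an empty list means data written with
    old_schema can be read with new_schema."""
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        name = field["name"]
        old = old_fields.get(name)
        if old is None:
            # Field added in the new schema: old data has no value for it,
            # so the reader needs a default.
            if "default" not in field:
                problems.append(f"new field '{name}' has no default")
            continue
        old_type, new_type = old["type"], field["type"]
        if new_type != old_type and new_type not in PROMOTIONS.get(old_type, set()):
            problems.append(f"field '{name}': cannot read {old_type} as {new_type}")
    # Fields present only in the old schema are skipped when reading.
    return problems


old = {"type": "record", "name": "Trip", "fields": [
    {"name": "id", "type": "int"},
    {"name": "fare", "type": "float"},
]}

# Widened types plus a new field with a default: compatible.
compatible = {"type": "record", "name": "Trip", "fields": [
    {"name": "id", "type": "long"},
    {"name": "fare", "type": "double"},
    {"name": "tip", "type": "double", "default": 0.0},
]}

# Changed int to string and added a field without a default: incompatible.
incompatible = {"type": "record", "name": "Trip", "fields": [
    {"name": "id", "type": "string"},
    {"name": "fare", "type": "float"},
    {"name": "rider", "type": "string"},
]}

print(can_read(compatible, old))
print(can_read(incompatible, old))
```

A real check would also need to handle unions, nested records, arrays, maps 
and aliases. The Avro Java library ships 
SchemaCompatibility.checkReaderWriterCompatibility(reader, writer), which 
covers those cases.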



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
