nsivabalan opened a new pull request, #7859:
URL: https://github.com/apache/hudi/pull/7859

   ### Change Logs
   
   This patch introduces a configurable schema reconcile strategy and adds a dynamic schema strategy. The existing reconcile behavior is now treated as the "legacy" strategy.
   
   Legacy reconcile strategy:
   If the incoming batch has more columns than the table schema, the incoming schema is chosen as the new table schema.
   If the incoming batch has fewer columns than the table schema, the table schema remains as is.
   No other flows are supported.
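
A toy sketch of the legacy rules above, on column names only. This is not Hudi's implementation: `legacy_reconcile` is a hypothetical helper, "more/fewer columns" is read as superset/subset, and raising on any other case is an assumption standing in for "not supported".

```python
def legacy_reconcile(table_cols, incoming_cols):
    """Toy model of the legacy strategy (hypothetical helper, column names only)."""
    table_set, incoming_set = set(table_cols), set(incoming_cols)
    if incoming_set >= table_set:
        # Incoming has every table column (and possibly more): adopt incoming schema.
        return list(incoming_cols)
    if incoming_set < table_set:
        # Incoming only drops columns: keep the table schema as is.
        return list(table_cols)
    # Assumption: a batch that both drops and adds columns is rejected here.
    raise ValueError("unsupported by legacy strategy: columns both added and dropped")


print(legacy_reconcile(["id", "name"], ["id", "name", "age"]))  # ['id', 'name', 'age']
print(legacy_reconcile(["id", "name", "age"], ["id", "name"]))  # ['id', 'name', 'age']
```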
   
   Dynamic schema reconcile strategy:
   This is a superset of legacy. With this strategy, the incoming batch can drop some columns and also add new columns compared to the table schema. The new table schema is the last known table schema plus the new columns in the incoming batch (for any columns the batch dropped, hudi will auto-fill nulls).
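
A minimal sketch of the dynamic rule, again on column names only and not taken from the patch. `dynamic_reconcile` and `fill_missing` are hypothetical helpers; the column ordering (table columns first, then new ones) is an assumption.

```python
def dynamic_reconcile(table_cols, incoming_cols):
    """New schema = last known table columns + columns new in the incoming batch."""
    new_cols = [c for c in incoming_cols if c not in table_cols]
    return list(table_cols) + new_cols


def fill_missing(record, schema_cols):
    """Project a record onto the reconciled schema, using None for dropped columns."""
    return {c: record.get(c) for c in schema_cols}


table_cols = ["id", "name", "age"]
incoming_cols = ["id", "age", "city"]  # drops "name", adds "city"

schema = dynamic_reconcile(table_cols, incoming_cols)
print(schema)  # ['id', 'name', 'age', 'city']
print(fill_missing({"id": 1, "age": 30, "city": "Oslo"}, schema))
# {'id': 1, 'name': None, 'age': 30, 'city': 'Oslo'}
```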
   
   ### Impact
   
   More flexibility in evolving schemas w/ hudi. 
   
   ### Risk level (write none, low medium or high below)
   
   low. 
   
   ### Documentation Update
   
   Introducing a new config named `hoodie.datasource.write.reconcile.schema.strategy`. The default value is `legacy_reconcile_strategy`. To leverage dynamic schema, set the value to `dynamic_schema_reconcile_strategy`.
   Users have to set `hoodie.datasource.write.reconcile.schema` to true to leverage this.
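
For illustration, the two settings above could be grouped as write options like this. The keys and values are the ones named in this PR; the variable name and the Spark write call in the comment are assumptions, and other required write options (table name, record key, etc.) are omitted.

```python
# Options as described in this PR; everything else a real write needs is left out.
reconcile_options = {
    "hoodie.datasource.write.reconcile.schema": "true",
    "hoodie.datasource.write.reconcile.schema.strategy": "dynamic_schema_reconcile_strategy",
}

# Assumed usage with a Spark DataFrame `df` and a table path `base_path` (both hypothetical):
# df.write.format("hudi").options(**reconcile_options).mode("append").save(base_path)
```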
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
