nsivabalan opened a new pull request, #7859: URL: https://github.com/apache/hudi/pull/7859
### Change Logs

This patch introduces a configurable schema reconcile strategy and adds a dynamic schema strategy. The existing reconcile behavior is now called "legacy".

Legacy reconcile strategy: if the incoming batch has more columns than the table schema, the incoming schema is chosen as the new table schema. If the incoming batch has fewer columns than the table schema, the table schema remains as is. No other flows are supported.

Dynamic schema reconcile strategy: this is a superset of legacy. The incoming batch can drop some columns and add new ones compared to the table schema. The new table schema is the last known table schema plus the new columns in the incoming batch. If the batch dropped some columns, Hudi auto-fills nulls for them.

### Impact

More flexibility in evolving schemas with Hudi.

### Risk level (write none, low, medium or high below)

low

### Documentation Update

Introduces a new config named `hoodie.datasource.write.reconcile.schema.strategy`. The default value is `legacy_reconcile_strategy`; to use dynamic schema, set it to `dynamic_schema_reconcile_strategy`. Users must also set `hoodie.datasource.write.reconcile.schema` to true for the strategy to take effect.

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed
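
To illustrate the two strategies described under Change Logs, here is a minimal Python sketch. It is not Hudi's implementation. Schemas are modeled as ordered `dict`s of column name to type name, and `legacy_reconcile`, `dynamic_reconcile`, and `fill_nulls` are hypothetical helper names. The sketch also assumes that "more columns" and "fewer columns" mean a superset and a subset of the table's columns, and that an unsupported flow fails with an error. Only the two config keys and their values are taken from this PR description.

```python
from typing import Any, Dict

# Hypothetical schema model: column name -> type name, in column order.
Schema = Dict[str, str]


def legacy_reconcile(table: Schema, incoming: Schema) -> Schema:
    """Legacy strategy (assumed reading): superset wins, subset keeps the table schema."""
    table_cols, incoming_cols = set(table), set(incoming)
    if incoming_cols >= table_cols:
        # Incoming has every table column (and possibly more): adopt it.
        return dict(incoming)
    if incoming_cols < table_cols:
        # Incoming only dropped columns: the table schema stays as is.
        return dict(table)
    # Assumption: a batch that both drops and adds columns is rejected.
    raise ValueError("legacy strategy does not support dropping and adding columns together")


def dynamic_reconcile(table: Schema, incoming: Schema) -> Schema:
    """Dynamic strategy: last known table schema plus new columns from the batch."""
    merged = dict(table)
    for name, type_name in incoming.items():
        merged.setdefault(name, type_name)  # append only columns the table lacks
    return merged


def fill_nulls(record: Dict[str, Any], schema: Schema) -> Dict[str, Any]:
    """Project a record onto the schema, using None for columns the batch dropped."""
    return {name: record.get(name) for name in schema}


# Config keys and values as named in this PR description.
hudi_write_options = {
    "hoodie.datasource.write.reconcile.schema": "true",
    "hoodie.datasource.write.reconcile.schema.strategy": "dynamic_schema_reconcile_strategy",
}

table_schema = {"id": "long", "name": "string", "city": "string"}
batch_schema = {"id": "long", "name": "string", "zip": "string"}  # drops city, adds zip

new_schema = dynamic_reconcile(table_schema, batch_schema)
row = fill_nulls({"id": 1, "name": "a", "zip": "94105"}, new_schema)
```

With these inputs the dynamic strategy keeps `city` and appends `zip`, while `legacy_reconcile` raises because the batch both drops and adds a column. In a Spark job, a dictionary like `hudi_write_options` would typically be passed to the DataFrame writer together with the other Hudi write options.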
