nsivabalan opened a new pull request, #7681:
URL: https://github.com/apache/hudi/pull/7681
### Change Logs
Currently, record key generation and partition path generation are tightly
coupled through the key generator class in use, although the two could be
decoupled. For example, a complex record key cannot be combined with a
non-partitioned dataset, because NonpartitionedKeyGen uses simple key
generation logic for the record key. This patch therefore introduces a separate
RecordKeyGenerator interface and uses a factory to determine the record key
generation strategy. Users don't need to set any additional parameters.
After this patch, the record key generation strategy is determined as
follows.
- Users must either configure the right value for
"hoodie.datasource.write.recordkey.field" or enable
"hoodie.auto.generate.record.keys".
a. If "hoodie.auto.generate.record.keys" is enabled, record keys are
auto-generated internally and users don't need to set any value for
"hoodie.datasource.write.recordkey.field".
b. Otherwise, if a single field is set for
"hoodie.datasource.write.recordkey.field", simple record key generation is used.
c. Otherwise, multi-field (complex) record key generation is deduced.
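The decision order above can be sketched roughly as follows. This is only an illustration, not the factory added by this patch: the class `RecordKeyStrategySketch`, the `Strategy` enum, and the `resolve` method are hypothetical names, and the sketch assumes that multiple record key fields are given as a comma-separated list and that auto generation is off unless explicitly enabled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the selection logic; names are illustrative only.
final class RecordKeyStrategySketch {

  enum Strategy { AUTO_GENERATED, SIMPLE, COMPLEX }

  static final String RECORD_KEY_FIELD = "hoodie.datasource.write.recordkey.field";
  static final String AUTO_GENERATE = "hoodie.auto.generate.record.keys";

  static Strategy resolve(Map<String, String> props) {
    // (a) Auto generation wins; the record key field is not consulted.
    // Assumption: the flag is treated as "false" when it is absent.
    if (Boolean.parseBoolean(props.getOrDefault(AUTO_GENERATE, "false"))) {
      return Strategy.AUTO_GENERATED;
    }

    // Assumption: several key fields are written as a comma-separated list.
    String raw = props.getOrDefault(RECORD_KEY_FIELD, "");
    List<String> fields = Arrays.stream(raw.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());

    if (fields.isEmpty()) {
      throw new IllegalArgumentException(
          "Set " + RECORD_KEY_FIELD + " or enable " + AUTO_GENERATE);
    }

    // (b) One field means simple keys; (c) more than one means complex keys.
    return fields.size() == 1 ? Strategy.SIMPLE : Strategy.COMPLEX;
  }

  public static void main(String[] args) {
    System.out.println(resolve(Map.of(AUTO_GENERATE, "true")));        // AUTO_GENERATED
    System.out.println(resolve(Map.of(RECORD_KEY_FIELD, "uuid")));     // SIMPLE
    System.out.println(resolve(Map.of(RECORD_KEY_FIELD, "uuid,ts")));  // COMPLEX
  }
}
```

Because the strategy depends only on these two configs, it can be chosen independently of whichever partition path generation is in effect.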
### Impact
Enables users to combine any record key generation strategy with any
partition path generation strategy.
### Risk level (write none, low, medium or high below)
low.
### Documentation Update
_Describe any necessary documentation update if there is any new feature,
config, or user-facing change_
- _The config description must be updated if new configs are added or the
default values of the configs are changed_
- _Any new feature or user-facing change requires updating the Hudi website.
Please create a Jira ticket, attach the ticket number here and follow the
[instruction](https://hudi.apache.org/contribute/developer-setup#website) to
make changes to the website._
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Change Logs and Impact were stated clearly
- [ ] Adequate tests were added if applicable
- [ ] CI passed