Hi,
I plan to refactor the current Hudi configuration framework.
lamberken <https://github.com/lamberken> did similar work before in
https://github.com/apache/hudi/pull/1094, and I'd like to continue it and
add more features to the ConfigOption class.

The motivation for this change is, as lamberken mentioned, “Currently, config 
items and their default value are dispersed in the java class file. It's could 
be confused when config items are defined more and more”. With this change, 
Hudi developers could use and check these configurations more easily.

We can also bind a configuration description to each ConfigOption. As a next
step, we could do something similar to Flink and automatically add or update
the property descriptions on the Hudi website:
https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/description/Description.java

In addition, we can bind an inference function to a ConfigOption, which
provides a rule-based inference mechanism for some of our configurations. For
example, we can infer the key generator class from the Hudi record key fields
and partition fields. If the record key field contains a comma, which
indicates that there are multiple record keys, we should use
ComplexKeyGenerator by default. If there is no partition column, we should use
NonpartitionedKeyGenerator. Such an inference mechanism makes Hudi smarter, so
that users don't need to set as many parameters on the client side.

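To make the idea concrete, here is a minimal sketch of what such an option class could look like. It is not the code in the demo PR: the `ConfigOption` shape, the `of`/`withInferFunction`/`resolve` method names, the `KeyGenOptions` holder, the property keys, and the `SimpleKeyGenerator` default are all assumptions for illustration, and values are kept as plain strings for brevity. Checking the "no partition column" rule before the "multiple record keys" rule is also just a choice made in this sketch.

```java
import java.util.Optional;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical sketch of an option that carries its key, default value,
// description, and an optional inference rule. Not the actual Hudi API.
final class ConfigOption {
    private final String key;
    private final String defaultValue;
    private final String description;
    // Derives a value from the other properties, or returns empty if no rule applies.
    private final Function<Properties, Optional<String>> inferFunction;

    private ConfigOption(String key, String defaultValue, String description,
                         Function<Properties, Optional<String>> inferFunction) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.description = description;
        this.inferFunction = inferFunction;
    }

    static ConfigOption of(String key, String defaultValue, String description) {
        return new ConfigOption(key, defaultValue, description, props -> Optional.empty());
    }

    ConfigOption withInferFunction(Function<Properties, Optional<String>> fn) {
        return new ConfigOption(key, defaultValue, description, fn);
    }

    String key() { return key; }
    String defaultValue() { return defaultValue; }
    String description() { return description; }

    // Resolution order: explicit user value, then inferred value, then default.
    String resolve(Properties props) {
        String explicit = props.getProperty(key);
        if (explicit != null) {
            return explicit;
        }
        return inferFunction.apply(props).orElse(defaultValue);
    }
}

// Illustrative option definitions; keys and defaults are assumptions.
final class KeyGenOptions {
    static final ConfigOption RECORD_KEY_FIELD = ConfigOption.of(
        "hoodie.datasource.write.recordkey.field", "",
        "Record key field(s), comma-separated if there are several.");

    static final ConfigOption PARTITION_PATH_FIELD = ConfigOption.of(
        "hoodie.datasource.write.partitionpath.field", "",
        "Partition path field(s); empty for a non-partitioned table.");

    static final ConfigOption KEYGEN_CLASS = ConfigOption.of(
        "hoodie.datasource.write.keygenerator.class", "SimpleKeyGenerator",
        "Key generator class, inferred from the key and partition fields if unset.")
        .withInferFunction(props -> {
            String partition = props.getProperty(PARTITION_PATH_FIELD.key(), "");
            String recordKey = props.getProperty(RECORD_KEY_FIELD.key(), "");
            if (partition.trim().isEmpty()) {
                return Optional.of("NonpartitionedKeyGenerator");
            }
            if (recordKey.contains(",")) {
                return Optional.of("ComplexKeyGenerator");
            }
            return Optional.empty(); // no rule matched, fall back to the default
        });

    private KeyGenOptions() { }
}
```

With this shape, a user who sets only the record key and partition fields gets a matching key generator from `KeyGenOptions.KEYGEN_CLASS.resolve(props)`, while an explicitly configured key generator class always takes precedence over the inferred one.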
The disadvantage of this change is that users who currently reference e.g.
HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP in their client code will need
to replace it with either HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP.key()
or the literal key hoodie.bootstrap.base.path.

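As a rough illustration of that migration, the sketch below uses stand-in classes rather than the real Hudi ones: `OptionStub` and `BootstrapConfigSketch` are hypothetical names that only mimic a constant changing from a plain String to an option object with a `key()` accessor.

```java
import java.util.Properties;

// Hypothetical stand-in for the proposed option type; only key() matters here.
final class OptionStub {
    private final String key;
    OptionStub(String key) { this.key = key; }
    String key() { return key; }
}

// Stand-in for HoodieBootstrapConfig after the refactoring: the constant is
// now an option object instead of a String.
final class BootstrapConfigSketch {
    static final OptionStub BOOTSTRAP_BASE_PATH_PROP =
        new OptionStub("hoodie.bootstrap.base.path");

    private BootstrapConfigSketch() { }
}

final class ClientMigration {
    // Before the change, client code could pass the constant directly:
    //   props.setProperty(HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP, basePath);

    // Option 1: call key() on the option.
    static Properties viaKeyAccessor(String basePath) {
        Properties props = new Properties();
        props.setProperty(BootstrapConfigSketch.BOOTSTRAP_BASE_PATH_PROP.key(), basePath);
        return props;
    }

    // Option 2: use the literal property key.
    static Properties viaLiteralKey(String basePath) {
        Properties props = new Properties();
        props.setProperty("hoodie.bootstrap.base.path", basePath);
        return props;
    }

    private ClientMigration() { }
}
```

Both variants produce the same properties; the first keeps the compile-time link to the config class, while the second avoids depending on it.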
I opened a demo PR for this: https://github.com/apache/hudi/pull/2833. Feel
free to discuss in this thread and share any suggestions!

Related JIRAs:
https://issues.apache.org/jira/browse/HUDI-89
https://issues.apache.org/jira/projects/HUDI/issues/HUDI-375

Thanks,
Wenning


