The biggest differences between PR 1094 and the currently open PR are the
addition of fallback support and that no configs are moved around in the
same PR. This should make the effort straightforward, IMO.
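
To illustrate what I mean by fallback support, here is a minimal sketch. It is
not the code in the open PR: the class name FallbackConfigOption, the resolve()
method, and the old key "hoodie.example.old.base.path" are all made up for
illustration. The idea is only that a lookup tries the current key first, then
any older keys, then the default.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch of a config option with fallback keys.
// Names and structure are assumptions, not the implementation in the PR.
final class FallbackConfigOption {
    private final String key;
    private final String defaultValue;
    private final List<String> fallbackKeys;

    private FallbackConfigOption(String key, String defaultValue, List<String> fallbackKeys) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.fallbackKeys = fallbackKeys;
    }

    static FallbackConfigOption of(String key, String defaultValue, String... fallbackKeys) {
        return new FallbackConfigOption(key, defaultValue, Arrays.asList(fallbackKeys));
    }

    String key() {
        return key;
    }

    // Lookup order: current key, then each fallback key in order, then the default.
    String resolve(Properties props) {
        String value = props.getProperty(key);
        if (value != null) {
            return value;
        }
        for (String oldKey : fallbackKeys) {
            value = props.getProperty(oldKey);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }
}
```

With something like this, a key can be renamed later without breaking users who
still set the old key in their properties.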

> HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP in their client code, they
> need to either replace it with
> HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP.key()

I think this is a small cost to pay in return for much better docs and
maintainability.
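
For anyone following along, a rough sketch of how the .key() accessor and the
proposed inference function could fit together. This is only an illustration
under assumptions: InferringConfigOption, KeyGeneratorInference, and resolve()
are hypothetical names, the property key strings are used as examples, the
SimpleKeyGenerator default is my assumption, and the rule order (comma check
first, then missing partition field) is a guess at what the proposal intends.

```java
import java.util.Optional;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical option type: a key, a default, and an optional inference rule.
final class InferringConfigOption {
    private final String key;
    private final String defaultValue;
    private final Function<Properties, Optional<String>> inferFunction;

    InferringConfigOption(String key, String defaultValue,
                          Function<Properties, Optional<String>> inferFunction) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.inferFunction = inferFunction;
    }

    // Client code would call key() where it used the String constant before.
    String key() {
        return key;
    }

    // An explicitly set value wins, then the inferred value, then the default.
    String resolve(Properties props) {
        String explicit = props.getProperty(key);
        if (explicit != null) {
            return explicit;
        }
        return inferFunction.apply(props).orElse(defaultValue);
    }
}

final class KeyGeneratorInference {
    // Key strings below are assumed for illustration.
    static final String RECORD_KEY_FIELD = "hoodie.datasource.write.recordkey.field";
    static final String PARTITION_PATH_FIELD = "hoodie.datasource.write.partitionpath.field";

    static final InferringConfigOption KEYGENERATOR_CLASS = new InferringConfigOption(
        "hoodie.datasource.write.keygenerator.class",
        "SimpleKeyGenerator", // assumed default
        KeyGeneratorInference::infer);

    // Rules from the proposal: a comma in the record key field suggests
    // ComplexKeyGenerator; no partition field suggests NonpartitionedKeyGenerator.
    static Optional<String> infer(Properties props) {
        String recordKey = props.getProperty(RECORD_KEY_FIELD, "");
        String partition = props.getProperty(PARTITION_PATH_FIELD, "");
        if (recordKey.contains(",")) {
            return Optional.of("ComplexKeyGenerator");
        }
        if (partition.trim().isEmpty()) {
            return Optional.of("NonpartitionedKeyGenerator");
        }
        return Optional.empty();
    }
}
```

A client that wants to override the inferred value would then write something
like props.setProperty(KeyGeneratorInference.KEYGENERATOR_CLASS.key(), "...")
instead of passing the constant itself as the property name.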

On Mon, Apr 19, 2021 at 1:16 PM Vinoth Chandar <vin...@apache.org> wrote:

> +1 from me. Long time coming.
>
> On Mon, Apr 19, 2021 at 12:02 PM Ding, Wenning <wenni...@amazon.com.invalid>
> wrote:
>
>> Hi,
>> I plan to refactor the current Hudi configuration framework. lamberken
>> <https://github.com/lamberken> did similar work before in
>> https://github.com/apache/hudi/pull/1094, and I’d like to continue this
>> work and add more features to the ConfigOption class.
>>
>> The motivation for this change is, as lamberken mentioned, “Currently,
>> config items and their default value are dispersed in the java class file.
>> It's could be confused when config items are defined more and more”. This
>> change would make it easier for Hudi developers to use and check these
>> configurations.
>>
>> We can also bind a configuration description within the ConfigOption
>> class. As a next step, we could do something similar to Flink and
>> automatically add/update property descriptions on the Hudi website:
>> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/description/Description.java
>> .
>> Besides, we can bind an inference function within the ConfigOption
>> class, which can provide a rule-based inference mechanism for some of our
>> configurations. For example, we can infer the key generator class from
>> the Hudi record key fields and partition fields: if the record key field
>> contains a comma, which indicates that there are multiple record keys,
>> then by default we should use ComplexKeyGenerator; if there’s no
>> partition column, we should use NonpartitionedKeyGenerator. This
>> inference mechanism would make Hudi more intelligent, so that users don’t
>> need to set so many parameters on the client side.
>> The disadvantage of this change is that users who currently use, e.g.,
>> HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP in their client code need
>> to replace it with either
>> HoodieBootstrapConfig.BOOTSTRAP_BASE_PATH_PROP.key() or
>> hoodie.bootstrap.base.path.
>> I opened a demo for this: https://github.com/apache/hudi/pull/2833.
>> Feel free to discuss in this thread and share any suggestions!
>>
>> Related JIRAs:
>> https://issues.apache.org/jira/browse/HUDI-89
>> https://issues.apache.org/jira/projects/HUDI/issues/HUDI-375
>>
>> Thanks,
>> Wenning
>>
>>
>>
>>
