Hi @Vinoth,
It's time to pick up this topic. Based on the content we talked about, here are my thoughts 1, Initial proposal aims to rework configuration framework includes(DataSource and WriteClient level), for compatibility, we can introduce a ConfigOption class and rework it on DataSource level. 2, It's very right that the scoped down version does not need a RFC[1], so change state from 'Under Discussion' to 'Close' ? [1] https://cwiki.apache.org/confluence/display/HUDI/RFC+-+11+%3A+Refactor+of+the+configuration+framework+of+hudi+project Best, Lamber-Ken At 2019-12-19 11:05:16, "Vinoth Chandar" <[email protected]> wrote: >Sounds good.. This scoped down version per se, does not need a RFC. > >On Wed, Dec 18, 2019 at 3:09 PM lamberken <[email protected]> wrote: > >> >> >> Hi @Vinoth >> >> >> I understand what you mean, I will continue to work on this when I finish >> reworking the new UI. :) >> >> >> best, >> lamber-ken >> >> >> >> >> At 2019-12-18 11:39:30, "Vinoth Chandar" <[email protected]> wrote: >> >Expect most users to use inputDF.write() approach... Uber uses the lower >> >level RDD apis, like the DeltaStreamer tool does.. >> >If we don't rename configs and still support a builder, it should be fine. >> > >> >I think we can scope this down to introducing a ConfigOption class that >> >ties, the key,value, default together.. That definitely seems like a >> better >> >abstraction. >> > >> >On Fri, Dec 13, 2019 at 5:18 PM lamberken <[email protected]> wrote: >> > >> >> >> >> >> >> Hi, @vinoth >> >> >> >> >> >> Okay, I see. If we don't want existing users to do any upgrading or >> >> reconfigurations, then this refactor work will not make much sense. >> >> This issue can be closed, because ConfigOptions and these builders do >> the >> >> same things. >> >> From another side, if we finish this work before a stable release, we >> will >> >> benefit a lot from it. We need to make a choice. >> >> >> >> >> >> btw, I have a question that users will use HoodieWriteConfig / >> >> HoodieWriteClient in their program? >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder() >> >> .withPath(basePath) >> >> .forTable(tableName) >> >> .withSchema(schemaStr) >> >> .withProps(props) // pass raw k,v pairs from a property file. >> >> >> >> >> .withCompactionConfig(HoodieCompactionConfig.newBuilder().withXXX(...).build()) >> >> >> >> .withIndexConfig(HoodieIndexConfig.newBuilder().withXXX(...).build()) >> >> ... >> >> .build(); >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> OR >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> inputDF.write() >> >> .format("org.apache.hudi") >> >> .options(clientOpts) // any of the Hudi client opts can be passed in >> >> as well >> >> .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), >> "_row_key") >> >> .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), >> >> "partition") >> >> .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), >> "timestamp") >> >> .option(HoodieWriteConfig.TABLE_NAME, tableName) >> >> .mode(SaveMode.Append) >> >> .save(basePath); >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> >> >> >> >> >> >> >> >> best, >> >> lamber-ken >> >> >> >> At 2019-12-14 08:43:06, "Vinoth Chandar" <[email protected]> wrote: >> >> >Hi, >> >> > >> >> >Are you saying these classes needs to change? If so, understood. But >> are >> >> >you planning on renaming configs or relocating them? We dont want >> existing >> >> >users to do any upgrading or reconfigurations >> >> > >> >> >On Fri, Dec 13, 2019 at 10:28 AM lamberken <[email protected]> wrote: >> >> > >> >> >> >> >> >> >> >> >> Hi, >> >> >> >> >> >> >> >> >> They need to change due to this, because only HoodieWriteConfig and >> >> >> *Options will be kept. >> >> >> >> >> >> >> >> >> best, >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> At 2019-12-14 01:23:35, "Vinoth Chandar" <[email protected]> wrote: >> >> >> >Hi, >> >> >> > >> >> >> >We are trying to understand if existing jobs (datasource, >> >> deltastreamer, >> >> >> >anything else) needs to change due to this. >> >> >> > >> >> >> >On Wed, Dec 11, 2019 at 7:18 PM lamberken <[email protected]> >> wrote: >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> Hi, @vinoth >> >> >> >> >> >> >> >> >> >> >> >> 1, Hoodie*Config classes are only used to set default value when >> call >> >> >> >> their build method currently. >> >> >> >> They will be replaced by HoodieMemoryOptions, HoodieIndexOptions, >> >> >> >> HoodieHBaseIndexOptions, etc... >> >> >> >> 2, I don't understand the question "It is not clear to me whether >> >> there >> >> >> is >> >> >> >> any external facing changes which changes this model.". >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-12 11:01:36, "Vinoth Chandar" <[email protected]> >> wrote: >> >> >> >> >I actually prefer the builder pattern for making the configs, >> >> because I >> >> >> >> can >> >> >> >> >do `builder.` in the IDE and actually see all the options... That >> >> said, >> >> >> >> >most developers program against the Spark datasource and so this >> may >> >> >> not >> >> >> >> be >> >> >> >> >useful, unless we expose a builder for that.. I will concede that >> >> since >> >> >> >> its >> >> >> >> >also subjective anyway. >> >> >> >> > >> >> >> >> >But, to clarify Siva's question, you do intend to keep the >> different >> >> >> >> >component level config classes right - HoodieIndexConfig, >> >> >> >> >HoodieCompactionConfig? >> >> >> >> > >> >> >> >> >Once again, can you please explicitly address the following >> >> question, >> >> >> so >> >> >> >> we >> >> >> >> >can get on the same page? >> >> >> >> >>> It is not clear to me whether there is any external facing >> >> changes >> >> >> >> which >> >> >> >> >changes this model. >> >> >> >> >This is still the most critical question from both me and balaji. >> >> >> >> > >> >> >> >> >On Wed, Dec 11, 2019 at 11:35 AM lamberken <[email protected]> >> >> wrote: >> >> >> >> > >> >> >> >> >> hi, @Sivabalan >> >> >> >> >> >> >> >> >> >> Yes, thanks very much for help me explain my initial proposal. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Answer your question, we can call HoodieWriteConfig as a >> >> >> SystemConfig, >> >> >> >> we >> >> >> >> >> need to pass it everywhere. >> >> >> >> >> Actually, it may just contains a few custom configurations( >> does >> >> not >> >> >> >> >> include default configurations) >> >> >> >> >> Because each component has its own ConfigOptions. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> The old version HoodieWriteConfig includes all keys(custom >> >> >> >> configurations, >> >> >> >> >> default configurations), it is a fat. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-12 03:14:11, "Sivabalan" <[email protected]> >> wrote: >> >> >> >> >> >Let me summarize your initial proposal and then will get into >> >> >> details. >> >> >> >> >> >- Introduce ConfigOptions for ease of handling of default >> values. >> >> >> >> >> >- Remove all Hoodie*Config classes and just have >> >> HoodieWriteConfig. >> >> >> >> What >> >> >> >> >> >this means is that, every other config file will be replaced >> by >> >> >> >> >> >ConfigOptions. eg, HoodieIndexConfigOption, >> >> >> >> HoodieCompactionConfigOption, >> >> >> >> >> >etc. >> >> >> >> >> >- Config option will take care of returning defaults for any >> >> >> property, >> >> >> >> >> even >> >> >> >> >> >if an entire Config(eg IndexConfig) is not explicitly set. >> >> >> >> >> > >> >> >> >> >> >Here are the positives I see. >> >> >> >> >> >- By way of having component level ConfigOptions, we bucketize >> >> the >> >> >> >> configs >> >> >> >> >> >and have defaults set(same as before) >> >> >> >> >> >- User doesn't need to set each component's config(eg >> >> IndexConfig) >> >> >> >> >> >explicitly with HoodieWriteConfig. >> >> >> >> >> > >> >> >> >> >> >But have one question: >> >> >> >> >> >- I see Bucketizing only in write path. How does one get hold >> of >> >> >> >> >> >IndexConfigOptions as a consumer? For eg, If some class is >> using >> >> >> just >> >> >> >> >> >IndexConfig alone, how will it consume? From your eg, I see >> only >> >> >> >> >> >HoodieWriteConfig. Do we pass in HoodieWriteConfig everywhere >> >> then? >> >> >> >> >> >Wouldn't that contradicts your initial proposal to not have a >> fat >> >> >> >> config >> >> >> >> >> >class? May be can you expand your example below to show how a >> >> >> consumer >> >> >> >> of >> >> >> >> >> >IndexConfig look like. >> >> >> >> >> > >> >> >> >> >> >Your eg: >> >> >> >> >> >/** >> >> >> >> >> > * New version >> >> >> >> >> > */ >> >> >> >> >> >// set value overrite the default value >> >> >> >> >> >HoodieWriteConfig config = new HoodieWriteConfig(); >> >> >> >> >> >config.set(HoodieIndexConfigOptions.INDEX_TYPE, >> >> >> >> >> >HoodieIndex.IndexType.HBASE.name < >> >> >> >> >> http://hoodieindex.indextype.hbase.name/> >> >> >> >> >> >()) >> >> >> >> >> > >> >> >> >> >> > >> >> >> >> >> > >> >> >> >> >> > >> >> >> >> >> >On Wed, Dec 11, 2019 at 8:33 AM lamberken <[email protected]> >> >> >> wrote: >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On 1,2. Yes, you are right, moving the getter to the >> component >> >> >> level >> >> >> >> >> >> Config class itself. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On 3, HoodieWriteConfig can also set value through >> >> ConfigOption, >> >> >> >> small >> >> >> >> >> >> code snippets. >> >> >> >> >> >> From the bellow snippets, we can see that clients need to >> know >> >> >> each >> >> >> >> >> >> component's builders >> >> >> >> >> >> and also call their "with" methods to override the default >> >> value >> >> >> in >> >> >> >> old >> >> >> >> >> >> version. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> But, in new version, clients just need to know each >> component's >> >> >> >> public >> >> >> >> >> >> config options, just like constants. >> >> >> >> >> >> So, these builders are redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> public class HoodieIndexConfigOptions { >> >> >> >> >> >> public static final ConfigOption<String> INDEX_TYPE = >> >> >> ConfigOption >> >> >> >> >> >> .key("hoodie.index.type") >> >> >> >> >> >> .defaultValue(HoodieIndex.IndexType.BLOOM.name >> ()); >> >> >> >> >> >> } >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> public class HoodieWriteConfig { >> >> >> >> >> >> public void setString(ConfigOption<String> option, String >> >> >> value) { >> >> >> >> >> >> this.props.put(option.key(), value); >> >> >> >> >> >> } >> >> >> >> >> >> } >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /** >> >> >> >> >> >> * New version >> >> >> >> >> >> */ >> >> >> >> >> >> // set value overrite the default value >> >> >> >> >> >> HoodieWriteConfig config = new HoodieWriteConfig(); >> >> >> >> >> >> config.set(HoodieIndexConfigOptions.INDEX_TYPE, >> >> >> >> >> >> HoodieIndex.IndexType.HBASE.name()) >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /** >> >> >> >> >> >> * Old version >> >> >> >> >> >> */ >> >> >> >> >> >> HoodieWriteConfig.Builder builder = >> >> HoodieWriteConfig.newBuilder() >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> builder.withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.BLOOM).build()) >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Another, users use hudi like bellow, here're all keys. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> df.write.format("hudi"). >> >> >> >> >> >> option("hoodie.insert.shuffle.parallelism", "10"). >> >> >> >> >> >> option("hoodie.upsert.shuffle.parallelism", "10"). >> >> >> >> >> >> option("hoodie.delete.shuffle.parallelism", "10"). >> >> >> >> >> >> option("hoodie.bulkinsert.shuffle.parallelism", "10"). >> >> >> >> >> >> option("hoodie.datasource.write.recordkey.field", >> "name"). >> >> >> >> >> >> option("hoodie.datasource.write.partitionpath.field", >> >> >> >> "location"). >> >> >> >> >> >> option("hoodie.datasource.write.precombine.field", >> "ts"). >> >> >> >> >> >> option("hoodie.table.name", tableName). >> >> >> >> >> >> mode(Overwrite). >> >> >> >> >> >> save(basePath); >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Last, as I responsed to @vino, it's reasonable to handle >> >> >> >> fallbackkeys. I >> >> >> >> >> >> think we need to do this step by step, >> >> >> >> >> >> it's easy to integrate FallbackKey in the future, it is not >> >> what >> >> >> we >> >> >> >> need >> >> >> >> >> >> right now in my opinion. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> If some places are still not very clear, feel free to >> feedback. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-11 23:41:31, "Vinoth Chandar" <[email protected] >> > >> >> >> wrote: >> >> >> >> >> >> >Hi Lamber-ken, >> >> >> >> >> >> > >> >> >> >> >> >> >I looked at the sample PR you put up as well. >> >> >> >> >> >> > >> >> >> >> >> >> >On 1,2 => Seems your intent is to replace these with moving >> >> the >> >> >> >> getter >> >> >> >> >> to >> >> >> >> >> >> >the component level Config class itself? I am fine with >> that >> >> >> >> (although >> >> >> >> >> I >> >> >> >> >> >> >think its not that big of a hurdle really to use atm). But, >> >> once >> >> >> we >> >> >> >> do >> >> >> >> >> >> that >> >> >> >> >> >> >we could pass just the specific component config into >> parts of >> >> >> code >> >> >> >> >> versus >> >> >> >> >> >> >passing in the entire HoodieWriteConfig object. I am fine >> with >> >> >> >> moving >> >> >> >> >> the >> >> >> >> >> >> >classes to a ConfigOption class as you suggested as well. >> >> >> >> >> >> > >> >> >> >> >> >> >On 3, I still we feel we will need the builder pattern >> going >> >> >> >> forward. >> >> >> >> >> to >> >> >> >> >> >> >build the HoodieWriteConfig object. Like below? Cannot >> >> understand >> >> >> >> why >> >> >> >> >> we >> >> >> >> >> >> >would want to change this. Could you please clarify? >> >> >> >> >> >> > >> >> >> >> >> >> >HoodieWriteConfig.Builder builder = >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> HoodieWriteConfig.newBuilder().withPath(cfg.targetBasePath).combineInput(cfg.filterDupes, >> >> >> >> >> >> >true) >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> .withCompactionConfig(HoodieCompactionConfig.newBuilder().withPayloadClass(cfg.payloadClassName) >> >> >> >> >> >> > // Inline compaction is disabled for continuous >> >> mode. >> >> >> >> >> >> >otherwise enabled for MOR >> >> >> >> >> >> > >> >> >> >> >> >> >> .withInlineCompaction(cfg.isInlineCompactionEnabled()).build()) >> >> >> >> >> >> > .forTable(cfg.targetTableName) >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> .withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.BLOOM).build()) >> >> >> >> >> >> > .withAutoCommit(false).withProps(props); >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> >Typically, we write RFCs for large changes that breaks >> >> existing >> >> >> >> >> behavior >> >> >> >> >> >> or >> >> >> >> >> >> >introduces significantly complex new features.. If you are >> >> just >> >> >> >> >> planning >> >> >> >> >> >> to >> >> >> >> >> >> >do the refactoring into ConfigOption class, per se you >> don't >> >> >> need a >> >> >> >> >> RFC. >> >> >> >> >> >> >But , if you plan to address the fallback keys (or) your >> >> changes >> >> >> are >> >> >> >> >> going >> >> >> >> >> >> >to break/change existing jobs, we would need a RFC. >> >> >> >> >> >> > >> >> >> >> >> >> >>> It is not clear to me whether there is any external >> facing >> >> >> >> changes >> >> >> >> >> >> which >> >> >> >> >> >> >changes this model. >> >> >> >> >> >> >I am still unclear on this as well. can you please >> explicitly >> >> >> >> clarify? >> >> >> >> >> >> > >> >> >> >> >> >> >thanks >> >> >> >> >> >> >vinoth >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> >On Tue, Dec 10, 2019 at 12:35 PM lamberken < >> [email protected] >> >> > >> >> >> >> wrote: >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, @Balaji @Vinoth >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I'm sorry, some places are not very clear, >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 1, We can see that HoodieMetricsConfig, >> HoodieStorageConfig, >> >> >> etc.. >> >> >> >> >> >> already >> >> >> >> >> >> >> defined in project. >> >> >> >> >> >> >> But we get property value by methods which defined in >> >> >> >> >> >> >> HoodieWriteConfig, like >> >> >> HoodieWriteConfig#getParquetMaxFileSize, >> >> >> >> >> >> >> HoodieWriteConfig#getParquetBlockSize, etc. It's means >> >> that >> >> >> >> >> >> >> Hoodie*Config are redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 2, These Hoodie*Config classes are used to set default >> value >> >> >> when >> >> >> >> >> call >> >> >> >> >> >> >> their build method, nothing else. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 3, For current plan is keep the Builder pattern when >> >> >> configuring, >> >> >> >> >> when >> >> >> >> >> >> we >> >> >> >> >> >> >> are familiar with the config framework, >> >> >> >> >> >> >> We will find that Hoodie*Config class are redundant >> and >> >> >> methods >> >> >> >> >> >> >> prefixed with "get" in HoodieWriteConfig are also >> redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> In addition, I create a pr[1] for initializing with a >> demo. >> >> At >> >> >> >> this >> >> >> >> >> >> demo, >> >> >> >> >> >> >> I create >> >> >> >> >> >> >> MetricsGraphiteReporterOptions which contains HOST, PORT, >> >> >> PREFIX, >> >> >> >> and >> >> >> >> >> >> >> remove getGraphiteServerHost, >> >> >> >> >> >> >> getGraphiteServerPort, getGraphiteMetricPrefix in >> >> >> >> >> HoodieMetricsConfig. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/apache/incubator-hudi/pull/1094 >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-11 02:35:30, "Balaji Varadarajan" >> >> >> >> >> <[email protected] >> >> >> >> >> >> > >> >> >> >> >> >> >> wrote: >> >> >> >> >> >> >> > Hi Lamber-Ken, >> >> >> >> >> >> >> >Thanks for the time writing the proposal and thinking >> about >> >> >> >> >> improving >> >> >> >> >> >> >> Hudi usability. >> >> >> >> >> >> >> >My preference would be to keep the Builder pattern when >> >> >> >> >> configuring. It >> >> >> >> >> >> >> is something I find it natural when configuring. It is >> not >> >> >> clear >> >> >> >> to >> >> >> >> >> me >> >> >> >> >> >> >> whether there is any external facing changes which >> changes >> >> this >> >> >> >> >> model. >> >> >> >> >> >> >> Would you mind adding some more details on the RFC. It >> would >> >> >> save >> >> >> >> >> time >> >> >> >> >> >> to >> >> >> >> >> >> >> read it in one place as opposed to checking out github >> repo >> >> :) >> >> >> >> >> >> >> >Thanks,Balaji.V >> >> >> >> >> >> >> > On Tuesday, December 10, 2019, 07:55:01 AM PST, >> Vinoth >> >> >> >> Chandar < >> >> >> >> >> >> >> [email protected]> wrote: >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > Hi , >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Thanks for the proposal. Some parts I agree, some parts >> I >> >> >> don't >> >> >> >> and >> >> >> >> >> >> some >> >> >> >> >> >> >> >parts are unclear >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Agree : >> >> >> >> >> >> >> >- On introducing a class that binds key, default value, >> >> >> provided >> >> >> >> >> value, >> >> >> >> >> >> >> and >> >> >> >> >> >> >> >also may be a doc along with it (?). >> >> >> >> >> >> >> >- Designing the framework to have fallback keys is good >> >> IMO. >> >> >> It >> >> >> >> >> helps >> >> >> >> >> >> us >> >> >> >> >> >> >> do >> >> >> >> >> >> >> >things like >> https://issues.apache.org/jira/browse/HUDI-89 >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Disagree : >> >> >> >> >> >> >> >- Not all configuration values are in HoodieWriteConfig, >> >> its >> >> >> not >> >> >> >> >> >> accurate. >> >> >> >> >> >> >> >Configs are already split by components into >> >> >> HoodieIndexConfig, >> >> >> >> >> >> >> >HoodieCompactionConfig etc.. >> >> >> >> >> >> >> >- There are helpers for all these conveniently located >> in >> >> >> >> >> >> >> >HoodieWriteConfig. I think some of the claims of >> usability >> >> >> seem >> >> >> >> >> >> subjective >> >> >> >> >> >> >> >to me, speaking from hands-on experience writing jobs. >> So, >> >> if >> >> >> you >> >> >> >> >> >> >> proposing >> >> >> >> >> >> >> >a large shake up (e.g not have a single properties file >> >> load >> >> >> all >> >> >> >> >> >> >> >components), I would love to understand this at more >> depth. >> >> >> From >> >> >> >> my >> >> >> >> >> >> >> >experience, well namespaced configs in a single >> properties >> >> >> file >> >> >> >> >> keeps >> >> >> >> >> >> it >> >> >> >> >> >> >> >simple and understandable. >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Unclear : >> >> >> >> >> >> >> >- What is impact on existing jobs - using >> RDD/WriteClient >> >> >> API, >> >> >> >> >> >> >> DataSource, >> >> >> >> >> >> >> >DeltaStreamer level? Do you intend to change >> namespacing of >> >> >> >> configs? >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Thanks >> >> >> >> >> >> >> >Vinoth >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >On Tue, Dec 10, 2019 at 6:44 AM lamberken < >> >> [email protected]> >> >> >> >> >> wrote: >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, vino >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Reasonable, we can refactor this step by step. The >> first >> >> >> step >> >> >> >> now >> >> >> >> >> >> is to >> >> >> >> >> >> >> >> introduce the config framework. >> >> >> >> >> >> >> >> When our community is familiar with the config >> framework >> >> >> >> >> mechanism, >> >> >> >> >> >> it's >> >> >> >> >> >> >> >> easy to integrate FallbackKey in the future. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-10 11:51:22, "vino yang" < >> >> [email protected]> >> >> >> >> >> wrote: >> >> >> >> >> >> >> >> >Hi Lamber, >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Thanks for the proposal. +1 from my side. >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >When it comes to configuration, it will involve how >> we >> >> >> handle >> >> >> >> >> >> >> deprecated >> >> >> >> >> >> >> >> >configuration items in the future. In my opinion, we >> >> need >> >> >> to >> >> >> >> take >> >> >> >> >> >> this >> >> >> >> >> >> >> >> into >> >> >> >> >> >> >> >> >consideration when designing. There are already some >> >> >> >> successful >> >> >> >> >> >> >> practices >> >> >> >> >> >> >> >> >for our reference. For example, Flink defines some >> >> >> deprecated >> >> >> >> >> >> >> >> >configurations as FallbackKey[1]. Maybe we can learn >> >> from >> >> >> >> these >> >> >> >> >> >> >> designs. >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >WDYT? >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >[1]: >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/FallbackKey.java >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Best, >> >> >> >> >> >> >> >> >Vino >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >lamberken <[email protected]> 于2019年12月9日周一 >> 下午11:19写道: >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, all >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Currently, many configuration items and their >> default >> >> >> values >> >> >> >> >> are >> >> >> >> >> >> >> >> dispersed >> >> >> >> >> >> >> >> >> in the config file like HoodieWriteConfig. It’s >> very >> >> >> >> confused >> >> >> >> >> for >> >> >> >> >> >> >> >> >> developers, and it's easy for developers to use >> them >> >> in a >> >> >> >> wrong >> >> >> >> >> >> place >> >> >> >> >> >> >> >> >> especially when there are more and more >> configuration >> >> >> items. >> >> >> >> >> If we >> >> >> >> >> >> >> can >> >> >> >> >> >> >> >> >> solve this, developers will benefit from it and the >> >> code >> >> >> >> >> structure >> >> >> >> >> >> >> will >> >> >> >> >> >> >> >> be >> >> >> >> >> >> >> >> >> more concise. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I had create a JIRA[1] and a under discuss RFC[2] >> to >> >> >> explain >> >> >> >> >> how >> >> >> >> >> >> to >> >> >> >> >> >> >> >> solve >> >> >> >> >> >> >> >> >> the problem, if you are interested in this, you can >> >> visit >> >> >> >> jira >> >> >> >> >> and >> >> >> >> >> >> >> RFC >> >> >> >> >> >> >> >> for >> >> >> >> >> >> >> >> >> detail. Any comments and feedback are welcome, >> WDYT? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [1] >> >> >> >> >> https://issues.apache.org/jira/projects/HUDI/issues/HUDI-375 >> >> >> >> >> >> >> >> >> [2] >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://cwiki.apache.org/confluence/display/HUDI/RFC-11+%3A+Refactor+of+the+configuration+framework+of+hudi+project >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > >> >> >> >> >> > >> >> >> >> >> >-- >> >> >> >> >> >Regards, >> >> >> >> >> >-Sivabalan >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>
