hi @Vinoth Chandar,
Got it, thanks. best, lamber-ken At 2020-01-09 23:52:52, "Vinoth Chandar" <[email protected]> wrote: >Hi lamber-ken, > >A ConfigOption class would be good indeed. +1 on starting incrementally >with DataSource first and then iterating.. > >Thanks >Vinoth > >On Tue, Jan 7, 2020 at 6:58 PM lamberken <[email protected]> wrote: > >> >> >> Hi @Vinoth, >> >> >> It's time to pick up this topic. Based on the content we talked about, >> here are my thoughts >> >> >> 1, Initial proposal aims to rework configuration framework >> includes(DataSource and WriteClient level), >> for compatibility, we can introduce a ConfigOption class and rework it on >> DataSource level. >> >> >> 2, It's very right that the scoped down version does not need a RFC[1], so >> change state from 'Under Discussion' to 'Close' ? >> >> >> [1] >> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+11+%3A+Refactor+of+the+configuration+framework+of+hudi+project >> >> >> Best, >> Lamber-Ken >> >> >> >> >> >> >> >> >> >> At 2019-12-19 11:05:16, "Vinoth Chandar" <[email protected]> wrote: >> >Sounds good.. This scoped down version per se, does not need a RFC. >> > >> >On Wed, Dec 18, 2019 at 3:09 PM lamberken <[email protected]> wrote: >> > >> >> >> >> >> >> Hi @Vinoth >> >> >> >> >> >> I understand what you mean, I will continue to work on this when I >> finish >> >> reworking the new UI. :) >> >> >> >> >> >> best, >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> At 2019-12-18 11:39:30, "Vinoth Chandar" <[email protected]> wrote: >> >> >Expect most users to use inputDF.write() approach... Uber uses the >> lower >> >> >level RDD apis, like the DeltaStreamer tool does.. >> >> >If we don't rename configs and still support a builder, it should be >> fine. >> >> > >> >> >I think we can scope this down to introducing a ConfigOption class that >> >> >ties, the key,value, default together.. That definitely seems like a >> >> better >> >> >abstraction. >> >> > >> >> >On Fri, Dec 13, 2019 at 5:18 PM lamberken <[email protected]> wrote: >> >> > >> >> >> >> >> >> >> >> >> Hi, @vinoth >> >> >> >> >> >> >> >> >> Okay, I see. If we don't want existing users to do any upgrading or >> >> >> reconfigurations, then this refactor work will not make much sense. >> >> >> This issue can be closed, because ConfigOptions and these builders do >> >> the >> >> >> same things. >> >> >> From another side, if we finish this work before a stable release, we >> >> will >> >> >> benefit a lot from it. We need to make a choice. >> >> >> >> >> >> >> >> >> btw, I have a question that users will use HoodieWriteConfig / >> >> >> HoodieWriteClient in their program? >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> >> HoodieWriteConfig cfg = HoodieWriteConfig.newBuilder() >> >> >> .withPath(basePath) >> >> >> .forTable(tableName) >> >> >> .withSchema(schemaStr) >> >> >> .withProps(props) // pass raw k,v pairs from a property file. >> >> >> >> >> >> >> >> >> .withCompactionConfig(HoodieCompactionConfig.newBuilder().withXXX(...).build()) >> >> >> >> >> >> .withIndexConfig(HoodieIndexConfig.newBuilder().withXXX(...).build()) >> >> >> ... >> >> >> .build(); >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> >> OR >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> >> inputDF.write() >> >> >> .format("org.apache.hudi") >> >> >> .options(clientOpts) // any of the Hudi client opts can be >> passed in >> >> >> as well >> >> >> .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY(), >> >> "_row_key") >> >> >> .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY(), >> >> >> "partition") >> >> >> .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY(), >> >> "timestamp") >> >> >> .option(HoodieWriteConfig.TABLE_NAME, tableName) >> >> >> .mode(SaveMode.Append) >> >> >> .save(basePath); >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------------------- >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> best, >> >> >> lamber-ken >> >> >> >> >> >> At 2019-12-14 08:43:06, "Vinoth Chandar" <[email protected]> wrote: >> >> >> >Hi, >> >> >> > >> >> >> >Are you saying these classes needs to change? If so, understood. But >> >> are >> >> >> >you planning on renaming configs or relocating them? We dont want >> >> existing >> >> >> >users to do any upgrading or reconfigurations >> >> >> > >> >> >> >On Fri, Dec 13, 2019 at 10:28 AM lamberken <[email protected]> >> wrote: >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> Hi, >> >> >> >> >> >> >> >> >> >> >> >> They need to change due to this, because only HoodieWriteConfig >> and >> >> >> >> *Options will be kept. >> >> >> >> >> >> >> >> >> >> >> >> best, >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-14 01:23:35, "Vinoth Chandar" <[email protected]> >> wrote: >> >> >> >> >Hi, >> >> >> >> > >> >> >> >> >We are trying to understand if existing jobs (datasource, >> >> >> deltastreamer, >> >> >> >> >anything else) needs to change due to this. >> >> >> >> > >> >> >> >> >On Wed, Dec 11, 2019 at 7:18 PM lamberken <[email protected]> >> >> wrote: >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, @vinoth >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 1, Hoodie*Config classes are only used to set default value >> when >> >> call >> >> >> >> >> their build method currently. >> >> >> >> >> They will be replaced by HoodieMemoryOptions, >> HoodieIndexOptions, >> >> >> >> >> HoodieHBaseIndexOptions, etc... >> >> >> >> >> 2, I don't understand the question "It is not clear to me >> whether >> >> >> there >> >> >> >> is >> >> >> >> >> any external facing changes which changes this model.". >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-12 11:01:36, "Vinoth Chandar" <[email protected]> >> >> wrote: >> >> >> >> >> >I actually prefer the builder pattern for making the configs, >> >> >> because I >> >> >> >> >> can >> >> >> >> >> >do `builder.` in the IDE and actually see all the options... >> That >> >> >> said, >> >> >> >> >> >most developers program against the Spark datasource and so >> this >> >> may >> >> >> >> not >> >> >> >> >> be >> >> >> >> >> >useful, unless we expose a builder for that.. I will concede >> that >> >> >> since >> >> >> >> >> its >> >> >> >> >> >also subjective anyway. >> >> >> >> >> > >> >> >> >> >> >But, to clarify Siva's question, you do intend to keep the >> >> different >> >> >> >> >> >component level config classes right - HoodieIndexConfig, >> >> >> >> >> >HoodieCompactionConfig? >> >> >> >> >> > >> >> >> >> >> >Once again, can you please explicitly address the following >> >> >> question, >> >> >> >> so >> >> >> >> >> we >> >> >> >> >> >can get on the same page? >> >> >> >> >> >>> It is not clear to me whether there is any external facing >> >> >> changes >> >> >> >> >> which >> >> >> >> >> >changes this model. >> >> >> >> >> >This is still the most critical question from both me and >> balaji. >> >> >> >> >> > >> >> >> >> >> >On Wed, Dec 11, 2019 at 11:35 AM lamberken <[email protected] >> > >> >> >> wrote: >> >> >> >> >> > >> >> >> >> >> >> hi, @Sivabalan >> >> >> >> >> >> >> >> >> >> >> >> Yes, thanks very much for help me explain my initial >> proposal. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Answer your question, we can call HoodieWriteConfig as a >> >> >> >> SystemConfig, >> >> >> >> >> we >> >> >> >> >> >> need to pass it everywhere. >> >> >> >> >> >> Actually, it may just contains a few custom configurations( >> >> does >> >> >> not >> >> >> >> >> >> include default configurations) >> >> >> >> >> >> Because each component has its own ConfigOptions. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> The old version HoodieWriteConfig includes all keys(custom >> >> >> >> >> configurations, >> >> >> >> >> >> default configurations), it is a fat. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-12 03:14:11, "Sivabalan" <[email protected]> >> >> wrote: >> >> >> >> >> >> >Let me summarize your initial proposal and then will get >> into >> >> >> >> details. >> >> >> >> >> >> >- Introduce ConfigOptions for ease of handling of default >> >> values. >> >> >> >> >> >> >- Remove all Hoodie*Config classes and just have >> >> >> HoodieWriteConfig. >> >> >> >> >> What >> >> >> >> >> >> >this means is that, every other config file will be >> replaced >> >> by >> >> >> >> >> >> >ConfigOptions. eg, HoodieIndexConfigOption, >> >> >> >> >> HoodieCompactionConfigOption, >> >> >> >> >> >> >etc. >> >> >> >> >> >> >- Config option will take care of returning defaults for >> any >> >> >> >> property, >> >> >> >> >> >> even >> >> >> >> >> >> >if an entire Config(eg IndexConfig) is not explicitly set. >> >> >> >> >> >> > >> >> >> >> >> >> >Here are the positives I see. >> >> >> >> >> >> >- By way of having component level ConfigOptions, we >> bucketize >> >> >> the >> >> >> >> >> configs >> >> >> >> >> >> >and have defaults set(same as before) >> >> >> >> >> >> >- User doesn't need to set each component's config(eg >> >> >> IndexConfig) >> >> >> >> >> >> >explicitly with HoodieWriteConfig. >> >> >> >> >> >> > >> >> >> >> >> >> >But have one question: >> >> >> >> >> >> >- I see Bucketizing only in write path. How does one get >> hold >> >> of >> >> >> >> >> >> >IndexConfigOptions as a consumer? For eg, If some class is >> >> using >> >> >> >> just >> >> >> >> >> >> >IndexConfig alone, how will it consume? From your eg, I see >> >> only >> >> >> >> >> >> >HoodieWriteConfig. Do we pass in HoodieWriteConfig >> everywhere >> >> >> then? >> >> >> >> >> >> >Wouldn't that contradicts your initial proposal to not >> have a >> >> fat >> >> >> >> >> config >> >> >> >> >> >> >class? May be can you expand your example below to show >> how a >> >> >> >> consumer >> >> >> >> >> of >> >> >> >> >> >> >IndexConfig look like. >> >> >> >> >> >> > >> >> >> >> >> >> >Your eg: >> >> >> >> >> >> >/** >> >> >> >> >> >> > * New version >> >> >> >> >> >> > */ >> >> >> >> >> >> >// set value overrite the default value >> >> >> >> >> >> >HoodieWriteConfig config = new HoodieWriteConfig(); >> >> >> >> >> >> >config.set(HoodieIndexConfigOptions.INDEX_TYPE, >> >> >> >> >> >> >HoodieIndex.IndexType.HBASE.name < >> >> >> >> >> >> http://hoodieindex.indextype.hbase.name/> >> >> >> >> >> >> >()) >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> >On Wed, Dec 11, 2019 at 8:33 AM lamberken < >> [email protected]> >> >> >> >> wrote: >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On 1,2. Yes, you are right, moving the getter to the >> >> component >> >> >> >> level >> >> >> >> >> >> >> Config class itself. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On 3, HoodieWriteConfig can also set value through >> >> >> ConfigOption, >> >> >> >> >> small >> >> >> >> >> >> >> code snippets. >> >> >> >> >> >> >> From the bellow snippets, we can see that clients need to >> >> know >> >> >> >> each >> >> >> >> >> >> >> component's builders >> >> >> >> >> >> >> and also call their "with" methods to override the >> default >> >> >> value >> >> >> >> in >> >> >> >> >> old >> >> >> >> >> >> >> version. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> But, in new version, clients just need to know each >> >> component's >> >> >> >> >> public >> >> >> >> >> >> >> config options, just like constants. >> >> >> >> >> >> >> So, these builders are redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> public class HoodieIndexConfigOptions { >> >> >> >> >> >> >> public static final ConfigOption<String> INDEX_TYPE = >> >> >> >> ConfigOption >> >> >> >> >> >> >> .key("hoodie.index.type") >> >> >> >> >> >> >> .defaultValue(HoodieIndex.IndexType.BLOOM.name >> >> ()); >> >> >> >> >> >> >> } >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> public class HoodieWriteConfig { >> >> >> >> >> >> >> public void setString(ConfigOption<String> option, >> String >> >> >> >> value) { >> >> >> >> >> >> >> this.props.put(option.key(), value); >> >> >> >> >> >> >> } >> >> >> >> >> >> >> } >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /** >> >> >> >> >> >> >> * New version >> >> >> >> >> >> >> */ >> >> >> >> >> >> >> // set value overrite the default value >> >> >> >> >> >> >> HoodieWriteConfig config = new HoodieWriteConfig(); >> >> >> >> >> >> >> config.set(HoodieIndexConfigOptions.INDEX_TYPE, >> >> >> >> >> >> >> HoodieIndex.IndexType.HBASE.name()) >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /** >> >> >> >> >> >> >> * Old version >> >> >> >> >> >> >> */ >> >> >> >> >> >> >> HoodieWriteConfig.Builder builder = >> >> >> HoodieWriteConfig.newBuilder() >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> builder.withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.BLOOM).build()) >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Another, users use hudi like bellow, here're all keys. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> df.write.format("hudi"). >> >> >> >> >> >> >> option("hoodie.insert.shuffle.parallelism", "10"). >> >> >> >> >> >> >> option("hoodie.upsert.shuffle.parallelism", "10"). >> >> >> >> >> >> >> option("hoodie.delete.shuffle.parallelism", "10"). >> >> >> >> >> >> >> option("hoodie.bulkinsert.shuffle.parallelism", >> "10"). >> >> >> >> >> >> >> option("hoodie.datasource.write.recordkey.field", >> >> "name"). >> >> >> >> >> >> >> option("hoodie.datasource.write.partitionpath.field", >> >> >> >> >> "location"). >> >> >> >> >> >> >> option("hoodie.datasource.write.precombine.field", >> >> "ts"). >> >> >> >> >> >> >> option("hoodie.table.name", tableName). >> >> >> >> >> >> >> mode(Overwrite). >> >> >> >> >> >> >> save(basePath); >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> /---------------------------------------------------------------------------/ >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Last, as I responsed to @vino, it's reasonable to handle >> >> >> >> >> fallbackkeys. I >> >> >> >> >> >> >> think we need to do this step by step, >> >> >> >> >> >> >> it's easy to integrate FallbackKey in the future, it is >> not >> >> >> what >> >> >> >> we >> >> >> >> >> need >> >> >> >> >> >> >> right now in my opinion. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> If some places are still not very clear, feel free to >> >> feedback. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-11 23:41:31, "Vinoth Chandar" < >> [email protected] >> >> > >> >> >> >> wrote: >> >> >> >> >> >> >> >Hi Lamber-ken, >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >I looked at the sample PR you put up as well. >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >On 1,2 => Seems your intent is to replace these with >> moving >> >> >> the >> >> >> >> >> getter >> >> >> >> >> >> to >> >> >> >> >> >> >> >the component level Config class itself? I am fine with >> >> that >> >> >> >> >> (although >> >> >> >> >> >> I >> >> >> >> >> >> >> >think its not that big of a hurdle really to use atm). >> But, >> >> >> once >> >> >> >> we >> >> >> >> >> do >> >> >> >> >> >> >> that >> >> >> >> >> >> >> >we could pass just the specific component config into >> >> parts of >> >> >> >> code >> >> >> >> >> >> versus >> >> >> >> >> >> >> >passing in the entire HoodieWriteConfig object. I am >> fine >> >> with >> >> >> >> >> moving >> >> >> >> >> >> the >> >> >> >> >> >> >> >classes to a ConfigOption class as you suggested as >> well. >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >On 3, I still we feel we will need the builder pattern >> >> going >> >> >> >> >> forward. >> >> >> >> >> >> to >> >> >> >> >> >> >> >build the HoodieWriteConfig object. Like below? Cannot >> >> >> understand >> >> >> >> >> why >> >> >> >> >> >> we >> >> >> >> >> >> >> >would want to change this. Could you please clarify? >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >HoodieWriteConfig.Builder builder = >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> HoodieWriteConfig.newBuilder().withPath(cfg.targetBasePath).combineInput(cfg.filterDupes, >> >> >> >> >> >> >> >true) >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> .withCompactionConfig(HoodieCompactionConfig.newBuilder().withPayloadClass(cfg.payloadClassName) >> >> >> >> >> >> >> > // Inline compaction is disabled for >> continuous >> >> >> mode. >> >> >> >> >> >> >> >otherwise enabled for MOR >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> .withInlineCompaction(cfg.isInlineCompactionEnabled()).build()) >> >> >> >> >> >> >> > .forTable(cfg.targetTableName) >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> .withIndexConfig(HoodieIndexConfig.newBuilder().withIndexType(HoodieIndex.IndexType.BLOOM).build()) >> >> >> >> >> >> >> > .withAutoCommit(false).withProps(props); >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >Typically, we write RFCs for large changes that breaks >> >> >> existing >> >> >> >> >> >> behavior >> >> >> >> >> >> >> or >> >> >> >> >> >> >> >introduces significantly complex new features.. If you >> are >> >> >> just >> >> >> >> >> >> planning >> >> >> >> >> >> >> to >> >> >> >> >> >> >> >do the refactoring into ConfigOption class, per se you >> >> don't >> >> >> >> need a >> >> >> >> >> >> RFC. >> >> >> >> >> >> >> >But , if you plan to address the fallback keys (or) your >> >> >> changes >> >> >> >> are >> >> >> >> >> >> going >> >> >> >> >> >> >> >to break/change existing jobs, we would need a RFC. >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >>> It is not clear to me whether there is any external >> >> facing >> >> >> >> >> changes >> >> >> >> >> >> >> which >> >> >> >> >> >> >> >changes this model. >> >> >> >> >> >> >> >I am still unclear on this as well. can you please >> >> explicitly >> >> >> >> >> clarify? >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >thanks >> >> >> >> >> >> >> >vinoth >> >> >> >> >> >> >> > >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >On Tue, Dec 10, 2019 at 12:35 PM lamberken < >> >> [email protected] >> >> >> > >> >> >> >> >> wrote: >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, @Balaji @Vinoth >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I'm sorry, some places are not very clear, >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 1, We can see that HoodieMetricsConfig, >> >> HoodieStorageConfig, >> >> >> >> etc.. >> >> >> >> >> >> >> already >> >> >> >> >> >> >> >> defined in project. >> >> >> >> >> >> >> >> But we get property value by methods which defined >> in >> >> >> >> >> >> >> >> HoodieWriteConfig, like >> >> >> >> HoodieWriteConfig#getParquetMaxFileSize, >> >> >> >> >> >> >> >> HoodieWriteConfig#getParquetBlockSize, etc. It's >> means >> >> >> that >> >> >> >> >> >> >> >> Hoodie*Config are redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 2, These Hoodie*Config classes are used to set default >> >> value >> >> >> >> when >> >> >> >> >> >> call >> >> >> >> >> >> >> >> their build method, nothing else. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> 3, For current plan is keep the Builder pattern when >> >> >> >> configuring, >> >> >> >> >> >> when >> >> >> >> >> >> >> we >> >> >> >> >> >> >> >> are familiar with the config framework, >> >> >> >> >> >> >> >> We will find that Hoodie*Config class are redundant >> >> and >> >> >> >> methods >> >> >> >> >> >> >> >> prefixed with "get" in HoodieWriteConfig are also >> >> redundant. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> In addition, I create a pr[1] for initializing with a >> >> demo. >> >> >> At >> >> >> >> >> this >> >> >> >> >> >> >> demo, >> >> >> >> >> >> >> >> I create >> >> >> >> >> >> >> >> MetricsGraphiteReporterOptions which contains HOST, >> PORT, >> >> >> >> PREFIX, >> >> >> >> >> and >> >> >> >> >> >> >> >> remove getGraphiteServerHost, >> >> >> >> >> >> >> >> getGraphiteServerPort, getGraphiteMetricPrefix in >> >> >> >> >> >> HoodieMetricsConfig. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/apache/incubator-hudi/pull/1094 >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-11 02:35:30, "Balaji Varadarajan" >> >> >> >> >> >> <[email protected] >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> wrote: >> >> >> >> >> >> >> >> > Hi Lamber-Ken, >> >> >> >> >> >> >> >> >Thanks for the time writing the proposal and thinking >> >> about >> >> >> >> >> >> improving >> >> >> >> >> >> >> >> Hudi usability. >> >> >> >> >> >> >> >> >My preference would be to keep the Builder pattern >> when >> >> >> >> >> >> configuring. It >> >> >> >> >> >> >> >> is something I find it natural when configuring. It is >> >> not >> >> >> >> clear >> >> >> >> >> to >> >> >> >> >> >> me >> >> >> >> >> >> >> >> whether there is any external facing changes which >> >> changes >> >> >> this >> >> >> >> >> >> model. >> >> >> >> >> >> >> >> Would you mind adding some more details on the RFC. It >> >> would >> >> >> >> save >> >> >> >> >> >> time >> >> >> >> >> >> >> to >> >> >> >> >> >> >> >> read it in one place as opposed to checking out github >> >> repo >> >> >> :) >> >> >> >> >> >> >> >> >Thanks,Balaji.V >> >> >> >> >> >> >> >> > On Tuesday, December 10, 2019, 07:55:01 AM PST, >> >> Vinoth >> >> >> >> >> Chandar < >> >> >> >> >> >> >> >> [email protected]> wrote: >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > Hi , >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Thanks for the proposal. Some parts I agree, some >> parts >> >> I >> >> >> >> don't >> >> >> >> >> and >> >> >> >> >> >> >> some >> >> >> >> >> >> >> >> >parts are unclear >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Agree : >> >> >> >> >> >> >> >> >- On introducing a class that binds key, default >> value, >> >> >> >> provided >> >> >> >> >> >> value, >> >> >> >> >> >> >> >> and >> >> >> >> >> >> >> >> >also may be a doc along with it (?). >> >> >> >> >> >> >> >> >- Designing the framework to have fallback keys is >> good >> >> >> IMO. >> >> >> >> It >> >> >> >> >> >> helps >> >> >> >> >> >> >> us >> >> >> >> >> >> >> >> do >> >> >> >> >> >> >> >> >things like >> >> https://issues.apache.org/jira/browse/HUDI-89 >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Disagree : >> >> >> >> >> >> >> >> >- Not all configuration values are in >> HoodieWriteConfig, >> >> >> its >> >> >> >> not >> >> >> >> >> >> >> accurate. >> >> >> >> >> >> >> >> >Configs are already split by components into >> >> >> >> HoodieIndexConfig, >> >> >> >> >> >> >> >> >HoodieCompactionConfig etc.. >> >> >> >> >> >> >> >> >- There are helpers for all these conveniently >> located >> >> in >> >> >> >> >> >> >> >> >HoodieWriteConfig. I think some of the claims of >> >> usability >> >> >> >> seem >> >> >> >> >> >> >> subjective >> >> >> >> >> >> >> >> >to me, speaking from hands-on experience writing >> jobs. >> >> So, >> >> >> if >> >> >> >> you >> >> >> >> >> >> >> >> proposing >> >> >> >> >> >> >> >> >a large shake up (e.g not have a single properties >> file >> >> >> load >> >> >> >> all >> >> >> >> >> >> >> >> >components), I would love to understand this at more >> >> depth. >> >> >> >> From >> >> >> >> >> my >> >> >> >> >> >> >> >> >experience, well namespaced configs in a single >> >> properties >> >> >> >> file >> >> >> >> >> >> keeps >> >> >> >> >> >> >> it >> >> >> >> >> >> >> >> >simple and understandable. >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Unclear : >> >> >> >> >> >> >> >> >- What is impact on existing jobs - using >> >> RDD/WriteClient >> >> >> >> API, >> >> >> >> >> >> >> >> DataSource, >> >> >> >> >> >> >> >> >DeltaStreamer level? Do you intend to change >> >> namespacing of >> >> >> >> >> configs? >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >Thanks >> >> >> >> >> >> >> >> >Vinoth >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >On Tue, Dec 10, 2019 at 6:44 AM lamberken < >> >> >> [email protected]> >> >> >> >> >> >> wrote: >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, vino >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Reasonable, we can refactor this step by step. The >> >> first >> >> >> >> step >> >> >> >> >> now >> >> >> >> >> >> >> is to >> >> >> >> >> >> >> >> >> introduce the config framework. >> >> >> >> >> >> >> >> >> When our community is familiar with the config >> >> framework >> >> >> >> >> >> mechanism, >> >> >> >> >> >> >> it's >> >> >> >> >> >> >> >> >> easy to integrate FallbackKey in the future. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> At 2019-12-10 11:51:22, "vino yang" < >> >> >> [email protected]> >> >> >> >> >> >> wrote: >> >> >> >> >> >> >> >> >> >Hi Lamber, >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >Thanks for the proposal. +1 from my side. >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >When it comes to configuration, it will involve >> how >> >> we >> >> >> >> handle >> >> >> >> >> >> >> >> deprecated >> >> >> >> >> >> >> >> >> >configuration items in the future. In my opinion, >> we >> >> >> need >> >> >> >> to >> >> >> >> >> take >> >> >> >> >> >> >> this >> >> >> >> >> >> >> >> >> into >> >> >> >> >> >> >> >> >> >consideration when designing. There are already >> some >> >> >> >> >> successful >> >> >> >> >> >> >> >> practices >> >> >> >> >> >> >> >> >> >for our reference. For example, Flink defines some >> >> >> >> deprecated >> >> >> >> >> >> >> >> >> >configurations as FallbackKey[1]. Maybe we can >> learn >> >> >> from >> >> >> >> >> these >> >> >> >> >> >> >> >> designs. >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >WDYT? >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >[1]: >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/FallbackKey.java >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >Best, >> >> >> >> >> >> >> >> >> >Vino >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >lamberken <[email protected]> 于2019年12月9日周一 >> >> 下午11:19写道: >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Hi, all >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Currently, many configuration items and their >> >> default >> >> >> >> values >> >> >> >> >> >> are >> >> >> >> >> >> >> >> >> dispersed >> >> >> >> >> >> >> >> >> >> in the config file like HoodieWriteConfig. It’s >> >> very >> >> >> >> >> confused >> >> >> >> >> >> for >> >> >> >> >> >> >> >> >> >> developers, and it's easy for developers to use >> >> them >> >> >> in a >> >> >> >> >> wrong >> >> >> >> >> >> >> place >> >> >> >> >> >> >> >> >> >> especially when there are more and more >> >> configuration >> >> >> >> items. >> >> >> >> >> >> If we >> >> >> >> >> >> >> >> can >> >> >> >> >> >> >> >> >> >> solve this, developers will benefit from it and >> the >> >> >> code >> >> >> >> >> >> structure >> >> >> >> >> >> >> >> will >> >> >> >> >> >> >> >> >> be >> >> >> >> >> >> >> >> >> >> more concise. >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> I had create a JIRA[1] and a under discuss >> RFC[2] >> >> to >> >> >> >> explain >> >> >> >> >> >> how >> >> >> >> >> >> >> to >> >> >> >> >> >> >> >> >> solve >> >> >> >> >> >> >> >> >> >> the problem, if you are interested in this, you >> can >> >> >> visit >> >> >> >> >> jira >> >> >> >> >> >> and >> >> >> >> >> >> >> >> RFC >> >> >> >> >> >> >> >> >> for >> >> >> >> >> >> >> >> >> >> detail. Any comments and feedback are welcome, >> >> WDYT? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> Best, >> >> >> >> >> >> >> >> >> >> lamber-ken >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> [1] >> >> >> >> >> >> >> https://issues.apache.org/jira/projects/HUDI/issues/HUDI-375 >> >> >> >> >> >> >> >> >> >> [2] >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> https://cwiki.apache.org/confluence/display/HUDI/RFC-11+%3A+Refactor+of+the+configuration+framework+of+hudi+project >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> > >> >> >> >> >> >> > >> >> >> >> >> >> >-- >> >> >> >> >> >> >Regards, >> >> >> >> >> >> >-Sivabalan >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>
