Re: [DISCUSS] Hudi weekly community update

2020-01-05 Thread vino yang
Hi Leesf, Big +1 from my side. The "weekly" lets the community follow the progress of Hudi. It has proven to be a good practice. Best, Vino OpenInx wrote on Mon, Jan 6, 2020 at 9:52 AM: > +1 > > On Mon, Jan 6, 2020 at 8:06 AM leesf wrote: > > > Hi Nicholas, > > > > Agree with you that more developers would

Re: [DISCUSS] Hudi weekly community update

2020-01-05 Thread OpenInx
+1 On Mon, Jan 6, 2020 at 8:06 AM leesf wrote: > Hi Nicholas, > > Agree with you that more developers would involve in and are highly > welcomed to support "Hudi weekly community update", we could create a slack > channel to gather developers after finishing the discussion. > > Best, > Leesf >

Re: [DISCUSS] Hudi weekly community update

2020-01-05 Thread leesf
Hi Nicholas, Agree with you that more developers would involve in and are highly welcomed to support "Hudi weekly community update", we could create a slack channel to gather developers after finishing the discussion. Best, Leesf 蒋晓峰 wrote on Mon, Jan 6, 2020 at 12:09 AM: > Hi leesf, > I thought that

Re:[DISCUSS] Hudi weekly community update

2020-01-05 Thread 蒋晓峰
Hi leesf, I thought that "Hudi weekly community update" is an excellent idea for the community to know the progress of Hudi development, further attracting more developers to participate in the Hudi community. In my opinion, community could assign some developers to support "Hudi weekly community

Re: [DISCUSS] Is it possible to use DynamoDB as index storage?

2019-12-30 Thread ShaoFeng Shi
Hi Vinoth, Thank you for the information. I will read through the issues in JIRA. Any other user if you see this requirement, welcome to comment. Best regards, Shaofeng Shi 史少锋 Apache Kylin PMC Email: shaofeng...@apache.org Apache Kylin FAQ:

Re: [DISCUSS] Is it possible to use DynamoDB as index storage?

2019-12-30 Thread Vinoth Chandar
Hi, I would imagine we could write a dynamoDB index similar to HBase. No one has attempted it so far though :) You can look at issues tagged with "Index" component and see if there are small ones you could pick up to familiarize and then may be draw up a plan. Thanks Vinoth On Mon, Dec 30,

Re: Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-24 Thread nishith agarwal
Lamber, Thanks for explaining, makes sense. -Nishith On Thu, Dec 19, 2019 at 5:29 PM lamberken wrote: > > Hi nishith, > > Thank you for your affirmation. The content in the blue box is to help us > understand the highlighted content. > It is different from the body content, so we need it.

Re: [DISCUSS] RFC-10 Restructuring and auto-generation of docs

2019-12-20 Thread Y Ethan Guo
@lamber-ken, Thanks for the suggestion and pointers. It makes sense to automatically trigger a CI job to validate and build the docs and publish them afterward. Publishing docs and website content do require PMC permission I believe (yet once most heavy-lifting work is done through scripts,

Re: [DISCUSS] Rework of new web site

2019-12-20 Thread lamberken
Hi Vinoth, Thanks for your affirmation, here is pr https://github.com/apache/incubator-hudi/pull/1120 best, lamber-ken On 12/21/2019 07:59,Vinoth Chandar wrote: Hi lamber, Given we have enough +1s on the look and feel aspects, I propose we open a PR and iron out the content/remaining

Re: Re: Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-20 Thread Vinoth Chandar
Hi lamber, Given we have enough +1s on the look and feel aspects, I propose we open a PR and iron out the content/remaining issues there one by one. I think a full line by line review is the best way to go, as with any code change Please share the PR here once you have it Thanks Vinoth On

Re:Re: Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-20 Thread lamberken
Hi leesf, Thank you for your affirmation. best, lamber-ken At 2019-12-21 07:28:50, "leesf" wrote: Hi lamber, Thanks for your great work, the new website looks much better. Also if you guys have other companies(logos) needed to add to powered by(Hudi Users)[1], please let

Re: Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-20 Thread leesf
Hi lamber, Thanks for your great work, the new website looks much better. Also if you guys have other companies(logos) needed to add to powered by(Hudi Users)[1], please let lamberken/me know before using new website. Best, Leesf [1] https://lamber-ken.github.io/ lamberken, on Fri, Dec 20, 2019

Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-19 Thread nishith agarwal
Great job Lamber! The website looks really slick and has a much better experience of moving from one page to another (mostly I think because it's faster), also find it the text much more conducive to absorb. While going through the quick start, I noticed that under the highlighted box in dark

Re: Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-18 Thread vino yang
Hi Lamber, Awesome! Thanks for your hard work. Best, Vino lamberken wrote on Thu, Dec 19, 2019 at 2:11 PM: > > > Hi everyone, > > > I finished the rework of the new UI, if you have time, please visit the > website[1]. > Any questions are welcome. > > >

Re:Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-18 Thread lamberken
Hi everyone, I finished the rework of the new UI, if you have time, please visit the website[1]. Any questions are welcome. [1]https://lamber-ken.github.io/docs/quick-start-guide/ best, lamber-ken At 2019-12-19 07:38:47, "lamberken" wrote: > > >Hi @Shiyan Xu > > >Thanks. :) >best,

Re: Re: Re: Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-18 Thread Vinoth Chandar
Sounds good.. This scoped down version per se, does not need a RFC. On Wed, Dec 18, 2019 at 3:09 PM lamberken wrote: > > > Hi @Vinoth > > > I understand what you mean, I will continue to work on this when I finish > reworking the new UI. :) > > > best, > lamber-ken > > > > > At 2019-12-18

Re:Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-18 Thread lamberken
Hi @Shiyan Xu Thanks. :) best, lamber-ken At 2019-12-19 00:53:51, "Shiyan Xu" wrote: >Thank you @lamber-ken for the work! It is definitely a greater browsing >experience. > >On Tue, Dec 17, 2019 at 8:28 PM lamberken wrote: > >> >> Hi, @Vinoth >> >> >> >> I'm glad to hear your thoughts on

Re:Re: Re: Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-18 Thread lamberken
Hi @Vinoth I understand what you mean, I will continue to work on this when I finish reworking the new UI. :) best, lamber-ken At 2019-12-18 11:39:30, "Vinoth Chandar" wrote: >Expect most users to use inputDF.write() approach... Uber uses the lower >level RDD apis, like the

Re: Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-18 Thread Shiyan Xu
Thank you @lamber-ken for the work! It is definitely a greater browsing experience. On Tue, Dec 17, 2019 at 8:28 PM lamberken wrote: > > Hi, @Vinoth > > > > I'm glad to hear your thoughts on the new UI, thanks. So we keep its style > as it is now. > The development of new UI can be completed

Re:Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-17 Thread lamberken
Hi, @Vinoth I'm glad to hear your thoughts on the new UI, thanks. So we keep its style as it is now. The development of new UI can be completed these days, any questions are welcome. best, lamber-ken At 2019-12-18 11:44:27, "Vinoth Chandar" wrote: >The case for right navigation for me,

Re: Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-17 Thread Vinoth Chandar
The case for right navigation for me, is mainly from pages like https://lamber-ken.github.io/docs/docker_demo https://lamber-ken.github.io/docs/querying_data https://lamber-ken.github.io/docs/writing_data which often have commands/text you want to selectively copy paste from a section. For

Re: Re: Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-17 Thread Vinoth Chandar
Expect most users to use inputDF.write() approach... Uber uses the lower level RDD apis, like the DeltaStreamer tool does.. If we don't rename configs and still support a builder, it should be fine. I think we can scope this down to introducing a ConfigOption class that ties, the key,value,

Re:Re:Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-17 Thread lamberken
hi, all One more thing that is missing. In the new UI, I put a "BACK TO TOP" button at the bottom of all pages to help us back to top. We can also discuss whether we need the right navigation at the community meeting today. best, lamber-ken At 2019-12-18 08:41:49, "lamberken" wrote: >

Re: Re: Re: [DISCUSS] Rework of new web site

2019-12-17 Thread Vinoth Chandar
One more thing that is missing. Current site has a navigation links on the right, which lets you jump to the right section directly. This is also a must-have IMHO. I would suggest wait for more folks to come back from vacation, before we finalize anything on this, as there could be more feedback

Re:Re: Re: [DISCUSS] Rework of new web site

2019-12-16 Thread lamberken
Hi Vinoth, 1, I'll update the site content this week, clean up some unused template code, adjust the content etc... Syncing the content will take a little while. 2, I will adjust the style as much as I can to keep the theming blue and white. When the above work is completed, I

Re: Re: [DISCUSS] Rework of new web site

2019-12-16 Thread Vinoth Chandar
Hi Lamber, +1 on the look and feel. Definitely feels slick and fast. Love the syntax highlighting. Few things : - Can we just update the site content as-is? ( I'd rather change just the look-and-feel and evolve the content from there, per usual means) - Can we keep the theming blue and white,

Re: Re: [DISCUSS] Scaling community support

2019-12-16 Thread Vinoth Chandar
https://cwiki.apache.org/confluence/display/HUDI/Community+Support Anyone interested in giving the rotation a shot, please add your name next to the a slot. Let's see how this goes in Jan and we can learn On Mon, Dec 16, 2019 at 5:27 AM Vinoth Chandar wrote: > Thanks everyone for the

Re: Re: [DISCUSS] Scaling community support

2019-12-16 Thread Vinoth Chandar
Thanks everyone for the suggestions. Always great to see a healthy constructive discussion! Overall it seems like - At-least 3-4 are willing to give the support rotation a shot. As concrete followup, will document the support responsibilities clearly in a wiki and share it again here. To be

Re: [DISCUSS] Rework of new web site

2019-12-15 Thread leesf
Hi Lamber, Thanks for your work, have gone through the new web ui, looks good. Hence +1 from my side. Best, Leesf vino yang wrote on Mon, Dec 16, 2019 at 10:17 AM: > Hi Lamber, > > I am not an expert on Jekyll. But big +1 for your proposal to improve the > site. > > Best, > Vino > > Vinoth Chandar

Re: [DISCUSS] Rework of new web site

2019-12-15 Thread vino yang
Hi Lamber, I am not an expert on Jekyll. But big +1 for your proposal to improve the site. Best, Vino Vinoth Chandar wrote on Mon, Dec 16, 2019 at 3:15 AM: > Thanks for taking the time to improve the site. Will review closely and get > back to you. > > On Sun, Dec 15, 2019 at 11:02 AM lamberken wrote: > >

Re: [DISCUSS] Rework of new web site

2019-12-15 Thread Vinoth Chandar
Thanks for taking the time to improve the site. Will review closely and get back to you. On Sun, Dec 15, 2019 at 11:02 AM lamberken wrote: > > > Hello, everyone. > > > The web sites of Delta Lake[1] and Apache Iceberg[2] may look better > than that of the hudi project[3]. > > > I delved

Re:Re: Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-13 Thread lamberken
Hi, @vinoth Okay, I see. If we don't want existing users to do any upgrading or reconfigurations, then this refactor work will not make much sense. This issue can be closed, because ConfigOptions and these builders do the same things. From another side, if we finish this work before a stable

Re: Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-13 Thread Vinoth Chandar
Hi, Are you saying these classes needs to change? If so, understood. But are you planning on renaming configs or relocating them? We dont want existing users to do any upgrading or reconfigurations On Fri, Dec 13, 2019 at 10:28 AM lamberken wrote: > > > Hi, > > > They need to change due to

Re:Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-13 Thread lamberken
Hi, They need to change due to this, because only HoodieWriteConfig and *Options will be kept. best, lamber-ken At 2019-12-14 01:23:35, "Vinoth Chandar" wrote: >Hi, > >We are trying to understand if existing jobs (datasource, deltastreamer, >anything else) needs to change due to this. >

Re: [DISCUSS] Scaling community support

2019-12-13 Thread Vinoth Chandar
Thanks everyone for chiming in! Any more thoughts, before I try and summarize? On Tue, Dec 10, 2019 at 9:33 PM Y Ethan Guo wrote: > Here are my two cents in addition to the great suggestions in the thread: > > I agree with @Sivabalan that folks in Hudi community have different levels > of

Re: Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-13 Thread Vinoth Chandar
Hi, We are trying to understand if existing jobs (datasource, deltastreamer, anything else) needs to change due to this. On Wed, Dec 11, 2019 at 7:18 PM lamberken wrote: > > > Hi, @vinoth > > > 1, Hoodie*Config classes are only used to set default value when call > their build method

Re: [DISCUSS] RFC-12 : Efficient migration of large parquet tables to Apache Hudi

2019-12-13 Thread Vinoth Chandar
+1 (per asf policy) +100 per my own excitement :) .. Happy to review this! On Fri, Dec 13, 2019 at 3:07 AM Balaji Varadarajan wrote: > With Apache Hudi growing in popularity, one of the fundamental challenges > for users has been about efficiently migrating their historical datasets to >

Re: [DISCUSS] Default partition path in TimestampBasedKeyGenerator

2019-12-13 Thread Pratyaksh Sharma
Sure Balaji, https://jira.apache.org/jira/browse/HUDI-406 tracks this. On Fri, Dec 13, 2019 at 4:43 PM Balaji Varadarajan wrote: > Thanks Shahidha for the quick response. > > Pratyaksh, I am ok with making the behavior consistent with other Key > generators. Please go ahead and submit a PR. >

Re: [DISCUSS] Default partition path in TimestampBasedKeyGenerator

2019-12-13 Thread Balaji Varadarajan
Thanks Shahidha for the quick response. Pratyaksh, I am ok with making the behavior consistent with other Key generators. Please go ahead and submit a PR. Thanks, Balaji.V On Thu, Dec 12, 2019 at 10:34 PM Pratyaksh Sharma wrote: > Hi Shahida, > > Thank you for the clarification. Actually I

Re: [DISCUSS] Default partition path in TimestampBasedKeyGenerator

2019-12-12 Thread Pratyaksh Sharma
Hi Shahida, Thank you for the clarification. Actually I was thinking about a corner case where we define the partition field and in some incoming record, the value for the corresponding defined partition field is not present. Such cases would result in exception and job will get killed. On Fri,

Re: [DISCUSS] Default partition path in TimestampBasedKeyGenerator

2019-12-12 Thread Shahida Khan
Hi Pratyaksh, As far as I understand, basic requirement of TimestampBasedKeyGenerator is converting the partitions into timebased dateformat. *e.g.* your columns is in Unix Timestamp which need to convert to dateformat like '2019/12/10' There will never be scenario where you won't give
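
A minimal sketch of the conversion and the corner case discussed in this thread, written in Python for brevity (Hudi's TimestampBasedKeyGenerator itself is Java, and this is not its implementation). The function name `partition_path`, the field name `ts`, the UTC time zone, and the fallback value `default` are all assumptions made for illustration; they are not taken from the thread.

```python
from datetime import datetime, timezone

# Hypothetical fallback used when the partition field is absent.
# The thread only says the behavior should match other key generators;
# the concrete value here is an assumption.
DEFAULT_PARTITION_PATH = "default"


def partition_path(record, field="ts", fmt="%Y/%m/%d"):
    """Turn a Unix timestamp (seconds) stored in `field` into a date-based path.

    Returns DEFAULT_PARTITION_PATH instead of raising when the field is
    missing, which is the corner case raised in the thread.
    """
    value = record.get(field)
    if value is None:
        return DEFAULT_PARTITION_PATH
    # Interpret the value as seconds since the epoch, in UTC (assumption).
    return datetime.fromtimestamp(int(value), tz=timezone.utc).strftime(fmt)


if __name__ == "__main__":
    print(partition_path({"ts": 1575936000}))  # a record with a timestamp
    print(partition_path({"id": 42}))          # a record without the field
```

Without the fallback branch, a record lacking the partition field would raise and, as described in the thread, kill the job.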

Re: [DISCUSS] Next Apache Release

2019-12-12 Thread Vinoth Chandar
@pratyaksh : yes that would be a cool feature to do as well. :). Lets get it in @all Please think about anything else, you want to get in to the next release and tag them with fixVersion=0.5.1., so it shows up. On Thu, Dec 12, 2019 at 5:23 AM vino yang wrote: > +1 for leesf to be the release

Re: [DISCUSS] Next Apache Release

2019-12-12 Thread vino yang
+1 for leesf to be the release manager for 0.5.1. Best, Vino Balaji Varadarajan wrote on Thu, Dec 12, 2019 at 2:38 PM: > + 1 from me as well for having @leesf be the release manager for 0.5.1. > @leesf - Appreciate your spirit in helping Hudi community. > Balaji.V On Wednesday, December 11, 2019,

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread Pratyaksh Sharma
Hi Vinoth, We are targeting HUDI-288 also as part of 0.5.1 release. I will change the fix version of that jira as well. Right now, it is not included in the list you shared above. On Thu, Dec 12, 2019 at 8:22 AM Vinoth Chandar wrote: > +1 for

Re:Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread lamberken
Hi, @vinoth 1, Hoodie*Config classes are only used to set default value when call their build method currently. They will be replaced by HoodieMemoryOptions, HoodieIndexOptions, HoodieHBaseIndexOptions, etc... 2, I don't understand the question "It is not clear to me whether there is any

Re: Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Vinoth Chandar
I actually prefer the builder pattern for making the configs, because I can do `builder.` in the IDE and actually see all the options... That said, most developers program against the Spark datasource and so this may not be useful, unless we expose a builder for that.. I will concede that since

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread Vinoth Chandar
+1 for leesf, driving the release.. From http://www.apache.org/dev/release-publishing.html#release_manager, it does explicitly confirm that any committer can be RM. I am happy to volunteer my services to assist leesf in the process. @all : Please speak up if you have concerns with the

Re: [DISCUSS] Next Apache Release

2019-12-11 Thread leesf
Hi Balaji, Thanks for kicking the discussion off. +1 to release next version as we made many improvements since last released version and Jan is reasonable considering the upcoming holidays. Besides I am wondering if I can be the release manager of 0.5.1 to work with you. It is always

Re: Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Sivabalan
Let me summarize your initial proposal and then will get into details. - Introduce ConfigOptions for ease of handling of default values. - Remove all Hoodie*Config classes and just have HoodieWriteConfig. What this means is that, every other config file will be replaced by ConfigOptions. eg,
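
To illustrate the first bullet of the summary above (a ConfigOption that carries its own default), here is a rough sketch in Python. It is not the code from the sample PR, and Hudi itself is Java; the class shape, the `resolve` method, and the key `example.index.type` are hypothetical names chosen only to show the idea of tying a key to its default value.

```python
from dataclasses import dataclass
from typing import Generic, Mapping, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ConfigOption(Generic[T]):
    """Hypothetical option object: one key bundled with its default value."""

    key: str
    default: T
    description: str = ""

    def resolve(self, props: Mapping[str, object]) -> T:
        # Use the user-supplied value when present, else fall back to the default.
        return props.get(self.key, self.default)


# Illustrative option; the key and default are placeholders, not real Hudi configs.
INDEX_TYPE = ConfigOption("example.index.type", "BLOOM", "Which index to use")

if __name__ == "__main__":
    print(INDEX_TYPE.resolve({}))
    print(INDEX_TYPE.resolve({"example.index.type": "HBASE"}))
```

The point of the design is that the default lives next to the key, so callers do not need a separate builder just to pick up default values.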

Re:Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread lamberken
Hi, On 1,2. Yes, you are right, moving the getter to the component level Config class itself. On 3, HoodieWriteConfig can also set value through ConfigOption, small code snippets. From the below snippets, we can see that clients need to know each component's builders and also call

Re: Re: [DISCUSS] Refactor of the configuration framework of hudi project

2019-12-11 Thread Vinoth Chandar
Hi Lamber-ken, I looked at the sample PR you put up as well. On 1,2 => Seems your intent is to replace these with moving the getter to the component level Config class itself? I am fine with that (although I think its not that big of a hurdle really to use atm). But, once we do that we could

Re: [DISCUSS] Scaling community support

2019-12-10 Thread Y Ethan Guo
Here are my two cents in addition to the great suggestions in the thread: I agree with @Sivabalan that folks in Hudi community have different levels of expertise and amount of effort to put in the community. So in general, it may be good to have PoCs or contributors for each area in Hudi, e.g.,

Re: [DISCUSS] Scaling community support

2019-12-09 Thread Kabeer Ahmed
Hi Vinoth Regarding [1], I recommend that people who cannot triage or find a owner also chip in. i.e. make this open ended so that anyone from community could answer it. But if the thread reaches a dead end or doesnt progress, then such threads should be picked up by someone more experienced

Re:Re: Re:[DISCUSS] Scaling community support

2019-12-08 Thread lamberken
Okay, thanks for reminding me, I'll see earlier discuss thread. At 2019-12-09 14:09:56, "Vinoth Chandar" wrote: Please see an earlier discuss thread on the same topic - GH issues. Lets please keep this thread to discuss support process, not logistics, if I may say so :) On Sun, Dec 8,

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-12-02 Thread Vinoth Chandar
Hi Nicholas, Sorry for the late reply. Thanksgiving :) >>Now Hudi's design, in order to highlight its core components, is a patchwork of the Spark RDD API mixed with business logic scattered in multiple modules and various types of methods. Agree that the hudi-client module needs to be

Re: [Discuss] Convenient time for weekly sync meeting

2019-11-25 Thread Vinoth Chandar
Look forward as well! On Sun, Nov 24, 2019 at 1:15 PM Kabeer Ahmed wrote: > Vinoth - my friend. Apologies for the delayed response. > > I will wake up at 5am and join the call. It is no worry for me. For 1 > person, having a second slot doesnt make sense at all. Plus I would love to > be on the

Re: [Discuss] Migrate from log4j to slf4j

2019-11-25 Thread Vinoth Chandar
Hi, Its log4j actually across the board. (I think there are a couple files that have non log4j loggers? might be good to fix to log4j as well for now to be consistent) Nonetheless, there is a JIRA for this already https://issues.apache.org/jira/browse/HUDI-233 Main thing we need to be mindful

Re: [Discuss] Convenient time for weekly sync meeting

2019-11-24 Thread Kabeer Ahmed
Vinoth - my friend. Apologies for the delayed response. I will wake up at 5am and join the call. It is no worry for me. For 1 person, having a second slot doesnt make sense at all. Plus I would love to be on the call with max attendance. Whole heartedly appreciate the kind gesture. Please be

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-22 Thread Gurudatt Kulkarni
I will take this up. On Thu, Nov 21, 2019 at 8:05 AM vino yang wrote: > +1, we can try to use this feature. > > Best, > Vino > > Vinoth Chandar wrote on Thu, Nov 21, 2019 at 8:28 AM: > > Github actions seems to be able to do this easily as well > > > > >

Re: Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-21 Thread vino yang
Hi lamberken, Reasonable. Will change the description of jira issue soon. Best, Vino lamberken wrote on Thu, Nov 21, 2019 at 4:22 PM: > Hi, vino. I think we can set severity to info level, if so, users can check > style themselves. > > > > > > > At 2019-11-21 15:43:34, "vino yang" wrote: > >Hi all, > > > >The

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-20 Thread vino yang
Hi all, The umbrella issue[1] has been created, please feel free to join us to improve the comment and code quality. Best, Vino [1]: https://issues.apache.org/jira/browse/HUDI-354 vino yang wrote on Wed, Nov 20, 2019 at 7:33 PM: > > Hi guys, > > Since

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-20 Thread vino yang
+1, we can try to use this feature. Best, Vino Vinoth Chandar wrote on Thu, Nov 21, 2019 at 8:28 AM: > Github actions seems to be able to do this easily as well > > https://github.com/actions/starter-workflows/blob/master/automation/stale.yml > > On Mon, Nov 18, 2019 at 10:33 PM Gurudatt Kulkarni > wrote:

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-20 Thread Vinoth Chandar
Github actions seems to be able to do this easily as well https://github.com/actions/starter-workflows/blob/master/automation/stale.yml On Mon, Nov 18, 2019 at 10:33 PM Gurudatt Kulkarni wrote: > > With templates, we can collect good information while people file the > > issues..Not sure about

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-20 Thread vino yang
Hi guys, Since there is no objection. I will create an umbrella issue to track this work. The plan is: 1) Given relevant check style rules to find all the illegal points; 2) We will refactor modules one by one, each module mappings to one subtask; 3) Add global check style rule for the whole

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-19 Thread Y Ethan Guo
+1 on all of the proposed rules. These will also make the javadoc more readable. On Mon, Nov 18, 2019 at 5:55 PM Vinoth Chandar wrote: > +1 on all three. > > Would there be a overhaul of existing code to add comments to all classes? > We are pretty reasonable already, but good to get this in

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-18 Thread Gurudatt Kulkarni
> With templates, we can collect good information while people file the > issues..Not sure about permissions we have on JIRA to enable bots, but may > have more luck on github workflows doing these already? > Can we do templates/required fields with JIRAs as well? Yes, it is very much possible

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-18 Thread vino yang
Hi Gurudatt and Vinoth, Thanks for sharing your valuable opinion. Considering Hudi is still a growing project. I agree that it's better to keep Github's Issues tab as a way to discuss problems currently. +1 to introduce issue template and management bot. Best, Vino Vinoth Chandar

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-18 Thread Vinoth Chandar
+1 on all three. Would there be a overhaul of existing code to add comments to all classes? We are pretty reasonable already, but good to get this in shape. 17:54:37 [incubator-hudi]$ grep -R -B 1 "public class" hudi-*/src/main/java | grep "public class" | wc -l 274 17:54:50

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-18 Thread lamberken
+1, it’s hard work but meaningful. lamberken, IT, ly.com, lamber...@163.com On 11/19/2019 07:27, leesf wrote: Hi vino, Thanks for bringing this discussion up. +1 on all. the third one seems a bit too strict and usually requires manual processing of the import order, but I

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-18 Thread leesf
Hi vino, Thanks for bringing this discussion up. +1 on all. the third one seems a bit too strict and usually requires manual processing of the import order, but I also agree and think it makes our project more professional. And I learned that the calcite community is also applying this rule.

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-18 Thread Gurudatt Kulkarni
Hi Vinoth / Vino, Just adding my 2 cents to the discussion. Yes, I agree that GitHub issues are low friction and can be the first line of support. It will help in keeping the JIRA clean. Potential solutions that I have come across in the community, 1. Introduce an issue template. 2. Add a bot

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-18 Thread Vinoth Chandar
@vinoyang. All valid points. I just have 1 argument (all others you are right and I have always known this tradeoff) for keeping Github issues, when we are still growing the community and that is : it lets anyone with a github id raise an issue without forcing to sign up for JIRA account. For

Re: [DISCUSS] Introduce stricter comment and code style validation rules

2019-11-18 Thread Pratyaksh Sharma
Having proper class level and method level comments always makes the life easier for any new user. +1 for points 1,2 and 4. On Mon, Nov 18, 2019 at 5:59 PM vino yang wrote: > Hi guys, > > Currently, Hudi's comment and code styles do not have a uniform > specification on certain rules. I will

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-15 Thread vino yang
Hi all, I am not a whimsy, a lot of Apache projects are doing this. Not just Flink, the project list is very long, including Spark, Kafka, Kylin, Calcite, Hadoop, storm... It's no accident that so many projects do this. As the project grows rapidly, we will find that two ways that report issues

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-15 Thread leesf
Hi vino, Thanks for bringing up the discussion. IMHO, the issues and jira are not opposite and we could use them both for their advantages. Such as for some simple questions which is no need to open a jira or send a mail [1], users could get quick response from others via issues and then close

Re: [DISCUSS] RFC-10: Restructuring and auto-generation of docs

2019-11-15 Thread Y Ethan Guo
On Fri, Nov 15, 2019 at 5:33 AM Vinoth Chandar wrote: > Sorry. bit late here to this party .. > > +1 on having the .md files alone on master along with code. Will comment on > the RFC itself. > > Right now, we have a separate branch which may be sufficient IMO. Separate > repo means, also

Re: [DISCUSS] RFC-10: Restructuring and auto-generation of docs

2019-11-15 Thread Vinoth Chandar
Sorry. bit late here to this party .. +1 on having the .md files alone on master along with code. Will comment on the RFC itself. Right now, we have a separate branch which may be sufficient IMO. Separate repo means, also separate access control/management. ? We should do a better job. But we

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-15 Thread Vinoth Chandar
Hi Vino, To echo what Nishith was saying, issues is only being used currently for support i.e looking at stack traces for failures, user errors. Any real work resulting from that always gets a JIRA. I mulled the same thing - disabling issues - a while back. The value I see it adding is - if you

Re: [DISCUSS] Hide Github issues tab and Unified management of issues in JIRA

2019-11-15 Thread Nishith
Hey Vino, Earlier this year, we actually migrated all issues from GitHub to Jira and that’s the recommended route to discuss issues (besides the mailing thread) The remaining issues are either new (folks might open an issue regardless) and we help navigate those folks to open JIRA’s or there

Re: [DISCUSS] RFC-10: Restructuring and auto-generation of docs

2019-11-14 Thread Y Ethan Guo
Hey Gurudatt, Thanks for the great feedback! Comments inlined... On Wed, Nov 13, 2019 at 10:57 PM Gurudatt Kulkarni wrote: > Hi Ethan, > > Thanks for the RFC. I have a few observations about the docs. I saw the > diagram for the docs, asf-site and master are kept as separate branches, >

Re: DISCUSS RFC 6 - Add indexing support to the log file

2019-11-14 Thread Vinoth Chandar
Since attachments don't really work on the mailing list, Can you may be attach them to comments on the RFC itself? In this scenario, we will get a larger range than is probably in the newly compacted base file, correct? Current thinking is, yes it will lead to less efficient pruning by ranges,

Re: DISCUSS RFC 6 - Add indexing support to the log file

2019-11-14 Thread Sivabalan
I have a doubt about the design. I guess this is the right place to discuss. I want to understand how compaction interplays with this new scheme. Let's assume all log blocks are of the new format only. Once compaction completes, those log blocks/files not compacted will have range info pertaining to

Re: [DISCUSS] RFC-10: Restructuring and auto-generation of docs

2019-11-13 Thread Gurudatt Kulkarni
Hi Ethan, Thanks for the RFC. I have a few observations about the docs. I saw the diagram for the docs, asf-site and master are kept as separate branches, currently. (1) I suggest two approaches for maintaining the docs looking at how other popular Apache projects do, - Apache Flink

Re: [DISCUSS] Simplification of terminologies

2019-11-13 Thread Vinoth Chandar
Will review the POC in cwiki. +1 Based on this feedback, I will proceed with the changes. Thanks all! On Tue, Nov 12, 2019 at 10:47 PM Semantic Beeng wrote: > @vc, I think of it as elaborating the #ubiquitouslanguage in DDD. > See private email with references to a small POC in wiki and

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread Y Ethan Guo
Thanks for the feedback! I'll create another thread for the discussion of the RFC. On Wed, Nov 13, 2019 at 11:00 AM Raymond Xu wrote: > > > > (1) docs is as important as code and something developers have to deal > with > > all the time. so good to get broad feedback on this. > > > Agree. I

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread Raymond Xu
> > (1) docs is as important as code and something developers have to deal with > all the time. so good to get broad feedback on this. Agree. I always favor well-organized documentation. (2) We actually expanded RFCs to include even "ideas", providing way for > someone to write down thoughts

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread vbal...@apache.org
+1 on the initiative. I have given cwiki access to both Ethan and Vinoth Balaji.V On Wednesday, November 13, 2019, 09:14:16 AM PST, Vinoth Chandar wrote: Thanks for initiating this, Ethan. Will send detailed comments in a while. @raymond, I actually think this deserves an RFC for two

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread Vinoth Chandar
Thanks for initiating this, Ethan. Will send detailed comments in a while. @raymond, I actually think this deserves an RFC for two reasons. (1) docs is as important as code and something developers have to deal with all the time. so good to get broad feedback on this. (2) We actually expanded

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread Raymond Xu
Hi Ethan, I like the idea and I'm all for it. Actually this is one of the roadmap items under "Ease of Use" for 0.5.1. My only concern is: does this fit into an RFC? I believe an RFC is about adding a new feature to the framework, while having better docs falls within the dev experience realm IMO.

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread leesf
+1. It is very practical and thanks for driving the discussion. Vinoth and Balaji would give you cwiki permission. Best, Leesf vino yang wrote on Wed, Nov 13, 2019 at 5:02 PM: > Hi Ethan, > > Thanks for starting this discussion thread. > +1 from my side > > Best, > Vino > > Y Ethan Guo on Wed, Nov 13, 2019

Re: [DISCUSS] Intent to RFC: Restructuring and auto-generation of docs

2019-11-13 Thread vino yang
Hi Ethan, Thanks for starting this discussion thread. +1 from my side Best, Vino Y Ethan Guo wrote on Wed, Nov 13, 2019 at 4:05 PM: > Hey Folks, > > I plan to start an RFC in the Docs Overhaul track. The scope of this RFC > will be the restructuring and auto-generation of docs, with the following > goals:

Re: [DISCUSS] Simplification of terminologies

2019-11-12 Thread Vinoth Chandar
Thanks everyone for the feedback. Looks like we are in general agreement. I am inclined to just do 1 & 2 and leave COPY_ON_WRITE as is based on great points Ethan and Shiyan raised. Makes sense. Will wait for 1-2 days still to close this thread. @semanticbeeing That's a great idea. Is it more

Re: DISCUSS RFC 7 - Point in time queries on Hudi table (Time-Travel)

2019-11-12 Thread Vinoth Chandar
+1 Will review the RFC On Mon, Nov 11, 2019 at 11:36 PM Balaji Varadarajan wrote: > +1. This would be a powerful feature which would open up use-cases > requiring repeatable query results. > > Balaji.V > > > On Mon, Nov 11, 2019 at 8:12 AM nishith agarwal > wrote: > > > Folks, > > > >

Re: [DISCUSS] New RFC? Hudi dataset snapshotter

2019-11-12 Thread Shiyan Xu
Came up with the first draft. Thank you. https://cwiki.apache.org/confluence/display/HUDI/RFC-9%3A+%28WIP%29+Hudi+Dataset+Snapshotter On Tue, Nov 12, 2019 at 12:44 PM Shiyan Xu wrote: > Thank you all for the +1s! I'll go ahead and add an RFC page then. > > On Tue, Nov 12, 2019 at 8:41 AM nishith

Re: [DISCUSS] Simplification of terminologies

2019-11-12 Thread Y. Ethan Guo
+1 on [1] and [2]. For [3], I have similar doubts as Shiyan. For the naming, I can understand the original intent of the analogy for COW which is to make another "copy" of columnar/parquet file upon the modification/update to the records in the file. From the system design point of view, it's

Re: [DISCUSS] New RFC? Hudi dataset snapshotter

2019-11-12 Thread Shiyan Xu
Thank you all for the +1s! I'll go ahead and add an RFC page then. On Tue, Nov 12, 2019 at 8:41 AM nishith agarwal wrote: > +1 on the exporter tool idea. > > -Nishith > > On Tue, Nov 12, 2019 at 5:06 AM leesf wrote: > > > +1. and we would discuss it further when design docs are available. > > > >

Re: [Discuss] Convenient time for weekly sync meeting

2019-11-12 Thread Bhavani Sudha Saktheeswaran
@Sivabalan Yes. Tuesday 9 - 10 pm PST still continues to be the 1st slot. On Tue, Nov 12, 2019 at 10:33 AM Sivabalan wrote: > As we work out details for 2nd slot, did we narrow down the slot for 1st > one? Do we have a meeting later today? > > On Mon, Nov 11, 2019 at 3:49 PM Vinoth Chandar

Re: [Discuss] Convenient time for weekly sync meeting

2019-11-12 Thread Sivabalan
As we work out details for 2nd slot, did we narrow down the slot for 1st one? Do we have a meeting later today? On Mon, Nov 11, 2019 at 3:49 PM Vinoth Chandar wrote: > yes. sounds good. As of now, its just Kabeer.@kabeer wdyt? > @nishith Personally, timing is an issue for me, if you are willing

Re: [DISCUSS] Simplification of terminologies

2019-11-12 Thread nishith agarwal
+1 on the first two, don't feel strongly about (3). Thanks, Nishith On Tue, Nov 12, 2019 at 5:03 AM leesf wrote: > [1] +1. `views` indeed confused me a lot. > [2] +1. `snapshot` is more reasonable. > [3] I don't feel very strong to rename it, the current name `COPY_ON_WRITE` > is reasonable

Re: [DISCUSS] New RFC? Hudi dataset snapshotter

2019-11-12 Thread nishith agarwal
+1 on the exporter tool idea. -Nishith On Tue, Nov 12, 2019 at 5:06 AM leesf wrote: > +1. and we would discuss it further when design docs are available. > > Best, > Leesf > > Balaji Varadarajan wrote on Tue, Nov 12, 2019 at 4:17 PM: > > > +1 on the exporter tool idea. > > > > On Mon, Nov 11, 2019 at 10:36
