Re: [DISCUSS] Release 0.6.0 timelines

2020-08-24 Thread vino yang
Glad to know that 0.6.0 has been released. Thanks to Sudha for all the efforts. And thanks to everyone. Best, Vino On Tue, Aug 25, 2020 at 12:00 PM Vinoth Chandar wrote: > Hi folks, > > As you may have noticed, the 0.6.0 release is out. Huge shoutout to > our RM, Sudha for pulling this off! > > As always,

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-24 Thread Vinoth Chandar
Hi folks, As you may have noticed, the 0.6.0 release is out. Huge shoutout to our RM, Sudha for pulling this off! As always, thanks to all our users/contributors. Congrats everyone! Onwards and upwards to the next one. Thanks Vinoth On Thu, Aug 20, 2020 at 11:32 AM Vinoth Chandar wrote:

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread Shiyan Xu
It can be up to the individual to use the IDE formatter or not, as long as there is a tool to help enforce Checkstyle rules. For people who use the IDE formatter, importing Checkstyle.xml as a format scheme does not fully control the formatter's behavior, which is why the IDE sometimes gets in the way. But

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-22 Thread Sivabalan
Aah, yes. That’s right. On Sat, Aug 22, 2020 at 2:43 AM Vinoth Chandar wrote: > All of the remaining meta fields compress very very nicely. They have > > almost no overhead. > > > > On Fri, Aug 21, 2020 at 12:00 PM Abhishek Modi > > wrote: > > > > > @sivabalan the current plan is to only add

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread vino yang
Hi vc, Yes, different developers may have different preferences on this part of the practice. I have never enabled the IDE's automatic formatting, nor have I invoked the IDE's formatting functions manually. Because I have participated in multiple open source communities, each open source

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-22 Thread Vinoth Chandar
All of the remaining meta fields compress very very nicely. They have almost no overhead. On Fri, Aug 21, 2020 at 12:00 PM Abhishek Modi wrote: > @sivabalan the current plan is to only add this for hoodie_record_key. But > I'm hoping to make the implementation general enough to add other

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-22 Thread Vinoth Chandar
>But, IMO, we can ignore the IDE here, if it breaks the code style, checkstyle will stop building and spotless will work. I differ here slightly. Most people reformat code using the "format code" in the IDE. And IDEs also can reorganize the code when you save etc. We need a solid way to not be

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-21 Thread Nishith
+1 for spotless, automating the formatting will definitely help productivity and turn around time for PRs. -Nishith Sent from my iPhone > On Aug 21, 2020, at 11:53 AM, Sivabalan wrote: > > totally +1 for spotless. > > >> On Thu, Aug 20, 2020 at 8:53 AM leesf wrote: >> >> +1 on using mvn

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-21 Thread Abhishek Modi
@sivabalan the current plan is to only add this for hoodie_record_key. But I'm hoping to make the implementation general enough to add other columns as well going forward :) On Fri, Aug 21, 2020 at 11:49 AM Sivabalan wrote: > +1 for virtual record keys. Do you also propose to generalize this

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-21 Thread Sivabalan
totally +1 for spotless. On Thu, Aug 20, 2020 at 8:53 AM leesf wrote: > +1 on using mvn spotless:apply to fix the codestyle. > > On Wed, Aug 19, 2020 at 12:31 PM Bhavani Sudha wrote: > > > +1 on auto code formatting. I also think it should be okay to be even > more > > restrictive by failing builds when the

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-21 Thread Sivabalan
+1 for virtual record keys. Do you also propose to generalize this for partition path as well ? On Fri, Aug 21, 2020 at 4:20 AM Pratyaksh Sharma wrote: > This is a good option to have. :) > > On Thu, Aug 20, 2020 at 11:25 PM Vinoth Chandar wrote: > > > IIRC _hoodie_record_key was supposed to

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-21 Thread Pratyaksh Sharma
This is a good option to have. :) On Thu, Aug 20, 2020 at 11:25 PM Vinoth Chandar wrote: > IIRC _hoodie_record_key was supposed to be this standardized key field. :) > Anyways, it's good to provide this option to the user. > So +1 for RFC/further discussion. > > To level set, I want to also share

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-20 Thread Vinoth Chandar
RC-1 is out! Let's all test and vote! On Wed, Aug 19, 2020 at 8:49 AM Pratyaksh Sharma wrote: > Hi Allen, > > Yes, it's a serialization (runtime) issue. I am working on fixing it. > > On Wed, Aug 19, 2020 at 7:04 PM Sivabalan wrote: > > > That would be of great help Allen. Much appreciated. >

Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Vinoth Chandar
I would like all these new things to be revamped on top of Spark 3's newer APIs (it's kind of frustrating that the datasource APIs don't stabilize easily in Spark). I am thinking we can implement a "hudi3" format using Spark 3, with support for SQL Merges, existing functionality and a redone Spark

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-20 Thread Vinoth Chandar
IIRC _hoodie_record_key was supposed to be this standardized key field. :) Anyways, it's good to provide this option to the user. So +1 for RFC/further discussion. To level set, I want to also share some of the benefits of having an explicit key column. a) if you build your data lake using a bunch

Re: [DISCUSS] Support Spark Structured Streaming read from Hudi table

2020-08-20 Thread Balaji Varadarajan
Hi linshan, Sorry for the delay in responding. It is better to discuss code changes over a draft PR. Can you open one and tag us there? At a high level, it looks like you are using Spark Datasource V2 APIs while currently the structured streaming write is implemented using the V1 API. Let's discuss

Re: [DISCUSS] Support for `_hoodie_record_key` as a virtual column

2020-08-20 Thread Balaji Varadarajan
+1. This should be good to have as an option. If everybody agrees, please go ahead with RFC and we can discuss details there. Balaji.V On Tuesday, August 18, 2020, 04:37:18 PM PDT, Abhishek Modi wrote: Hi everyone! I was hoping to discuss adding support for making `_hoodie_record_key`

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-20 Thread leesf
+1 on using mvn spotless:apply to fix the codestyle. On Wed, Aug 19, 2020 at 12:31 PM Bhavani Sudha wrote: > +1 on auto code formatting. I also think it should be okay to be even more > restrictive by failing builds when the code format is not adhered (in any > environment). That way everyone is forced to use

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-19 Thread Pratyaksh Sharma
Hi Allen, Yes, it's a serialization (runtime) issue. I am working on fixing it. On Wed, Aug 19, 2020 at 7:04 PM Sivabalan wrote: > That would be of great help Allen. Much appreciated. > > On Wed, Aug 19, 2020 at 9:30 AM Allen Underwood > wrote: > > > Thanks Sivabalan, > > > > That's

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-19 Thread Sivabalan
That would be of great help Allen. Much appreciated. On Wed, Aug 19, 2020 at 9:30 AM Allen Underwood wrote: > Thanks Sivabalan, > > That's definitely the issue I had to resolve when introducing Joda time. > I'll have a look back at my code to see how I got around it. If I remember > correctly,

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-19 Thread Sivabalan
I am not sure if all findings so far have been documented here, but this is the ticket AFAIK: https://issues.apache.org/jira/browse/HUDI-1177 On Wed, Aug 19, 2020 at 9:15 AM Allen Underwood wrote: > Just out of curiosity - what's the blocker - you have an issue? I had > originally done the

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-19 Thread Allen Underwood
Just out of curiosity - what's the blocker - you have an issue? I had originally done the code to make that work. On Wed, Aug 19, 2020 at 12:51 AM Vinoth Chandar wrote: > We still have 1 blocker issue from TimestampKeyGenerator issue with joda > DateTimeFormatter. Sudha (RM) and Pratyaksh are

Re: Re: [DISCUSS] Release 0.6.0 timelines

2020-08-19 Thread 957029...@qq.com
Hi, vc I also want to do some tests against the release branch. 957029...@qq.com From: Vinoth Chandar Date: 2020-08-19 12:51 To: dev Subject: Re: [DISCUSS] Release 0.6.0 timelines We still have 1 blocker issue from TimestampKeyGenerator issue with joda DateTimeFormatter. Sudha (RM

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-18 Thread Vinoth Chandar
We still have 1 blocker issue from TimestampKeyGenerator issue with joda DateTimeFormatter. Sudha (RM) and Pratyaksh are going to look into this. In the meantime, here's the progress, plans around testing so far. If folks in the community can help test the release branch in the next couple of

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-18 Thread vino yang
> the key challenge has been keeping checkstyle, IDE and spotless agreeing on the same thing. Yes, it's the key thing. But, IMO, we can ignore the IDE here, if it breaks the code style, checkstyle will stop building and spotless will work. On Wed, Aug 19, 2020 at 7:49 AM Vinoth Chandar wrote: > the key

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-18 Thread Vinoth Chandar
The key challenge has been keeping checkstyle, IDE and spotless agreeing on the same thing. Your understanding is correct. CI will enforce in a similar fashion. Spotless just makes us productive by auto-fixing all the checkstyle violations, without having to fix them by hand. On Tue, Aug 18,

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-18 Thread vbal...@apache.org
+1 on standardizing code formatting. On Tuesday, August 18, 2020, 03:58:42 PM PDT, Vinoth Chandar wrote: can more people please chime in?  This will affect all of us on a daily basis :) On Thu, Aug 13, 2020 at 8:25 AM Gary Li wrote: > Vote for mvn spotless:apply to do the auto fix.

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-18 Thread Vinoth Chandar
can more people please chime in? This will affect all of us on a daily basis :) On Thu, Aug 13, 2020 at 8:25 AM Gary Li wrote: > Vote for mvn spotless:apply to do the auto fix. > > On Thu, Aug 13, 2020 at 1:13 AM Vinoth Chandar wrote: > > > Hi, > > > > Anyone has thoughts on this? > > > > esp

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-18 Thread Bhavani Sudha
Quick update on the RC. Found a build issue when building scala 2.12 and sent a PR for that - https://github.com/apache/hudi/pull/1976 . Working on resolving this in the release branch and updating RC. Will update soon. Thanks, Sudha On Fri, Aug 14, 2020 at 5:56 PM Vinoth Chandar wrote: >

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-14 Thread Vinoth Chandar
Thanks Sudha! This means master is now open for regular PRs. Thanks for your patience, everyone. On Fri, Aug 14, 2020 at 3:51 PM Bhavani Sudha wrote: > Hello all, > > We have cut the release branch - > https://github.com/apache/hudi/tree/release-0.6.0 . Since it is already > Friday, we will

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-14 Thread Bhavani Sudha
Hello all, We have cut the release branch - https://github.com/apache/hudi/tree/release-0.6.0 . Since it is already Friday, we will be sending the release candidate early next week (after some testing). Happy Friday! Thanks, Sudha On Wed, Aug 12, 2020 at 3:56 PM vbal...@apache.org wrote: > >

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-13 Thread Gary Li
Vote for mvn spotless:apply to do the auto fix. On Thu, Aug 13, 2020 at 1:13 AM Vinoth Chandar wrote: > Hi, > > Anyone has thoughts on this? > > esp leesf/vinoyang, given you both drove much of the initial cleanups. > > On Mon, Aug 10, 2020 at 7:16 PM Shiyan Xu > wrote: > > > in that case,

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-13 Thread Vinoth Chandar
Hi, Anyone has thoughts on this? esp leesf/vinoyang, given you both drove much of the initial cleanups. On Mon, Aug 10, 2020 at 7:16 PM Shiyan Xu wrote: > in that case, yes, all for automation. > > On Mon, Aug 10, 2020 at 7:12 PM Vinoth Chandar wrote: > > > Overall, I think we should

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-12 Thread vbal...@apache.org
Hi Folks, We are continuing to work on CI stabilization and will cut the release once we stabilize the builds, hopefully tonight/tomorrow. Thanks, Balaji.V On Tuesday, August 11, 2020, 09:15:05 PM PDT, Vinoth Chandar wrote: Hello all, Update on this. We have landed most of the

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-11 Thread Vinoth Chandar
Hello all, Update on this. We have landed most of the blockers for the 0.6.0 release and I am currently working on the last major blocker, HUDI-1013. We are working through some unexpected CI flakiness. We hope to stabilize master, cut the RC, and then open up master for regular PR merges. ETA

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-10 Thread Shiyan Xu
in that case, yes, all for automation. On Mon, Aug 10, 2020 at 7:12 PM Vinoth Chandar wrote: > Overall, I think we should standardize this across the project. > But most importantly, may be revive the long dormant spotless effort first > to enable autofixing of checkstyle issues, before we add

Re: [DISCUSS] Codestyle: force multiline indentation

2020-08-10 Thread Vinoth Chandar
Overall, I think we should standardize this across the project. But most importantly, may be revive the long dormant spotless effort first to enable autofixing of checkstyle issues, before we add more checking? On Mon, Aug 10, 2020 at 7:04 PM Shiyan Xu wrote: > Hi all, > > I noticed that

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-04 Thread Vinoth Chandar
Small correction: >> Vinoth working on code review, tests for PR 1876, This is landed! On Tue, Aug 4, 2020 at 9:44 PM Bhavani Sudha wrote: > Hello all, > > We are targeting the end of this week to cut RC. Here is an update of where > we are at release blockers. > > 0.6.0 Release blocker

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-04 Thread Bhavani Sudha
Hello all, We are targeting the end of this week to cut RC. Here is an update of where we are at release blockers. 0.6.0 Release blocker status (board ) , - Spark Datasource/MOR

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-03 Thread Vinoth Chandar
+1 (we need to formalize this well) But having just blockers land first, would help not just with rebasing, but also wind down towards cutting an RC by end of week. On Mon, Aug 3, 2020 at 2:53 PM Bhavani Sudha wrote: > Hello all, > > As we are all hustling towards getting the blockers in, I

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-03 Thread Bhavani Sudha
Hello all, As we are all hustling towards getting the blockers in, I wanted to propose a code/merge freeze until we cut a release for 0.6.0 and restrict it to only merging blockers identified for this release. It would reduce rebasing time for blockers in progress. If we feel some issue is a

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-03 Thread Vinoth Chandar
Given enough time has passed, Sudha can be our RM for 0.6.0. On the release blocker progress, we landed a few blockers over the weekend, with some almost ready for landing. Will send out a status update again tomorrow night PST! On Mon, Aug 3, 2020 at 8:17 AM Vinoth Chandar wrote: > Hi anton. >

Re: [DISCUSS] Release 0.6.0 timelines

2020-08-03 Thread Vinoth Chandar
Hi anton. We were hoping to cut a release by last weekend. New target is this weekend! (tbh we were thrown off a bit due to COVID in Q2, given a lot of PMC/Committers had additional kid care duties. Now we are back to normal cadence) Going forward, I plan to start a discussion around planning,

Re: DISCUSS code, config, design walk through sessions

2020-08-02 Thread Vinoth Chandar
Hi all, Sorry for the late reply. >Can we have those sessions on a regular basis? I personally find today's session super helpful! Yes, we totally can. Maybe once a quarter or something? Please start a new DISCUSS thread to discuss further topics, cadence etc. On Sat, Aug 1, 2020 at 9:19

Re: DISCUSS code, config, design walk through sessions

2020-08-01 Thread Sivabalan
I am still editing the video for some minor portions. Will update the community once it's ready for public consumption. On Sat, Aug 1, 2020 at 8:45 AM leesf wrote: > Noticed that the video had been uploaded > https://www.youtube.com/watch?v=N2eDfU_rQ_U > > Ranganath Tirumala wrote on Fri, Jul 31, 2020

Re: DISCUSS code, config, design walk through sessions

2020-08-01 Thread leesf
Noticed that the video had been uploaded https://www.youtube.com/watch?v=N2eDfU_rQ_U On Fri, Jul 31, 2020 at 8:13 AM Ranganath Tirumala wrote: > Hi Vinoth, > > I am interested in the recording as well. The timings didn't suit me in > Australia (1 am). > > Regards, > > Ranganath > > On Fri, 31 Jul 2020 at

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-31 Thread Anton Zuyeu
Hi All, I apologize for a possibly dumb question, but when was 0.6.0 planned to be released? Can't find any dates on Hudi related pages. On Thu, Jul 30, 2020 at 10:36 AM Vinoth Chandar wrote: > Is anyone able to help with the at risk items? :) > > On Thu, Jul 30, 2020 at 7:07 AM leesf wrote: > >

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread Ranganath Tirumala
Hi Vinoth, I am interested in the recording as well. The timings didn't suit me in Australia (1 am). Regards, Ranganath On Fri, 31 Jul 2020 at 04:29, tanu dua wrote: > I missed it due to work commitments. Can we please have the recording ? > > On Thu, 30 Jul 2020 at 11:52 PM, Zijing Guo >

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread tanu dua
I missed it due to work commitments. Can we please have the recording? On Thu, 30 Jul 2020 at 11:52 PM, Zijing Guo wrote: > Thanks for the great session Vinoth! Can we have those sessions on a > regular basis? I personally find today's session super helpful! > On Thursday, July 30,

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread Zijing Guo
Thanks for the great session Vinoth! Can we have those sessions on a regular basis? I personally find today's session super helpful! On Thursday, July 30, 2020, 01:36:06 PM EDT, Vinoth Chandar wrote: Thanks everyone who joined! I am hanging out in #general on slack, if we want to

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-30 Thread Vinoth Chandar
Is anyone able to help with the at risk items? :) On Thu, Jul 30, 2020 at 7:07 AM leesf wrote: > @Vinoth Chandar Thanks for the reminder, marked to > blocker, and next week would be ok to me. > > On Thu, Jul 30, 2020 at 11:35 AM Vinoth Chandar wrote: > > > @leesf can we please mark the relevant ticket(s)

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread Vinoth Chandar
Thanks everyone who joined! I am hanging out in #general on slack, if we want to finish off any remaining questions. Please @vc me for questions. On Thu, Jul 30, 2020 at 8:00 AM Vinoth Chandar wrote: > yes! Please join > > On Thu, Jul 30, 2020 at 7:35 AM Pratyaksh Sharma > wrote: > >> Hi

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread Vinoth Chandar
yes! Please join On Thu, Jul 30, 2020 at 7:35 AM Pratyaksh Sharma wrote: > Hi Vinoth, > > Is this happening now? > > On Mon, Jul 27, 2020 at 3:50 AM Vinoth Chandar wrote: > > > Hi all, > > > > We will be using the conference link we use for the community sync. > > > > >

Re: DISCUSS code, config, design walk through sessions

2020-07-30 Thread Pratyaksh Sharma
Hi Vinoth, Is this happening now? On Mon, Jul 27, 2020 at 3:50 AM Vinoth Chandar wrote: > Hi all, > > We will be using the conference link we use for the community sync. > > https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+Community+Weekly+Sync > > > Once again, the date time is :

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-30 Thread leesf
@Vinoth Chandar Thanks for the reminder, marked as blocker, and next week would be OK for me. On Thu, Jul 30, 2020 at 11:35 AM Vinoth Chandar wrote: > @leesf can we please mark the relevant ticket(s) > with blocker priority, so it's easier to track? > > Looks like we are nearing a choice for RM. > Any more

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-29 Thread Vinoth Chandar
@leesf can we please mark the relevant ticket(s) with blocker priority, so it's easier to track? Looks like we are nearing a choice for RM. Any more thoughts on timelines? Looks like everyone so far is leaning towards completeness of the release over doing it sooner? On Wed, Jul 29, 2020 at

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-29 Thread vino yang
+1 on Sudha being RM for the release. And looking forward to 0.6.0. Best, Vino On Thu, Jul 30, 2020 at 9:15 AM leesf wrote: > +1 on Sudha being RM, and PR#1810 > https://github.com/apache/hudi/pull/1810 (abstract hive sync module) would > also go into this release. > > On Thu, Jul 30, 2020 at 2:18 AM Sivabalan wrote: >

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-29 Thread leesf
+1 on Sudha being RM, and PR#1810 https://github.com/apache/hudi/pull/1810 (abstract hive sync module) would also go into this release. On Thu, Jul 30, 2020 at 2:18 AM Sivabalan wrote: > +1 on Sudha being RM for the release. Makes sense to push the release by a > week. > > On Wed, Jul 29, 2020 at 1:35 AM

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-29 Thread Sivabalan
+1 on Sudha being RM for the release. Makes sense to push the release by a week. On Wed, Jul 29, 2020 at 1:35 AM vbal...@apache.org wrote: > +1 on Sudha on being RM for this release. Also agree on pushing the > release date by a week. > Balaji.V > On Tuesday, July 28, 2020, 10:08:41 PM

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-28 Thread vbal...@apache.org
+1 on Sudha being RM for this release. Also agree on pushing the release date by a week. Balaji.V On Tuesday, July 28, 2020, 10:08:41 PM PDT, Bhavani Sudha wrote: Thanks Vinoth for the update. I can volunteer to RM this release. Understand 0.6.0 release is delayed than what we

Re: [DISCUSS] Release 0.6.0 timelines

2020-07-28 Thread Bhavani Sudha
Thanks Vinoth for the update. I can volunteer to RM this release. I understand the 0.6.0 release is delayed beyond what we originally discussed. Q2 has been really hard with COVID and everything going on. Given that we are at this point, I feel by delaying the RC by a week or so more if we can get some

Re: [DISCUSS] Hyperspace + Hudi

2020-07-28 Thread Vinoth Chandar
Very informative. Thanks! On Mon, Jul 27, 2020 at 5:09 PM nishith agarwal wrote: > Yes. > > SparkSession has a reference to something called a SessionState here -> > > https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L152 > > Each

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread Shiyan Xu
Yeah, it makes sense to keep module-specific metrics classes; for example, deltastreamer metrics should just reside in hudi-utilities. On Tue, Jul 28, 2020 at 9:52 AM Vinoth Chandar wrote: > IMO having metrics within each module is probably more maintainable. > the common metrics interfaces/base classes can

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread Vinoth Chandar
IMO having metrics within each module is probably more maintainable. the common metrics interfaces/base classes can just live in hudi-common for now? On Tue, Jul 28, 2020 at 9:06 AM Shiyan Xu wrote: > +1. It would be very helpful to have more internal performance/cost-related > metrics (perhaps

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread Shiyan Xu
+1. It would be very helpful to have more internal performance/cost-related metrics (perhaps optionally enabled). Also it does make sense to move metrics classes to common, or even to a separate module (if the scope gets extended a lot further) On Tue, Jul 28, 2020 at 8:43 AM vbal...@apache.org

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread vbal...@apache.org
+1. Would love to see observability metrics exposed for file system RPC calls. This would greatly help in figuring out RPC performance and bottlenecks across varied file-systems that Hudi supports.  On Tuesday, July 28, 2020, 08:24:54 AM PDT, Nishith wrote: +1 Having the metrics

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread Nishith
+1 Having the metrics flexibly in common will help in building observability in other modules. Thanks, Nishith > On Jul 28, 2020, at 7:28 AM, Vinoth Chandar wrote: > > +1 as well. > > Given we support many reporters now. Could you please further > improve/retain modularity. > >> On Mon,

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-28 Thread Vinoth Chandar
+1 as well. Given we support many reporters now, could you please further improve/retain modularity? On Mon, Jul 27, 2020 at 6:30 PM vino yang wrote: > Hi Modi, > > +1 for this proposal. > > I agree with your opinion that the metric report should not only report the > client's metrics. > > And

Re: [DISCUSS] Adding Metrics to Hudi Common

2020-07-27 Thread vino yang
Hi Modi, +1 for this proposal. I agree with your opinion that the metric report should not only report the client's metrics. And we should decouple the implementation of metrics from the client module so that it could be developed independently. Best, Vino Abhishek Modi wrote on Tue, Jul 28, 2020

Re: [DISCUSS] Hyperspace + Hudi

2020-07-27 Thread nishith agarwal
Yes. SparkSession has a reference to something called a SessionState here -> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L152 Each SessionState allows for a bunch of experimentalMethods for specific optimizations that you can plug

Re: [DISCUSS] Hyperspace + Hudi

2020-07-27 Thread Vinoth Chandar
Thanks Nishith! >>Plugs in at the time of spark query planning to allow for automatic indexing optimizations based on the created index This is very interesting. Could you expand more? One day, I'd love to support point(ish) lookups on Hudi tables :) On Mon, Jul 27, 2020 at 8:29 AM nishith agarwal

Re: [DISCUSS] Hyperspace + Hudi

2020-07-27 Thread nishith agarwal
Thanks Vinoth for kicking off this thread. I have also been looking into hyperspace and it is definitely an interesting project. On exploring the project, I found the following in addition to what you mentioned - Super easy to use, has a simple API to integrate into a spark based application -

Re: DISCUSS code, config, design walk through sessions

2020-07-26 Thread Vinoth Chandar
Hi all, We will be using the conference link we use for the community sync. https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+Community+Weekly+Sync Once again, the date and time are: July 30, 8-10 AM PST. We will try to follow the following agenda - Hudi design overview (30 mins, with 5

Re: DISCUSS code, config, design walk through sessions

2020-07-23 Thread Adam Feldman
Great! Thank you On Thu, Jul 23, 2020, 10:49 Vinoth Chandar wrote: > Hi Adam, > > Next week. July 30th 8AM PST. > > I will be sending dial in information over the weekend. > > > > On Thu, Jul 23, 2020 at 7:47 AM Adam Feldman wrote: > > > Hey, was this decided for today or the 30th? > > > > On

Re: DISCUSS code, config, design walk through sessions

2020-07-23 Thread Vinoth Chandar
Hi Adam, Next week. July 30th 8AM PST. I will be sending dial in information over the weekend. On Thu, Jul 23, 2020 at 7:47 AM Adam Feldman wrote: > Hey, was this decided for today or the 30th? > > On Thu, Jul 16, 2020, 06:32 Zijing Guo wrote: > > > +1 for the time. > > > > > > Sent from

Re: DISCUSS code, config, design walk through sessions

2020-07-23 Thread Adam Feldman
Hey, was this decided for today or the 30th? On Thu, Jul 16, 2020, 06:32 Zijing Guo wrote: > +1 for the time. > > > Sent from Yahoo Mail for iPhone > > > On Wednesday, July 15, 2020, 11:42 PM, Vinoth Chandar > wrote: > > Great! Moving on to date. Would July 23/30 Thursday 8 AM PST work for >

Re: DISCUSS code, config, design walk through sessions

2020-07-18 Thread Vinoth Chandar
Let's freeze July 30 8 AM PST! Will send further details in a separate email thread! Look forward to this! On Thu, Jul 16, 2020 at 3:32 AM Zijing Guo wrote: > +1 for the time. > > > Sent from Yahoo Mail for iPhone > > > On Wednesday, July 15, 2020, 11:42 PM, Vinoth Chandar > wrote: > > Great!

Re: DISCUSS code, config, design walk through sessions

2020-07-16 Thread Zijing Guo
+1 for the time. Sent from Yahoo Mail for iPhone On Wednesday, July 15, 2020, 11:42 PM, Vinoth Chandar wrote: Great!  Moving on to date. Would July 23/30 Thursday 8 AM PST work for everyone? On Tue, Jul 14, 2020 at 12:17 PM Shiyan Xu wrote: > +1 > > On Tue, Jul 14, 2020, 11:34 AM Vinoth

Re: DISCUSS code, config, design walk through sessions

2020-07-15 Thread Vinoth Chandar
Great! Moving on to date. Would July 23/30 Thursday 8 AM PST work for everyone? On Tue, Jul 14, 2020 at 12:17 PM Shiyan Xu wrote: > +1 > > On Tue, Jul 14, 2020, 11:34 AM Vinoth Chandar wrote: > > > Typo: date TBD (not data :)) > > > > On Tue, Jul 14, 2020 at 11:20 AM Adam Feldman > wrote: >

Re: DISCUSS code, config, design walk through sessions

2020-07-14 Thread Shiyan Xu
+1 On Tue, Jul 14, 2020, 11:34 AM Vinoth Chandar wrote: > Typo: date TBD (not data :)) > > On Tue, Jul 14, 2020 at 11:20 AM Adam Feldman wrote: > > > +1 > > > > On Tue, Jul 14, 2020, 14:09 Gary Li wrote: > > > > > +1. 8am works for me. > > > > > > On Tue, Jul 14, 2020 at 11:01 AM Vinoth

Re: DISCUSS code, config, design walk through sessions

2020-07-14 Thread Vinoth Chandar
Typo: date TBD (not data :)) On Tue, Jul 14, 2020 at 11:20 AM Adam Feldman wrote: > +1 > > On Tue, Jul 14, 2020, 14:09 Gary Li wrote: > > > +1. 8am works for me. > > > > On Tue, Jul 14, 2020 at 11:01 AM Vinoth Chandar > wrote: > > > > > Hello all, > > > > > > please chime in. We will plan to

Re: DISCUSS code, config, design walk through sessions

2020-07-14 Thread Adam Feldman
+1 On Tue, Jul 14, 2020, 14:09 Gary Li wrote: > +1. 8am works for me. > > On Tue, Jul 14, 2020 at 11:01 AM Vinoth Chandar wrote: > > > Hello all, > > > > please chime in. We will plan to freeze Tuesday 8AM (data TBD) by EOD PST > > today. > > > > thanks > > Vinoth > > > > On Mon, Jul 13, 2020

Re: DISCUSS code, config, design walk through sessions

2020-07-14 Thread Gary Li
+1. 8am works for me. On Tue, Jul 14, 2020 at 11:01 AM Vinoth Chandar wrote: > Hello all, > > please chime in. We will plan to freeze Tuesday 8AM (data TBD) by EOD PST > today. > > thanks > Vinoth > > On Mon, Jul 13, 2020 at 12:38 AM Pratyaksh Sharma > wrote: > > > 8 AM PST works for me. This

Re: DISCUSS code, config, design walk through sessions

2020-07-14 Thread Vinoth Chandar
Hello all, please chime in. We will plan to freeze Tuesday 8AM (data TBD) by EOD PST today. thanks Vinoth On Mon, Jul 13, 2020 at 12:38 AM Pratyaksh Sharma wrote: > 8 AM PST works for me. This is actually more suitable for me than the > community sync time. > > Will wait for others to

Re: [DISCUSS] Organizing ourselves for scale

2020-07-14 Thread nishith agarwal
+1 on high level roles as well as spreading PMCs to those roles. Going forward, it will be good to have PMC members overseeing different aspects of the community to help guide and provide feedback in a timely manner without overwhelming 1 person. Thanks, Nishith On Tue, Jul 14, 2020 at 9:02 AM

Re: [DISCUSS] Organizing ourselves for scale

2020-07-14 Thread vbal...@apache.org
+1 on the roles and responsibilities definition. I personally think this brings structure and clarity to different tracks.   It would be interesting to hear other's thoughts on this and on ideas on scaling different tracks. Balaji.V On Sunday, July 12, 2020, 08:07:22 PM PDT, Vinoth Chandar

Re: DISCUSS code, config, design walk through sessions

2020-07-13 Thread Pratyaksh Sharma
8 AM PST works for me. This is actually more suitable for me than the community sync time. Will wait for others to respond. If 8 AM does not work for the majority of people, I will start a new thread for revoting. On Mon, Jul 13, 2020 at 11:55 AM David Sheard < david.she...@datarefactory.com.au>

Re: DISCUSS code, config, design walk through sessions

2020-07-13 Thread David Sheard
That is 01:00 Canberra, Australia time. But that is fine. Cheers On Mon, 13 Jul. 2020, 11:55 am Vinoth Chandar, wrote: > Hi all, > > NO. time/date is not finalized yet until we resolve the time zone issues. > let's > spend some time confirming the time in the next few days. and a week for me >

Re: DISCUSS code, config, design walk through sessions

2020-07-12 Thread hddong
+1 for 8AM PST. On Mon, Jul 13, 2020 at 9:55 AM, Vinoth Chandar wrote: > Hi all, > > NO. time/date is not finalized yet until we resolve the time zone issues. > let's > spend some time confirming the time in the next few days. and a week for me > to prep some slides/docs to run through the course as well. >

Re: DISCUSS code, config, design walk through sessions

2020-07-12 Thread Vinoth Chandar
Hi all, NO. time/date is not finalized yet until we resolve the time zone issues. let's spend some time confirming the time in the next few days. and a week for me to prep some slides/docs to run through the course as well. Once finalized, we will send an explicit email spelling out the

Re: DISCUSS code, config, design walk through sessions

2020-07-12 Thread Ranganath Tirumala
So, is this confirmed for 14th July 9:30pm PST? On Sat, 11 Jul 2020 at 14:32, Gurudatt Kulkarni wrote: > If possible recording of these sessions would be great, to fill the timezone > gap. > > On Friday, July 10, 2020, Pratyaksh Sharma wrote: > > @Vinoth Chandar Time zones are indeed tricky.

Re: DISCUSS code, config, design walk through sessions

2020-07-10 Thread Gurudatt Kulkarni
If possible, recording of these sessions would be great, to fill the timezone gap. On Friday, July 10, 2020, Pratyaksh Sharma wrote: > @Vinoth Chandar Time zones are indeed tricky. Maybe we > can do a poll again to decide on the time for these sessions given the > community size has increased

Re: DISCUSS code, config, design walk through sessions

2020-07-10 Thread Pratyaksh Sharma
@Vinoth Chandar Time zones are indeed tricky. Maybe we can do a poll again to decide on the time for these sessions given the community size has increased much more now as compared to last time we decided on weekly sync timings? This might help all the new members of our community as well. :) On

Re: DISCUSS code, config, design walk through sessions

2020-07-09 Thread Adam Feldman
Yea, time zones are tough. That's midnight in EST and the middle of the night if anyone is in Western Europe... On Thu, Jul 9, 2020, 23:08 wei li wrote: > +1 > > On 2020/07/06 03:30:43, Vinoth Chandar wrote: > > Hi all, > > > > As we scale the community, it's important that more of us are able

Re: DISCUSS code, config, design walk through sessions

2020-07-09 Thread wei li
+1 On 2020/07/06 03:30:43, Vinoth Chandar wrote: > Hi all, > > As we scale the community, it's important that more of us are able to help > users, users becoming contributors. > > In the past, we have drafted FAQs, troubleshooting guides. But I feel > sometimes, more hands on walk through

Re: DISCUSS code, config, design walk through sessions

2020-07-08 Thread Shiyan Xu
The time slot works for me, but I guess it may conflict with work hours in other time zones. Maybe alternating morning and evening sessions in PST would work better? On Wed, Jul 8, 2020 at 9:07 PM Vinoth Chandar wrote: > Apologies. Should have been more detailed. > > It’s Tuesday. Please see here for

Re: DISCUSS code, config, design walk through sessions

2020-07-08 Thread Vinoth Chandar
Apologies. Should have been more detailed. It’s Tuesday. Please see here for details https://cwiki.apache.org/confluence/display/HUDI/Apache+Hudi+Community+Weekly+Sync On Wed, Jul 8, 2020 at 8:55 PM Adam Feldman wrote: > Hi, what day will this be? > > On Tue, Jul 7, 2020, 17:25 Vinoth Chandar

Re: DISCUSS code, config, design walk through sessions

2020-07-08 Thread Adam Feldman
Hi, what day will this be? On Tue, Jul 7, 2020, 17:25 Vinoth Chandar wrote: > Thanks, everyone! There appears to be great interest. let's do it. > > In terms of timing, I was thinking if we can extend one of our existing > community weekly sync meetings for this purpose. > So, timing would be

Re: DISCUSS code, config, design walk through sessions

2020-07-07 Thread Vinoth Chandar
Thanks, everyone! There appears to be great interest. Let's do it. In terms of timing, I was thinking if we can extend one of our existing community weekly sync meetings for this purpose. So, timing would be 9:30-11 PM PST. Does that work for everyone here? On Mon, Jul 6, 2020 at 10:30 AM Shiyan

Re: DISCUSS code, config, design walk through sessions

2020-07-06 Thread Shiyan Xu
+1 On Mon, Jul 6, 2020 at 9:27 AM vbal...@apache.org wrote: > +1. > On Monday, July 6, 2020, 09:11:47 AM PDT, Bhavani Sudha < > bhavanisud...@gmail.com> wrote: > > +1 this is a great idea! > > On Mon, Jul 6, 2020 at 7:54 AM vino yang wrote: > > > +1 > > > > On Mon, Jul 6, 2020, Adam Feldman
