Re: [DISCUSS] Rethink the abstraction of current client

2021-02-02 Thread Vinoth Chandar
Sorry for the late reply. Standard excuse: 0.7.0 release. +1 on the need to rethink this. Some comments on issues in this thread IMO. 1. Agree that the hierarchy has gotten much taller now, and we need to immediately pull back more code into hudi-client-common. IMO what we lack is some kind of

Re: [DISCUSS] Measure latency by storing event time in WriteStatus

2021-02-02 Thread Vinoth Chandar
+1 I was involved in a very similar design at my previous job. We could actually track both min and max event times. We used to call the min "latency" and the max "freshness" (i.e., it indicates that some data for these later time intervals is flowing in). It does not solve the issue liujinjui
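
A minimal sketch of the min/max tracking described above, not Hudi code. The `EventTimeRange` class, its `observe` and `merge` methods, and the `lags` helper are hypothetical names, and treating both "latency" and "freshness" as lags relative to the commit end time is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class EventTimeRange:
    """Hypothetical per-writer accumulator of event times (epoch millis)."""
    min_event_time: Optional[int] = None
    max_event_time: Optional[int] = None

    def observe(self, event_time: int) -> None:
        # Widen the range to include this record's event time.
        if self.min_event_time is None or event_time < self.min_event_time:
            self.min_event_time = event_time
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time

    def merge(self, other: "EventTimeRange") -> None:
        # Combine ranges collected by different writers of the same commit.
        for t in (other.min_event_time, other.max_event_time):
            if t is not None:
                self.observe(t)


def lags(commit_end: int, r: EventTimeRange) -> Optional[Tuple[int, int]]:
    """Return (latency, freshness) in millis, or None if no events were seen.

    Assumption: latency = commit end - min event time,
                freshness = commit end - max event time.
    """
    if r.min_event_time is None or r.max_event_time is None:
        return None
    return commit_end - r.min_event_time, commit_end - r.max_event_time


# Two writers report their ranges; the commit merges them.
a = EventTimeRange()
for t in (1000, 4000):
    a.observe(t)
b = EventTimeRange()
for t in (2500, 7000):
    b.observe(t)

total = EventTimeRange()
total.merge(a)
total.merge(b)
result = lags(10_000, total)
print(result)  # (9000, 3000)
```

Keeping only the two extremes per writer means the merge at commit time costs O(number of writers) rather than O(number of records).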

Re: [DISCUSS] Measure latency by storing event time in WriteStatus

2021-02-02 Thread liujinhui
+1, it feels great, but in actual business scenarios, due to some data abnormalities, the event time will be inaccurate. This situation seems to affect the monitoring of this indicator? Best, liujinhui

Re: [DISCUSS] Rethink the abstraction of current client

2021-02-02 Thread vino yang
Hi, > I think the proposed interfaces indeed look more intuitive and could simplify the code structures. My concern is mostly around the ROI of such refactoring work. Probably I lack some direct involvement in the flink client work but it looks like it's mainly about code restructuring and

Re: User support issues

2021-02-02 Thread Raymond Xu
+1 very helpful! On Tue, Feb 2, 2021 at 2:57 PM Sivabalan wrote: > Sure Vinoth. > > On Tue, Feb 2, 2021 at 12:33 PM nishith agarwal > wrote: > > > Thanks for doing this triaging Siva. This will help pick usability issues > > that don't get surfaced. I'll assign few to myself. > > > > -Nishith

[DISCUSS] Measure latency by storing event time in WriteStatus

2021-02-02 Thread Raymond Xu
Hi all, It is a common requirement to measure data latency in Hudi tables. There isn't a metric reporting latency directly from HoodieMetrics. I'm proposing to measure the latency for each commit by this formula: latency = commitTime + commitDuration - earliest event time of the incoming records
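
The proposed formula can be illustrated with a small worked example. This is a sketch only: `commit_latency` is a hypothetical helper, not part of HoodieMetrics or WriteStatus, and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def commit_latency(commit_time, commit_duration, event_times):
    """latency = commitTime + commitDuration - earliest event time.

    Returns None when the commit carried no records with an event time.
    """
    if not event_times:
        return None
    return commit_time + commit_duration - min(event_times)


# Illustrative values: a commit that starts at 12:00:00 UTC and takes 30 s.
commit_time = datetime(2021, 2, 2, 12, 0, 0, tzinfo=timezone.utc)
commit_duration = timedelta(seconds=30)
event_times = [
    datetime(2021, 2, 2, 11, 55, 0, tzinfo=timezone.utc),
    datetime(2021, 2, 2, 11, 58, 0, tzinfo=timezone.utc),
]

latency = commit_latency(commit_time, commit_duration, event_times)
print(latency)  # 0:05:30
```

The earliest record was produced at 11:55:00 and became visible when the commit finished at 12:00:30, so the reported latency is 5 minutes 30 seconds.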

Re: User support issues

2021-02-02 Thread Sivabalan
Sure Vinoth. On Tue, Feb 2, 2021 at 12:33 PM nishith agarwal wrote: > Thanks for doing this triaging Siva. This will help pick usability issues > that don't get surfaced. I'll assign few to myself. > > -Nishith > > On Tue, Feb 2, 2021 at 8:50 AM Vinoth Chandar wrote: > > > Thanks for champion

HUDI-1574 trimming the most expensive tests

2021-02-02 Thread Vinoth Chandar
Hello all, I have compiled a list here. Does anyone have cycles to help trim these tests down? In most cases, it should be either excessive parallelism or some setup step that can be amortized. Please feel free to engage on the ticket directly: https://issues.apache.org/jira/browse/HUDI-1574

Re: User support issues

2021-02-02 Thread nishith agarwal
Thanks for doing this triaging Siva. This will help pick usability issues that don't get surfaced. I'll assign few to myself. -Nishith On Tue, Feb 2, 2021 at 8:50 AM Vinoth Chandar wrote: > Thanks for champion efforts here, to pull this list Siva. > > Can we also add a line to the contributing

Re: User support issues

2021-02-02 Thread Vinoth Chandar
Thanks for champion efforts here, to pull this list Siva. Can we also add a line to the contributing guide pointing to this label? That way, people can find this list right off the docs as well. On Tue, Feb 2, 2021 at 8:47 AM Sivabalan wrote: > Hi folks, > We realized that some of user

User support issues

2021-02-02 Thread Sivabalan
Hi folks, We realized that some of the user-reported bugs, usability tasks, etc. have not received much attention. So I went through all the issues and JIRAs and compiled a list here