Re: [DISCUSS] Decouple Hudi and Spark

2019-09-24 Thread Taher Koitawala
Hi Vino, This is not a design for Hudi on Flink. This was simply a mock-up of the tagLocations() Spark cache mapped to Flink state, as Vinoth wanted to see. As for Flink batch and streaming, I am well aware of Flink's batch and stream unification efforts. However, I think that is still on

Re: [DISCUSS] Decouple Hudi and Spark

2019-09-24 Thread vino yang
Hi Taher, As I mentioned in the previous mail, things may not be that easy using just the Flink state API. Copied here: "Hudi can connect with many different Source/Sinks. Some file-based reads are not appropriate for Flink Streaming." Although unifying Batch and Streaming is Flink's goal, it

Re: FAQ page

2019-09-24 Thread vino yang
I have read the FAQ page. It looks good to me. There are many valuable and high-frequency questions. I have a suggestion. Besides Hudi, there are two other data lake projects: Iceberg and Delta Lake. If we could give some comparison between Hudi and them, it would be good. It is a

Re: FAQ page

2019-09-24 Thread Bhavani Sudha Saktheeswaran
This is really cool. Thanks for putting this page together, Vinoth! On Tue, Sep 24, 2019 at 7:39 AM Nishith wrote: > The FAQ looks awesome Vinoth! Answers most of the questions that folks are > confused about. > Hoping folks can contribute more as we uncover more frequently asked > questions.

Re: [DISCUSS] Decouple Hudi and Spark

2019-09-24 Thread Taher Koitawala
Hi All, Sample code showing how record tagging will be handled in Flink is posted at [1]. The main class to run it is MockHudi.java, with a sample path for checkpointing. As of now this is just a sample to know we should be caching in Flink states with bare minimum configs.

Re: [PROPOSAL] Hudi Web UI

2019-09-24 Thread Vinoth Chandar
Thanks for doing it! Will review sometime this week. On Mon, Sep 23, 2019 at 5:43 PM vino yang wrote: > Thanks Taher, great job! Will have another look soon. Best, Vino On > 09/24/2019 02:15, Taher Koitawala wrote: Hi All, Hip has been > migrated to confluence. Please take a look. >

Re: [DISCUSS] Hudi with Nifi

2019-09-24 Thread Vinoth Chandar
Sg, let's capture these discussions in the JIRA (a link to the discussion thread should suffice) and we can revisit them one by one. On Mon, Sep 23, 2019 at 8:31 PM Taher Koitawala wrote: > Sure Vinoth, I think we need to try this out and check how it fits together > and how deployable it is. > > On

Re: Field not found in record HoodieException

2019-09-24 Thread Kabeer Ahmed
Taher, Sorry I got a bit delayed. I have now put everything you may need in a gist at: https://gist.github.com/smdahmed/3af0e3110e07cf76772bb73d5e9b65e2