Also, as far as the streaming model goes, in our experience Flink has always
outperformed Spark in streaming use cases such as windowing and in-stream
joins. However, Beam seems really good too.

The problem with Beam is that it is fairly new and not as stable as Spark and
Flink and their communities are. Personally, I think folks are more
comfortable providing native Spark and Flink implementations because of their
hands-on experience with those tools and the wide community support.

On Sun, 17 Mar, 2019, 1:17 PM Taher Koitawala, <[email protected]>
wrote:

> There is no need to create the batch in Flink: with the new
> StreamingFileSink, Flink rolls out data files on checkpointing. So whenever
> Flink checkpoints, we get a new data file written, which can be considered
> our batch. Since Flink provides exactly-once semantics from source to sink
> for each record, Flink would be good to have on Hudi.
>
> On Sun, 17 Mar, 2019, 1:04 PM Vinoth Chandar, <[email protected]> wrote:
>
>> Hi Taher,
>>
>> Thanks for kicking off this thread. We can use this itself to discuss
>> Flink. Hudi uses Spark today on the writing side and the micro-batch model
>> actually fits very well. Given cloud stores don't support appends anyway,
>> we would end up micro-batching nonetheless even with Flink. Abstracting out
>> Spark would be a large effort (gets me to think, if we should then just
>> rewrite on top of Beam ;)) and I have not thought of any unique advantages
>> we get for Hudi by adding Flink. Do you have something in mind?
>>
>> If you can expand on where the gaps are with only having Spark/Hudi, that'd
>> be really educational.
>>
>>
>> Thanks
>> Vinoth
>>
>> On Sat, Mar 16, 2019 at 11:17 PM Taher Koitawala <
>> [email protected]>
>> wrote:
>>
>> > Hi Prasanna,
>> >       Thank you for your reply. Should we start a discussion or open a
>> > JIRA in this regard then?
>> >
>> > On Sun, 17 Mar, 2019, 11:36 AM Prasanna, <[email protected]> wrote:
>> >
>> > > Hello,
>> > >
>> > > I don't know of any effort to write Hudi with Flink.
>> > >
>> > > - Prasanna
>> > >
>> > > On Sat, Mar 16, 2019 at 10:44 PM Taher Koitawala <
>> > > [email protected]>
>> > > wrote:
>> > >
>> > > > Hey guys, any inputs on this?
>> > > >
>> > > > On Sat, 16 Mar, 2019, 12:35 PM Taher Koitawala, <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Hi guys, I have recently been exploring Hudi. It manages to solve a
>> > > > > lot of our current use cases; however, my question is: can I use
>> > > > > Flink with Hudi? So far I have only seen Spark integration with Hudi.
>> > > > >
>> > > > > Flink is more of a real-time processing engine than a near-real-time
>> > > > > one, with rich features such as checkpointing for fault tolerance,
>> > > > > state for in-stream computations, better windowing capabilities, very
>> > > > > high stream throughput, and exactly-once semantics from source to
>> > > > > sink. Flink is capable of being a part of Hudi to solve our in-stream
>> > > > > use cases.
>> > > > >
>> > > > >
>> > > > > Regards,
>> > > > > Taher Koitawala
>> > > > > GS Lab Pune
>> > > > > +91 8407979163
>> > > > >
>> > > >
>> > >
>> >
>>
>
