+1 for Beam -- agree with Semantic Beeng's analysis.

On Sat, Aug 3, 2019 at 10:30 PM taher koitawala <[email protected]> wrote:
> So the way to go about this is to file a HIP. Chalk out all the classes
> and start moving towards a pure client.
>
> Secondly, do we want to try Beam?
>
> I think there is too much going on here and I'm not able to follow. If we
> want to try out Beam all along, I don't think it makes sense to do anything
> on Flink then.
>
> On Sun, Aug 4, 2019, 2:30 AM Semantic Beeng <[email protected]>
> wrote:
>
>> +1 My money is on this approach.
>>
>> The existing abstractions from Beam seem enough for the use cases as I
>> imagine them.
>>
>> Flink also has "dynamic table", "table source" and "table sink", which
>> seem very useful abstractions where Hudi might fit nicely.
>>
>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/dynamic_tables.html
>>
>> Attached a screen shot.
>>
>> This seems to fit with the original premise of Hudi as well.
>>
>> Am exploring this avenue with a use case that involves "temporal joins on
>> streams", which I need for feature extraction.
>>
>> If anyone is interested in this or has concrete enough needs and use cases,
>> please let me know.
>>
>> Best to go from an agreed-upon set of 2-3 use cases.
>>
>> Cheers
>>
>> Nick
>>
>> > Also, we do have some Beam experts on the mailing list. Can you please
>> > weigh in on the viability of using Beam as the intermediate abstraction
>> > here between Spark/Flink?
>> > Hudi uses RDD APIs like groupBy, mapToPair, sortAndRepartition,
>> > reduceByKey, countByKey and also does custom partitioning a lot.
