Hi Davor,

We still have some discussion/paperwork on the Euphoria side (SGA, ...).
So, it's on track, but it is taking a little more time than expected.

Regards
JB

On 02/19/2018 05:40 AM, Davor Bonaci wrote:
> I may have missed things, but any update on the progress of this donation?
>
> On Tue, Jan 2, 2018 at 10:52 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Great!
>
> Thanks!
> Regards
> JB
>
> On 01/03/2018 07:29 AM, David Morávek wrote:
>
> Hello JB,
>
> Perfect! I'm already on the Beam Slack workspace, I'll contact you once I get to the office.
>
> Thanks!
> D.
>
> On Wed, Jan 3, 2018 at 6:19 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Hi David,
>
> absolutely!! Let's move forward on the preparation steps.
>
> Are you on Slack and/or Hangout to plan this?
>
> Thanks,
> Regards
> JB
>
> On 01/02/2018 05:35 PM, David Morávek wrote:
>
> Hello JB,
>
> can we help in any way to move things forward?
>
> Thanks,
> D.
>
> On Mon, Dec 18, 2017 at 4:28 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Thanks Jan,
>
> It makes sense.
>
> Let me take a look at the code to understand the "interaction".
>
> Regards
> JB
>
> On 12/18/2017 04:26 PM, Jan Lukavský wrote:
>
> Hi JB,
>
> basically you are not wrong. The project started about three or four years ago with a goal to unify batch and streaming processing into a single portable, executor-independent API. Because of that, it is currently "close" to Beam in this sense. But we don't see much added value in keeping this as a separate project, with one of the key differences being the API (not the model itself), so we would like to focus on translation from the Euphoria API to Beam's SDK.
> That's why we would like to see it as a DSL, so that it would be possible to use the Euphoria API with Beam's runners as natively as possible.
>
> I hope I didn't make the subject even more unclear; if so, I'll be happy to explain anything in more detail. :-)
>
> Jan
>
> On 12/18/2017 04:08 PM, Jean-Baptiste Onofré wrote:
>
> Hi Jan,
>
> Thanks for your answers.
>
> However, they confused me ;)
>
> Regarding what you replied, Euphoria seems like a programming model/SDK "close" to Beam more than a DSL on top of an existing Beam SDK.
>
> Am I wrong?
>
> Regards
> JB
>
> On 12/18/2017 03:44 PM, Jan Lukavský wrote:
>
> Hi Ismaël,
>
> basically we adopted Beam's design regarding partitioning (https://github.com/seznam/euphoria/issues/160) and implemented the sorting manually (https://github.com/seznam/euphoria/issues/158). I'm not aware of the time model differences (Euphoria supports ingestion and event time; we don't support processing time by decision). Regarding other differences (looking into the Beam capability matrix), I'd say that:
>
> - we don't support stateful FlatMap (i.e.
>   ParDo) for now (https://github.com/seznam/euphoria/issues/192)
>
> - we don't support side inputs (by decision now, but this might be reconsidered) and outputs (https://github.com/seznam/euphoria/issues/124)
>
> - we support complete event-time windows (non-merging, merging, aligned, unaligned) and time control
>
> - we don't support processing time by decision (might be reconsidered if a valid use-case is found)
>
> - we support window triggering based on both time and data, including discarding and accumulating (without accumulating & retracting)
>
> All our executors (runners) - Flink, Spark and Local - implement the complete model, which we enforce using an "operator test kit" that all executors must pass. The Spark executor supports bounded sources only (for now). As David said, we currently don't have a serialization abstraction, so there is some work to be done in that regard.
>
> Our intention is to completely supersede Euphoria. We would like to consider the possibility of using executors that would not rely on Beam, but that is optional now and should be straightforward.
>
> We'd be happy to answer any more questions you might have, and thanks a lot!
>
> Best,
>
> Jan
>
> On 12/18/2017 03:19 PM, Ismaël Mejía wrote:
>
> Hi,
>
> It is great to see that you guys have reached a maturity point to propose this. Congratulations on your work and the idea to contribute it to Beam.
>
> I remember from a previous discussion with Jan the model mismatch between Euphoria and Beam, caused by some design decisions of both projects. I remember you guys had some issues with the way Beam's sources do partitioning, as well as Beam's lack of sorted data (on shuffle, à la Hadoop). Also, if I remember well, the 'time' model of Euphoria was simpler than Beam's. I mention all of this because I am curious about what parts of the Euphoria model you guys had to sacrifice to support Beam, and what parts of Beam's model should still be integrated into Euphoria (and if there is a straightforward path to do it).
>
> If I understand well, if this gets merged into Apache, does this mean that Euphoria's current implementation would be superseded by this DSL? I am curious because I would like to understand your level of investment in supporting the future of this DSL.
>
> Thanks and congrats again!
> Ismaël
>
> On Mon, Dec 18, 2017 at 10:12 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Depending on the donation, you would need an ICLA for each contributor, and a CCLA in addition to the SGA.
>
> We can sync with Davor and me on the legal stuff. However, I would wait a little bit just to have feedback from the whole team and start a formal vote.
>
> I would be happy to start the formal vote.
>
> Regards
> JB
>
> On 12/18/2017 10:03 AM, David Morávek wrote:
>
> Hello,
>
> Thanks for the awesome feedback!
>
> Romain:
>
> We already use the Java Stream API in all operators where it makes sense (e.g. ReduceByKey). Still not sure if it was a good choice, but it can easily be converted to an iterator anyway.
>
> Side outputs support is coming soon; we have already done some initial work on this.
>
> Side inputs are not supported in the way you are used to from Beam, because they can be replaced by a Join operator on the same key (if annotated with broadcastHashJoin, it will be turned into a map-side join).
>
> The only significant difference from Beam is that we decided not to abstract serialization, so we need to add support for type hints because of type erasure.
>
> Fluent API:
>
> The API is fluent within one operator. It is designed to "lead the programmer", which means that he will only be offered methods that make sense after the last method he used (e.g. in ReduceByKey, we know that after keyBy one of the reduceBy methods should come). It is implemented as a series of builders.
>
> Davor:
>
> Thanks, I'll contact you and will start the process of having all the necessary paperwork signed on our side, so we can get things moving.
>
> On Mon, Dec 18, 2017 at 7:46 AM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>
> Hi guys
>
> A DSL would be very welcome, in particular if fluent.
>
> Open question: did you consider implementing the Stream API (surely extending it to have a BeamStream and a few more features like sides etc.)? It would be very natural, easy to integrate anywhere, and would avoid having to learn a new API.
>
> Hazelcast Jet did it, so I don't see why Beam couldn't.
>
> On Dec 18, 2017 at 07:26, "Davor Bonaci" <da...@apache.org> wrote:
>
> Hi David,
> As JB noted, merging these two projects is a great idea. In fact, some of us have had those discussions in the past.
>
> Legally, nothing in particular is strictly necessary, as the code seems to already be Apache 2.0 licensed. We don't, however, want to be perceived as making hostile forks, so it would be great to file a Software Grant Agreement with the ASF Secretary. I can help with the process, as necessary.
>
> Project alignment-wise, there aren't any particular blockers that I am aware of. We welcome DSLs.
>
> Technically, the code would start in a feature branch. During this stage, we'd need to validate a few things, including confirming that the code and dependencies match ASF policy, automating testing in Beam's tooling, etc. At that point, we'd take a community vote to accept the component into master, and consider the author(s) for committership in the overall project.
>
> Welcome to the ASF and Beam -- we are thrilled to have you! Hope this helps, and please reach out if anybody on our end can help, including JB or myself.
>
> Davor
>
> On Sun, Dec 17, 2017 at 10:13 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> Hi David,
>
> Generally speaking, having different fluent DSLs on top of the Beam SDK is great.
>
> I would like to take a look at your wordcount examples to give you complete feedback. I like the idea, and a fluent Java DSL is valuable.
>
> Let's wait for feedback from others. If we have a consensus, then I would be more than happy to help you with the donation (I worked on the Camel Java DSL a while ago, so I have some experience here).
>
> Thanks!
> Regards
> JB
>
> On 12/17/2017 07:00 PM, David Morávek wrote:
>
> Hello,
>
> First of all, thanks for the amazing work the Apache Beam community is doing!
>
> In 2014, we started development of a runtime-independent Java 8 API that helps us create unified big-data processing flows. It has been used as a core building block of the Seznam.cz web crawler data infrastructure ever since. Its design principles and execution model are very similar to Apache Beam's.
>
> This API was open sourced in 2016, under the name Euphoria API:
>
> https://github.com/seznam/euphoria
>
> As it is very similar to Apache Beam, we feel that it is not worth duplicating effort in terms of development of new runtimes and fine-tuning of current ones.
>
> The main blocker for us to switch to Apache Beam is the lack of a Java 8 API. We propose the integration of the Euphoria API into Apache Beam as a Java 8 DSL, in order to share our effort with the community.
>
> A simple example of Euphoria API usage can be found here:
>
> https://github.com/seznam/euphoria/tree/master/euphoria-examples/src/main/java/cz/seznam/euphoria/examples/wordcount
>
> If you feel that the Beam community could benefit from our work, we would love to start working on Euphoria integration into Apache Beam (we already have a working POC, with a few basic operators implemented).
>
> I look forward to hearing from you,
>
> David
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
> --
> Best regards,
>
> David Morávek
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
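
A footnote to the "Fluent API" paragraph quoted above: the idea of an operator "implemented as a series of builders" that only offers the methods that make sense next can be sketched in plain Java as a staged builder. This is only an illustration under stated assumptions, not Euphoria's actual API. The class names (ReduceByKeySketch, KeyByStage, ReduceStage, OutputStage) and the methods of() and output() are hypothetical; keyBy and reduceBy are borrowed from the description in the thread, and their signatures here are guesses. The input is a plain in-memory list rather than a distributed dataset.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical staged builder, NOT Euphoria's real API: every step returns a
// type that exposes only the method that is valid next.
final class ReduceByKeySketch {
    private ReduceByKeySketch() {}

    // Entry point: the only thing offered on an input is keyBy.
    static <T> KeyByStage<T> of(List<T> input) {
        return new KeyByStage<>(input);
    }

    static final class KeyByStage<T> {
        private final List<T> input;

        KeyByStage(List<T> input) {
            this.input = input;
        }

        // After keyBy, only reduceBy is offered.
        <K> ReduceStage<T, K> keyBy(Function<T, K> keyFn) {
            return new ReduceStage<>(input, keyFn);
        }
    }

    static final class ReduceStage<T, K> {
        private final List<T> input;
        private final Function<T, K> keyFn;

        ReduceStage(List<T> input, Function<T, K> keyFn) {
            this.input = input;
            this.keyFn = keyFn;
        }

        // After reduceBy, only output is offered.
        OutputStage<T, K> reduceBy(BinaryOperator<T> reducer) {
            return new OutputStage<>(input, keyFn, reducer);
        }
    }

    static final class OutputStage<T, K> {
        private final List<T> input;
        private final Function<T, K> keyFn;
        private final BinaryOperator<T> reducer;

        OutputStage(List<T> input, Function<T, K> keyFn, BinaryOperator<T> reducer) {
            this.input = input;
            this.keyFn = keyFn;
            this.reducer = reducer;
        }

        // Groups the elements by key and folds each group with the reducer.
        Map<K, T> output() {
            Map<K, T> result = new LinkedHashMap<>();
            for (T element : input) {
                result.merge(keyFn.apply(element), element, reducer);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Map<Boolean, Integer> sums = ReduceByKeySketch.of(Arrays.asList(1, 2, 3, 4, 5))
                .keyBy(n -> n % 2 == 0)   // key: is the number even?
                .reduceBy(Integer::sum)   // fold the values of each key
                .output();
        System.out.println(sums); // prints {false=9, true=6}
    }
}
```

With this shape, calling reduceBy before keyBy, or output before reduceBy, does not compile, which is one way to "lead the programmer" through an operator.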