At this moment Kyle's work on the portable Spark runner is based on the older APIs (RDD/DStream). But unifying both is definitely a goal. It is just a question of the maturity of the machinery needed for the Structured Streaming (Dataset) translation.
Btw, congrats to Kyle as well for starting this, awesome!

On Fri, Mar 22, 2019 at 3:24 PM Robert Bradshaw <rober...@google.com> wrote:
>
> Nice!
>
> Between this and the portability work
> (https://github.com/apache/beam/pull/8115), hopefully we'll have a
> modern Spark runner soon. Any idea on how hard (or easy?) it will be
> to merge those two?
>
>
> On Fri, Mar 22, 2019 at 9:29 AM Łukasz Gajowy <lgaj...@apache.org> wrote:
> >
> > Cool. :) Congrats and thank you for your work!
> >
> > Łukasz
> >
> > On Thu, Mar 21, 2019 at 6:51 PM Kenneth Knowles <k...@apache.org> wrote:
> >>
> >> Nice milestone!
> >>
> >> On Thu, Mar 21, 2019 at 10:49 AM Pablo Estrada <pabl...@google.com> wrote:
> >>>
> >>> This is pretty cool. Thanks for working on this and for sharing:)
> >>> Best
> >>> -P.
> >>>
> >>> On Thu, Mar 21, 2019, 8:18 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
> >>>>
> >>>> Good job! =)
> >>>> Congrats to all who was involved to move this forward!
> >>>>
> >>>> Btw, for all who is interested in a progress of work on this runner, I
> >>>> wanted to remind that we have #beam-spark channel on Slack where we
> >>>> discuss all ongoing questions. Feel free to join!
> >>>>
> >>>> Alexey
> >>>>
> >>>> > On 21 Mar 2019, at 15:51, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >>>> >
> >>>> > Congrats and huge thanks !
> >>>> >
> >>>> > (I'm glad to be one of the little "launcher" to this effort ;) )
> >>>> >
> >>>> > Regards
> >>>> > JB
> >>>> >
> >>>> > On 21/03/2019 15:47, Ismaël Mejía wrote:
> >>>> >> This is excellent news. Congrats Etienne, Alexey and the others
> >>>> >> involved for the great work!
> >>>> >> On Thu, Mar 21, 2019 at 3:10 PM Etienne Chauchot <echauc...@apache.org> wrote:
> >>>> >>>
> >>>> >>> Hi guys,
> >>>> >>>
> >>>> >>> We are glad to announce that the spark runner POC that was
> >>>> >>> re-written from scratch using the structured-streaming framework and
> >>>> >>> the dataset API can now run WordCount !
> >>>> >>>
> >>>> >>> It is still embryonic. For now it only runs in batch mode and there
> >>>> >>> is no fancy stuff like state, timer, SDF, metrics, ... but it is
> >>>> >>> still a major step forward !
> >>>> >>>
> >>>> >>> Streaming support work has just started.
> >>>> >>>
> >>>> >>> You can find the branch here:
> >>>> >>> https://github.com/apache/beam/tree/spark-runner_structured-streaming
> >>>> >>>
> >>>> >>> Enjoy,
> >>>> >>>
> >>>> >>> Etienne