BIG +1 JB. If we can just bump the version number with minor changes, staying as close as possible to the current implementation for Spark 1, we can go faster and offer, in principle, the exact same support but for Spark 2.
I know that the advanced streaming stuff based on the DataSet API won't be there, but with this common canvas the community can iterate to create a DataSet-based translator at the same time. In particular, I think the most important thing is that the Spark 2 branch should not live for a long time; it should be merged into master really fast for the benefit of everybody.

Ismaël

On Wed, Mar 15, 2017 at 1:57 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> Hi Amit,
>
> What do you think of the following:
>
> - in the mean time that you reintroduce the Spark 2 branch, what about
> "extending" the version in the current Spark runner ? Still using
> RDD/DStream, I think we can support Spark 2.x even if we don't yet leverage
> the new provided features.
>
> Thoughts ?
>
> Regards
> JB
>
>
> On 03/15/2017 07:39 PM, Amit Sela wrote:
>>
>> Hi Cody,
>>
>> I will re-introduce this branch soon as part of the work on BEAM-913
>> <https://issues.apache.org/jira/browse/BEAM-913>.
>> For now, and from previous experience with the mentioned branch, batch
>> implementation should be straight-forward.
>> Only issue is with streaming support - in the current runner (Spark 1.x) we
>> have experimental support for windows/triggers and we're working towards
>> full streaming support.
>> With Spark 2.x, there is no "general-purpose" stateful operator for the
>> Dataset API, so I was waiting to see if the new operator
>> <https://github.com/apache/spark/pull/17179> planned for next version could
>> help with that.
>>
>> To summarize, I will introduce a skeleton for the Spark 2 runner with batch
>> support as soon as I can as a separate branch.
>>
>> Thanks,
>> Amit
>>
>> On Wed, Mar 15, 2017 at 9:07 AM Cody Innowhere <e.neve...@gmail.com> wrote:
>>
>>> Hi guys,
>>> Is there anybody who's currently working on Spark 2.x runner? A old PR for
>>> spark 2.x runner was closed a few days ago, so I wonder what's the status
>>> now, and is there a roadmap for this?
>>> Thanks~
>>>
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com