+1 for dropping Spark 1 support.
I don't think we have enough users to justify supporting both, and it's been
a long time since this idea originally came up (when Spark 2 wasn't stable);
Spark 2 is now standard in all Hadoop distros.
As for switching to the DataFrame API: as long as Spark 2 doesn't support
scanning through the state periodically (even when there is no new data for
a key), watermarks won't fire for keys that didn't see updates.
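
To make the concern concrete, here is a toy Python sketch. It is not Spark
or Beam code; the function names and the state layout (a dict from key to
the event-time end of its open window) are hypothetical, made up only for
illustration. It contrasts visiting only the keys that received data in a
batch with periodically scanning all of the state:

```python
def keys_fired_on_update_only(state, updated_keys, watermark):
    """Check only keys that received data in this batch (hypothetical model)."""
    return sorted(
        key for key in updated_keys
        if key in state and state[key] <= watermark
    )


def keys_fired_with_full_scan(state, watermark):
    """Check every key held in state, whether or not it saw new data."""
    return sorted(key for key, window_end in state.items() if window_end <= watermark)


# Both keys have a window ending at event time 10.
state = {"a": 10, "b": 10}
# Only key "a" receives data in the current batch; the watermark is now 12.
updated_keys = {"a"}
watermark = 12

update_only = keys_fired_on_update_only(state, updated_keys, watermark)
full_scan = keys_fired_with_full_scan(state, watermark)
```

In this model, the window for "b" is already past the watermark, but it only
shows up in the full-scan result; the update-driven pass never looks at it.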

On Thu, Nov 9, 2017 at 9:12 AM Thomas Weise <t...@apache.org> wrote:

> +1 (non-binding) for dropping 1.x support
>
> I don't have the impression that there is significant adoption for Beam on
> Spark 1.x ? A stronger Spark runner that works well on 2.x will be better
> for Beam adoption than a runner that has to compromise due to 1.x baggage.
> Development efforts can go into improving the runner.
>
> Thanks,
> Thomas
>
>
> On Thu, Nov 9, 2017 at 4:08 AM, Srinivas Reddy <srinivas96all...@gmail.com>
> wrote:
>
> > +1
> >
> >
> >
> > --
> > Srinivas Reddy
> >
> > http://mrsrinivas.com/
> >
> >
> > (Sent via gmail web)
> >
> > On 8 November 2017 at 14:27, Jean-Baptiste Onofré <j...@nanthrax.net>
> > wrote:
> >
> > > Hi all,
> > >
> > > as you might know, we are working on Spark 2.x support in the Spark
> > > runner.
> > >
> > > I'm working on a PR about that:
> > >
> > > https://github.com/apache/beam/pull/3808
> > >
> > > Today, we have something working with both Spark 1.x and 2.x from a
> > > code standpoint, but I have to deal with dependencies. It's the first
> > > step of the update as I'm still using RDD, the second step would be to
> > > support dataframe (but for that, I would need PCollection elements with
> > > schemas, that's another topic on which Eugene, Reuven and I are
> > > discussing).
> > >
> > > However, as all major distributions now ship Spark 2.x, I don't think
> > > it's required anymore to support Spark 1.x.
> > >
> > > If we agree, I will update and cleanup the PR to only support and focus
> > > on Spark 2.x.
> > >
> > > So, that's why I'm calling for a vote:
> > >
> > >   [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
> > >   [ ] 0 (I don't care ;))
> > >   [ ] -1, I would like to still support Spark 1.x, and so having
> > >       support of both Spark 1.x and 2.x (please provide specific comment)
> > >
> > > This vote is open for 48 hours (I have the commits ready, just waiting
> > > the end of the vote to push on the PR).
> > >
> > > Thanks !
> > > Regards
> > > JB
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>
