Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

Robert Bradshaw Thu, 09 Nov 2017 17:04:21 -0800

On Thu, Nov 9, 2017 at 11:05 AM, Kenneth Knowles <[email protected]> 
wrote:
> I think it makes sense to communicate with email to users@ and in the
> release notes of 2.2.0.


Totally agree.

> That communication should be specific and indicate
> whether we are planning to merely not work on it anymore or actually remove
> it in 2.3.0.

There seems to be some ambiguity in this vote as to which of these two
options we're actually considering. I'm certainly +1 on relegating it
to maintenance mode at least. I don't have a good sense of the burden
of keeping it around, nor of the number of potential (current?) users
we'd be alienating, which seem to be the driving factors. The fact
that all major distributions ship 2.x is a very different question
from whether most users have migrated to 2.x.

> On Thu, Nov 9, 2017 at 6:35 AM, Amit Sela <[email protected]> wrote:
>
>> +1 for dropping Spark 1 support.
>> I don't think we have enough users to justify supporting both, and it's been
>> a long time since this idea originally came up (when Spark 2 wasn't stable)
>> and now Spark 2 is standard in all Hadoop distros.
>> As for switching to the Dataframe API, as long as Spark 2 doesn't support
>> scanning through the state periodically (even if no data for a key),
>> watermarks won't fire keys that didn't see updates.
>>
>> On Thu, Nov 9, 2017 at 9:12 AM Thomas Weise <[email protected]> wrote:
>>
>> > +1 (non-binding) for dropping 1.x support
>> >
>> > I don't have the impression that there is significant adoption for Beam
>> > on Spark 1.x? A stronger Spark runner that works well on 2.x will be better
>> > for Beam adoption than a runner that has to compromise due to 1.x
>> > baggage. Development efforts can go into improving the runner.
>> >
>> > Thanks,
>> > Thomas
>> >
>> >
>> > On Thu, Nov 9, 2017 at 4:08 AM, Srinivas Reddy <[email protected]> wrote:
>> >
>> > > +1
>> > >
>> > >
>> > >
>> > > --
>> > > Srinivas Reddy
>> > >
>> > > http://mrsrinivas.com/
>> > >
>> > >
>> > > (Sent via gmail web)
>> > >
>> > > On 8 November 2017 at 14:27, Jean-Baptiste Onofré <[email protected]> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > as you might know, we are working on Spark 2.x support in the Spark
>> > > > runner.
>> > > >
>> > > > I'm working on a PR about that:
>> > > >
>> > > > https://github.com/apache/beam/pull/3808
>> > > >
>> > > > Today, we have something working with both Spark 1.x and 2.x from a
>> > > > code standpoint, but I have to deal with dependencies. It's the first
>> > > > step of the update, as I'm still using RDD; the second step would be to
>> > > > support dataframe (but for that, I would need PCollection elements with
>> > > > schemas, which is another topic that Eugene, Reuven and I are discussing).
>> > > >
>> > > > However, as all major distributions now ship Spark 2.x, I don't think
>> > > > it's required anymore to support Spark 1.x.
>> > > >
>> > > > If we agree, I will update and clean up the PR to only support and
>> > > > focus on Spark 2.x.
>> > > >
>> > > > So, that's why I'm calling for a vote:
>> > > >
>> > > >   [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>> > > >   [ ] 0 (I don't care ;))
>> > > >   [ ] -1, I would like to still support Spark 1.x, and so having
>> > > >       support of both Spark 1.x and 2.x (please provide specific comment)
>> > > >
>> > > > This vote is open for 48 hours (I have the commits ready, just
>> > > > waiting for the end of the vote to push on the PR).
>> > > >
>> > > > Thanks !
>> > > > Regards
>> > > > JB
>> > > > --
>> > > > Jean-Baptiste Onofré
>> > > > [email protected]
>> > > > http://blog.nanthrax.net
>> > > > Talend - http://www.talend.com
>> > > >
>> > >
>> >
>>
