Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-10 Thread Jean-Baptiste Onofré
I think so ;) Regards JB On 11/10/2017 09:29 AM, Reuven Lax wrote: Sounds good. I doubt we will have much opposition from users, in which case Beam 2.3.0 can deprecate Spark 1.x. On Thu, Nov 9, 2017 at 11:54 PM, Jean-Baptiste Onofré wrote: Hi all, thanks a lot for all your feedback. The tr

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-10 Thread Reuven Lax
Sounds good. I doubt we will have much opposition from users, in which case Beam 2.3.0 can deprecate Spark 1.x. On Thu, Nov 9, 2017 at 11:54 PM, Jean-Baptiste Onofré wrote: > Hi all, > > thanks a lot for all your feedback. > > The trend is to upgrade to Spark 2.x and drop Spark 1.x support.

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Jean-Baptiste Onofré
Hi all, thanks a lot for all your feedback. The trend is to upgrade to Spark 2.x and drop Spark 1.x support. However, some of you (especially Reuven and Robert) commented that users have to be pinged as well. It makes perfect sense, and it was my intention. I propose the following acti

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Robert Bradshaw
On Thu, Nov 9, 2017 at 11:05 AM, Kenneth Knowles wrote: > I think it makes sense to communicate with email to users@ and in the > release notes of 2.2.0. Totally agree. > That communication should be specific and indicate > whether we are planning to merely not work on it anymore or actually re

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Reuven Lax
+1 from me. However, let's notify users@ first. If we do get a lot of pushback from users (which I doubt we will), we might reconsider dropping Spark 1 support. On Thu, Nov 9, 2017 at 11:05 AM, Kenneth Knowles wrote: > +1 from me, with a friendly deprecation process > > I am convinced by the foll

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Kenneth Knowles
+1 from me, with a friendly deprecation process. I am convinced by the following: - We don't have the resources to make both great, and anyhow it isn't worth it - People keeping up with Beam releases are likely to be keeping up with Spark as well - Spark 1 users already have a Spark 1 runner fo

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Amit Sela
+1 for dropping Spark 1 support. I don't think we have enough users to justify supporting both, and it's been a long time since this idea originally came up (when Spark 2 wasn't stable), and now Spark 2 is standard in all Hadoop distros. As for switching to the DataFrame API, as long as Spark 2 doesn'

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Thomas Weise
+1 (non-binding) for dropping 1.x support. I don't have the impression that there is significant adoption of Beam on Spark 1.x. A stronger Spark runner that works well on 2.x will be better for Beam adoption than a runner that has to compromise due to 1.x baggage. Development efforts can go into

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Srinivas Reddy
+1 -- Srinivas Reddy http://mrsrinivas.com/ (Sent via gmail web) On 8 November 2017 at 14:27, Jean-Baptiste Onofré wrote: > Hi all, > > as you might know, we are working on Spark 2.x support in the Spark runner. > > I'm working on a PR about that: > > https://github.com/apache/beam/pull/38

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-09 Thread Ismaël Mejía
+1 for the move to Spark 2 modulo notifying users and deciding on support: I agree that having compatibility for both versions of Spark is desirable but I am not sure if it is worth the effort. Apart from the reasons mentioned by Holden and Pei, I will add that the burden of simultaneous maintenance c

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Pei HE
+1 on moving forward with Spark 2.x only. Spark 1 users can still use already released Spark runners, and we can support them with minor version releases for future bug fixes. I don't see how important it is to make future Beam releases available to Spark 1 users. If they choose not to upgrade Spa

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Holden Karau
That's a good point about Oozie supporting only Spark 1 or 2 at a time on a cluster -- but do we know people using Oozie and Spark 1 that would still be using Spark 1 by the time of the next Beam release? The last Spark 1 release was a year ago (and last non-maintenance release almost 20

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread NerdyNick
I don't know if ditching Spark 1 outright right now would be a great move, given that a lot of the main support applications around Spark haven't fully moved to Spark 2 yet, let alone have support for a cluster with both. Oozie, for example, is still pre stable release for their Spark 1 an

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Holden Karau
Also, upgrading Spark 1 to 2 is generally easier than changing JVM versions. For folks using YARN or the hosted environments it's pretty much trivial since you can effectively have distinct Spark clusters for each job. On Wed, Nov 8, 2017 at 9:19 PM, Holden Karau wrote: > I'm +1 on dropping Spark

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Holden Karau
I'm +1 on dropping Spark 1. There are a lot of exciting improvements in Spark 2, and trying to write efficient code that runs between Spark 1 and Spark 2 is super painful in the long term. It would be one thing if there were a lot of people available to work on the Spark runners, but it seems like

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Ted Yu
Having both Spark 1 and Spark 2 modules would benefit a wider user base. I would vote for that. Cheers On Wed, Nov 8, 2017 at 12:51 AM, Jean-Baptiste Onofré wrote: > Hi Robert, > > Thanks for your feedback! > > From a user perspective, with the current state of the PR, the same > pipelines can r

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Jean-Baptiste Onofré
Hi Robert, Thanks for your feedback! From a user perspective, with the current state of the PR, the same pipelines can run on both Spark 1.x and 2.x: the only difference is the set of dependencies. I'm calling the vote to get this kind of feedback: if we consider Spark 1.x still needs to be su

Re: [VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Robert Bradshaw
I'm generally a -0.5 on this change, or at least doing so hastily. As with dropping Java 7 support, I think this should at least be announced in release notes that we're considering dropping support in the subsequent release, as this dev list likely does not reach a substantial portion of the user

[VOTE] Drop Spark 1.x support to focus on Spark 2.x

2017-11-08 Thread Jean-Baptiste Onofré
Hi all, as you might know, we are working on Spark 2.x support in the Spark runner. I'm working on a PR about that: https://github.com/apache/beam/pull/3808 Today, we have something working with both Spark 1.x and 2.x from a code standpoint, but I have to deal with dependencies. It's the firs