    [ ] Use Spark 1 & Spark 2 Support Branch
    [X] Use Spark 2 Only Branch

On Thu, Nov 16, 2017 at 5:08 AM, Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi guys,
>
> To illustrate the current discussion about Spark version support, you can
> take a look at:
>
> --
> Spark 1 & Spark 2 Support Branch
>
> https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-MODULES
>
> This branch contains a common Spark runner module compatible with both
> Spark 1.x and 2.x. For convenience, we introduced spark1 & spark2
> modules/artifacts containing just a pom.xml that defines the dependency set.
>
> --
> Spark 2 Only Branch
>
> https://github.com/jbonofre/beam/tree/BEAM-1920-SPARK2-ONLY
>
> This branch upgrades to Spark 2.x and "drops" support for Spark 1.x.
>
> As I'm ready to merge one or the other in the PR, I would like to complete
> the vote/discussion pretty soon.
>
> Correct me if I'm wrong, but it seems that the preference is to drop Spark
> 1.x and focus only on Spark 2.x (that is, the Spark 2 Only Branch).
>
> I would like to call a final vote to decide which merge I will do:
>
>     [ ] Use Spark 1 & Spark 2 Support Branch
>     [ ] Use Spark 2 Only Branch
>
> This informal vote is open for 48 hours.
>
> Please let me know what your preference is.
>
> Thanks !
> Regards
> JB
>
> On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
>
>> Hi Beamers,
>>
>> I'm forwarding this discussion & vote from the dev mailing list to the
>> user mailing list.
>> The goal is to get your feedback as users.
>>
>> Basically, we have two options:
>> 1. Right now, in the PR, we support both Spark 1.x and 2.x using three
>> artifacts (common, spark1, spark2). You, as users, pick spark1 or spark2
>> in your dependency set depending on the Spark version you target.
>> 2. The other option is to upgrade and focus on Spark 2.x in Beam 2.3.0.
>> If you still want to use Spark 1.x, you will be limited to Beam 2.2.0
>> and earlier.
>>
>> Thoughts ?
>>
>> Thanks !
>> Regards
>> JB
>>
>>
>> -------- Forwarded Message --------
>> Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
>> Date: Wed, 8 Nov 2017 08:27:58 +0100
>> From: Jean-Baptiste Onofré <j...@nanthrax.net>
>> Reply-To: dev@beam.apache.org
>> To: dev@beam.apache.org
>>
>> Hi all,
>>
>> as you might know, we are working on Spark 2.x support in the Spark
>> runner.
>>
>> I'm working on a PR about that:
>>
>> https://github.com/apache/beam/pull/3808
>>
>> Today, we have something working with both Spark 1.x and 2.x from a code
>> standpoint, but I have to deal with dependencies. It's the first step of
>> the update, as I'm still using RDDs; the second step would be to support
>> DataFrames (but for that, I would need PCollection elements with schemas,
>> which is another topic that Eugene, Reuven and I are discussing).
>>
>> However, as all major distributions now ship Spark 2.x, I don't think
>> it's required anymore to support Spark 1.x.
>>
>> If we agree, I will update and clean up the PR to only support and focus
>> on Spark 2.x.
>>
>> So, that's why I'm calling for a vote:
>>
>>    [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
>>    [ ] 0 (I don't care ;))
>>    [ ] -1, I would still like to support Spark 1.x, and so have support
>> for both Spark 1.x and 2.x (please provide a specific comment)
>>
>> This vote is open for 48 hours (I have the commits ready, just waiting
>> for the end of the vote to push to the PR).
>>
>> Thanks !
>> Regards
>> JB
>>
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
