Hi all,

Quick update about the Spark 2.x runner: I have updated the PR to cover the Spark 2.x update only:

https://github.com/apache/beam/pull/3808

I will rebase and run new tests as soon as gitbox is back.

Don't hesitate to take a look and review.

Thanks!
Regards
JB

On 11/21/2017 08:32 AM, Jean-Baptiste Onofré wrote:
Hi Tim,

I will update the PR today for a new review round. Yes, you are correct: the target is 2.3.0, at the end of this year (with an announcement in the release notes).

Regards
JB

On 11/20/2017 10:09 PM, Tim wrote:
Thanks JB

From which release will Spark 1.x be dropped, please? Is this slated for 2.3.0 at the end of the year?

Thanks,
Tim
Sent from my iPhone

On 20 Nov 2017, at 21:21, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

Hi,
it seems we have a consensus to upgrade to Spark 2.x, dropping Spark 1.x. I will upgrade the PR accordingly.

Thanks all for your input and feedback.

Regards
JB

On 11/13/2017 09:32 AM, Jean-Baptiste Onofré wrote:
Hi Beamers,
I'm forwarding this discussion & vote from the dev mailing list to the user mailing list.
The goal is to have your feedback as user.
Basically, we have two options:
1. Right now, in the PR, we support both Spark 1.x and 2.x using three artifacts (common, spark1, spark2). You, as users, pick spark1 or spark2 in your dependency set, depending on the Spark version you target.
2. The other option is to upgrade and focus on Spark 2.x only, starting with Beam 2.3.0. If you still want to use Spark 1.x, you will be stuck at Beam 2.2.0.
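
For illustration, here is a minimal sketch of the user-side code, assuming the Beam Java SDK plus whichever Spark runner artifact you picked is on the classpath (class and option names are the current Spark runner ones; the PR's final artifact layout may differ). The point is that the pipeline code stays the same under both options; only the dependency changes:

  import org.apache.beam.runners.spark.SparkPipelineOptions;
  import org.apache.beam.runners.spark.SparkRunner;
  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.options.PipelineOptionsFactory;
  import org.apache.beam.sdk.transforms.Create;

  public class SparkRunnerSketch {
    public static void main(String[] args) {
      // Parse command-line flags and bind them to the Spark runner options.
      SparkPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
          .withValidation().as(SparkPipelineOptions.class);
      options.setRunner(SparkRunner.class);   // run on Spark rather than the direct runner
      options.setSparkMaster("local[2]");     // or a cluster master URL

      // The pipeline itself is independent of the Spark version backing the runner.
      Pipeline p = Pipeline.create(options);
      p.apply(Create.of("hello", "beam", "on", "spark"));
      p.run().waitUntilFinish();
    }
  }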
Thoughts?
Thanks!
Regards
JB
-------- Forwarded Message --------
Subject: [VOTE] Drop Spark 1.x support to focus on Spark 2.x
Date: Wed, 8 Nov 2017 08:27:58 +0100
From: Jean-Baptiste Onofré <j...@nanthrax.net>
Reply-To: dev@beam.apache.org
To: dev@beam.apache.org
Hi all,
as you might know, we are working on Spark 2.x support in the Spark runner.
I'm working on a PR about that:
https://github.com/apache/beam/pull/3808
Today, we have something working with both Spark 1.x and 2.x from a code standpoint, but I have to deal with dependencies.

It's the first step of the update, as I'm still using RDDs; the second step would be to support dataframes (but for that, I would need PCollection elements with schemas, which is another topic that Eugene, Reuven, and I are discussing).

However, as all major distributions now ship Spark 2.x, I don't think it's necessary to support Spark 1.x anymore. If we agree, I will update and clean up the PR to support and focus on Spark 2.x only.
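
To make the RDD vs. dataframe distinction concrete, here is a small, self-contained Spark 2.x sketch (plain Spark code, not Beam or PR code; class and column names are illustrative) showing why the dataframe path needs a schema describing the elements while the RDD path does not:

  import java.util.Arrays;
  import java.util.List;
  import org.apache.spark.api.java.JavaRDD;
  import org.apache.spark.api.java.JavaSparkContext;
  import org.apache.spark.sql.Dataset;
  import org.apache.spark.sql.Row;
  import org.apache.spark.sql.RowFactory;
  import org.apache.spark.sql.SparkSession;
  import org.apache.spark.sql.types.DataTypes;
  import org.apache.spark.sql.types.StructType;

  public class RddVsDataframe {
    public static void main(String[] args) {
      SparkSession spark = SparkSession.builder()
          .master("local[2]").appName("rdd-vs-dataframe").getOrCreate();
      JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

      // RDD path (what the runner uses today): elements are opaque Java
      // objects, so no schema is required.
      JavaRDD<String> words = jsc.parallelize(Arrays.asList("hello", "beam"));
      System.out.println(words.count());

      // Dataframe path (the possible second step): Spark needs an explicit
      // schema for the elements, hence the need for schemas on PCollection
      // elements before the runner could translate to Dataset<Row>.
      StructType schema = new StructType().add("word", DataTypes.StringType);
      List<Row> rows = Arrays.asList(RowFactory.create("hello"), RowFactory.create("beam"));
      Dataset<Row> df = spark.createDataFrame(rows, schema);
      df.show();

      spark.stop();
    }
  }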
So, that's why I'm calling for a vote:
   [ ] +1 to drop Spark 1.x support and upgrade to Spark 2.x only
   [ ] 0 (I don't care ;))
   [ ] -1, I would like to still support Spark 1.x, and so keep support for both Spark 1.x and 2.x (please provide a specific comment)

This vote is open for 48 hours (I have the commits ready, just waiting for the end of the vote to push them to the PR).
Thanks!
Regards
JB

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com
