Any objections or comments from Spark 2 users on this topic?

—
Alexey


On 20 Apr 2022, at 19:17, Alexey Romanenko <aromanenko....@gmail.com> wrote:

Hi everyone,

A while ago, we discussed on dev@ that there are several reasons to stop 
supporting Spark 2 in the Spark Runner (in all the variants we currently 
have: RDD, Dataset, Portable) [1]. In short, it adds a maintenance burden to 
the Spark Runner that we would like to avoid in the future.

From the developers’ perspective, I don’t see any objections to this. So I’d 
like to know whether there are users who still use Spark 2 for their Beam 
pipelines and for whom it is critical to keep using it. 

Please share your opinion on this!

—
Alexey

[1] https://lists.apache.org/thread/opfhg3xjb9nptv878sygwj9gjx38rmco

> On 31 Mar 2022, at 17:51, Alexey Romanenko <aromanenko....@gmail.com> wrote:
> 
> Hi everyone,
> 
> At the moment, the Beam Spark Runner supports two versions of Spark: 2.x 
> and 3.x. 
> 
> Taking into account that:
> - almost all cloud providers have already moved to Spark 3.x as their main 
> supported version;
> - the latest Spark 2.x release (Spark 2.4.8, a maintenance release) was made 
> almost a year ago;
> - Spark 3 is considered the mainstream Spark version for development and bug 
> fixing;
> - it is better to avoid the burden of maintaining two versions (there are 
> some incompatibilities between Spark 2 and 3); 
> 
> I’d suggest that we stop supporting Spark 2 in the Spark Runner in one of 
> the next Beam releases. 
> 
> What are your thoughts on this? Are there any fundamental objections or 
> reasons for not doing this that I may have missed?
> 
> —
> Alexey 
> 
> 
