Re: Beam support on Spark 2.x

Jean-Baptiste Onofré Fri, 10 Nov 2017 04:55:05 -0800

Hi,

I guess you are not following the dev mailing list.

The Spark runner supports almost all transforms, and yes, you can fully use the Spark runner to run your pipelines.


A PCollection is represented as an RDD, and the runner currently uses Spark 1.x.

I'm working on Spark 2.x support (still using RDDs): we have a vote in progress on the mailing list on whether to support both Spark 1.x and Spark 2.x, or just upgrade to Spark 2.x and drop support for Spark 1.x.


You can take a look at the beam-samples: they all run using the Spark runner.

Regards
JB

On 11/10/2017 01:46 PM, Artur Mrozowski wrote:

Hi,
I have seen the compatibility matrix and I realize that Spark is not the most supported runner. I am curious whether it is possible to run a pipeline on Spark, say with global windows, after processing triggers and group by key (CoGroupByKey, CombineByKey). We definitely have problems executing a pipeline that runs successfully on the direct runner.
Is that a known issue? Is Flink the best option?

Best Regards
Artur


--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com
