Lei (Eddy) Xu created BEAM-6296:
-----------------------------------
Summary: Support Python Spark Runner
Key: BEAM-6296
URL: https://issues.apache.org/jira/browse/BEAM-6296
Project: Beam
Issue Type: New Feature
Components: runner-spark
Affects Versions: 2.9.0
Reporter: Lei (Eddy) Xu
Assignee: Amit Sela
Hello, everyone,
It would be great to have the Spark runner available to the Python SDK.
While we are happy running Apache Beam on Dataflow, a few of our use cases
require different dependencies and a different OS environment, which makes a
self-managed Spark cluster a more appropriate place to run them. With a Spark
runner for the Python SDK, we would have the option to define all of our data
pipelines in a single language.
I would like to hear the community's feedback on this feature.