[ https://issues.apache.org/jira/browse/BEAM-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183305#comment-15183305 ]
Ryan Brush commented on BEAM-11:
--------------------------------
Not sure if others are working on this, but the following commit is probably
the smallest possible change to get spark-dataflow running against the current
Beam code:
https://github.com/rbrush/spark-dataflow/commit/0a11d747eeb6bb47bb46e179deca4c85a9d5cf33
The runner needs quite a bit more work before it's broadly usable; see the
ugly "TODO" around state internals in the commit. So perhaps the best path
forward is to create a development branch of Beam that includes the
spark-dataflow runner and improve on it there. Once it's in a better state we
can squash/rebase (or whatever conventions this project follows) to get a clean
merge into master.
I'm happy to create the branch if desired (although I lack commit privileges);
otherwise, feel free to just grab the code from the above commit if that makes
sense.
> Integrate Spark runner with Beam
> --------------------------------
>
> Key: BEAM-11
> URL: https://issues.apache.org/jira/browse/BEAM-11
> Project: Beam
> Issue Type: Task
> Components: runner-spark
> Reporter: Amit Sela
> Assignee: Amit Sela
>
> Refactor and integrate the Spark runner code against Google's contributed
> version of Dataflow - Beam.