[ 
https://issues.apache.org/jira/browse/BEAM-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183305#comment-15183305
 ] 

Ryan Brush commented on BEAM-11:
--------------------------------

Not sure if others are working on this, but the commit linked below is probably 
the smallest possible change to get spark-dataflow running with the current 
Beam code. 

Here it is: 
https://github.com/rbrush/spark-dataflow/commit/0a11d747eeb6bb47bb46e179deca4c85a9d5cf33

The runner needs quite a bit more work before it's broadly usable; see 
the ugly "TODO" around state internals in the commit. So perhaps the best path 
forward is to create a development branch of Beam that includes the 
spark-dataflow runner, where we can improve it? Once it's in a better state we 
can squash/rebase (or follow whatever conventions this project uses) to get a 
clean merge into master.

I'm happy to create the branch if desired (although I lack commit privs), or 
feel free to just grab the code from the above commit if it makes sense.



> Integrate Spark runner with Beam
> --------------------------------
>
>                 Key: BEAM-11
>                 URL: https://issues.apache.org/jira/browse/BEAM-11
>             Project: Beam
>          Issue Type: Task
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> Refactor and integrate the Spark runner code against Google's contributed 
> version of Dataflow - Beam.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
