Pablo Estrada created BEAM-7104:
-----------------------------------

             Summary: Document Deployment of Flink and Spark Clusters with Portable Beam
                 Key: BEAM-7104
                 URL: https://issues.apache.org/jira/browse/BEAM-7104
             Project: Beam
          Issue Type: Improvement
          Components: website
            Reporter: Pablo Estrada


The Apache Beam vision has been to provide a framework that lets users write 
and execute pipelines in the programming language of their choice, on the 
runner of their choice. As Beam has moved toward this vision, the way in which 
Beam runs on top of runners such as Apache Spark and Apache Flink has changed.

These changes are documented in the wiki and in design documents, which are 
accessible to Beam contributors, but they are not covered in the user-facing 
documentation. This has been a barrier to adoption for other users of Beam.

This project involves improving the Flink Runner page[1] to include strategies 
for deploying Beam in a few different environments: a Kubernetes cluster, a 
Google Cloud Dataproc cluster, and an AWS EMR cluster. Other places in the 
documentation should also be updated in this regard[4][5].

Once the Flink Runner page has been updated, similar updates should be made to 
the Spark Runner page[2] and the getting started documentation[3].

[1] https://beam.apache.org/documentation/runners/flink/ 
[2] https://beam.apache.org/documentation/runners/spark/
[3] https://beam.apache.org/get-started/beam-overview/
[4] https://beam.apache.org/documentation/sdks/python-streaming/
[5] https://beam.apache.org/documentation/sdks/python-streaming/#unsupported-features



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
