kennknowles opened a new issue, #19549:
URL: https://github.com/apache/beam/issues/19549

   The Apache Beam vision has been to provide a framework for users to write 
and execute pipelines in the programming language of their choice, on the 
runner of their choice. As the reality of Beam has evolved towards this vision, 
the way in which Beam is run on top of runners such as Apache Spark and Apache 
Flink has changed.
   
   These changes are documented in the wiki and in design documents, and are 
accessible to Beam contributors, but they are not available in the user-facing 
documentation. This has been a barrier to adoption for other users of Beam.
   
   This project involves improving the Flink Runner page[1] to include 
strategies for deploying Beam on a few different environments: a Kubernetes 
cluster, a Google Cloud Dataproc cluster, and an AWS EMR cluster. Other 
places in the documentation should also be updated in this regard[4][5].
   
   Once the Flink Runner page is updated, similar updates should be made to 
the Spark Runner page[2] and the getting started documentation[3].
   
   [1] https://beam.apache.org/documentation/runners/flink/ 
   [2] https://beam.apache.org/documentation/runners/spark/
   [3] https://beam.apache.org/get-started/beam-overview/
   [4] https://beam.apache.org/documentation/sdks/python-streaming/
   [5] https://beam.apache.org/documentation/sdks/python-streaming/#unsupported-features
   
   Imported from Jira 
[BEAM-7104](https://issues.apache.org/jira/browse/BEAM-7104). Original Jira may 
contain additional context.
   Reported by: pabloem.


