[ 
https://issues.apache.org/jira/browse/BEAM-7632?focusedWorklogId=267197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-267197
 ]

ASF GitHub Bot logged work on BEAM-7632:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Jun/19 01:22
            Start Date: 26/Jun/19 01:22
    Worklog Time Spent: 10m 
      Work Description: melap commented on pull request #8949: [BEAM-7632] Update Python quickstart guide for Flink and Spark
URL: https://github.com/apache/beam/pull/8949#discussion_r297452381
 
 

 ##########
 File path: website/src/roadmap/portability.md
 ##########
 @@ -144,25 +144,34 @@ their respective components.
 
 MVP, and FeatureCompletness nearly done (missing SDF, timers) for
 SDKs, Python ULR, and shared java runners library.
-Flink is the first runner to fully leverage this, with focus moving to
-Performance.
+Currently, the Flink and Spark runners support portable pipeline execution.
 See the
 [Portability support table](https://s.apache.org/apache-beam-portability-support-table)
 for details.
 
-### Running Python wordcount on Flink or Spark {#python-on-flink}
+### Running Python wordcount on Flink {#python-on-flink}
 
-Currently, the Flink and Spark runners support portable pipeline execution.
-To run a basic Python wordcount (in batch mode) with embedded Flink or Spark:
+To run a basic Python wordcount (in batch mode) with embedded Flink:
 
 1. Run once to build the SDK harness container: `./gradlew :sdks:python:container:docker`
-2. Choose one:
- * Start the Flink portable JobService endpoint: `./gradlew :runners:flink:1.5:job-server:runShadow`
- * Or start the Spark portable JobService endpoint: `./gradlew :runners:spark:job-server:runShadow`
-3. Submit the wordcount pipeline to above endpoint: `./gradlew :sdks:python:portableWordCount -PjobEndpoint=localhost:8099 -PenvironmentType=LOOPBACK`
+2. Start the Flink portable JobService endpoint: `./gradlew :runners:flink:1.5:job-server:runShadow`
+3. In a new terminal, submit the wordcount pipeline to above endpoint: `./gradlew :sdks:python:portableWordCount -PjobEndpoint=localhost:8099 -PenvironmentType=LOOPBACK`
 
-To run the pipeline in streaming mode (currently only supported on Flink): `./gradlew :sdks:python:portableWordCount -PjobEndpoint=localhost:8099 -Pstreaming`
+To run the pipeline in streaming mode: `./gradlew :sdks:python:portableWordCount -PjobEndpoint=localhost:8099 -Pstreaming`
 
 Please see the [Flink Runner page]({{ site.baseurl }}/documentation/runners/flink/) for more information on
 how to run portable pipelines on top of Flink.
 
+### Running Python wordcount on Spark {#python-on-spark}
+
+To run a basic Python wordcount (in batch mode) with embedded Spark:
+
+1. Run once to build the SDK harness container: `./gradlew :sdks:python:container:docker`
+2. Start the Spark portable JobService endpoint: `./gradlew :runners:spark:job-server:runShadow`
+3. In a new terminal, submit the wordcount pipeline to above endpoint: `./gradlew :sdks:python:portableWordCount -PjobEndpoint=localhost:8099 -PenvironmentType=LOOPBACK`
+
+Python streaming mode is not yet supported on the Spark.
 
 Review comment:
   Perhaps either remove "the", or add "runner" to the end so that it reads "the Spark runner".
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 267197)
    Time Spent: 0.5h  (was: 20m)

> Update Python quickstart guide for Flink and Spark
> --------------------------------------------------
>
>                 Key: BEAM-7632
>                 URL: https://issues.apache.org/jira/browse/BEAM-7632
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the documentation says "This runner is not yet available for the 
> Python SDK.", which is out of date. 
> [https://beam.apache.org/get-started/quickstart-py/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
