We recently checked in the last few changes needed to support streaming pipelines on the Beam Python DirectRunner (BEAM-1265 <https://issues.apache.org/jira/browse/BEAM-1265>). This support has been at HEAD for the past one to two weeks and is included in the 2.1.0 RC, so Python SDK users can now write pipelines in streaming mode and run them locally on their own machines.
Check out the streaming wordcount example here (streaming_wordcount.py <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/streaming_wordcount.py>) and please kick the tires, try out the new functionality and report any bugs you may encounter. Use the "--streaming" PipelineOption to enable this new functionality.

Currently, the I/Os supported are the TestStream <https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/test_stream.py> and Google Cloud PubSub I/O. Chamikara is working on implementing SplittableDoFn as the Python streaming source API so that it will be easy to write new streaming sources.

Python streaming support for other runners like Cloud Dataflow and Flink will be provided through the FnAPI (please contact me if you would be interested in joining the Python Streaming Alpha for Google Cloud Dataflow).

For reference, here are some of the relevant PRs checked in for this effort:

https://github.com/apache/beam/pull/3318
https://github.com/apache/beam/pull/3362
https://github.com/apache/beam/pull/3370
https://github.com/apache/beam/pull/3405
https://github.com/apache/beam/pull/3409
https://github.com/apache/beam/pull/3440
https://github.com/apache/beam/pull/3444
https://github.com/apache/beam/pull/3499

Best,
Charles
