Awesome work! Please sign me in for a Python Streaming Alpha, we've been looking forward for SDoFn support coming in.
In fact, we would be glad to assist on completing it faster, if there's some grunt work you can hand off. On Wed, Jul 12, 2017 at 10:07 AM, Charles Chen <[email protected]> wrote: > We recently checked in the last few changes needed to support streaming > pipelines on the Beam Python DirectRunner (BEAM-1265 > <https://issues.apache.org/jira/browse/BEAM-1265>). As of HEAD (1-2 weeks > ago) and the 2.1.0 RC, Python SDK users can now write their pipelines in > streaming mode and run them locally on their own machine. > > Check out the streaming wordcount example here (streaming_wordcount.py > <https://github.com/apache/beam/blob/master/sdks/python/ > apache_beam/examples/streaming_wordcount.py>) > and please kick the tires, try out the new functionality and report any > bugs you may encounter. Use the "--streaming" PipelineOption to enable > this new functionality. > > Currently, the I/Os supported are the TestStream > <https://github.com/apache/beam/blob/master/sdks/python/ > apache_beam/testing/test_stream.py> > and > Google Cloud PubSub I/O. Chamikara is working on implementing > SplittableDoFn as the Python streaming source API so that it will be easy > to write new streaming sources. Python streaming support for other runners > like Cloud Dataflow and Flink will be provided through the FnAPI (please > contact me if you would be interested in joining the Python Streaming Alpha > for Google Cloud Dataflow). > > For reference, here are some of the relevant PRs checked in for this > effort: > > https://github.com/apache/beam/pull/3318 > https://github.com/apache/beam/pull/3362 > https://github.com/apache/beam/pull/3370 > https://github.com/apache/beam/pull/3405 > https://github.com/apache/beam/pull/3409 > https://github.com/apache/beam/pull/3440 > https://github.com/apache/beam/pull/3444 > https://github.com/apache/beam/pull/3499 > > Best, > Charles > -- Best regards, Dmitry Demeshchuk.
