Hello everyone:

As of GitHub pull request #446 (
https://github.com/apache/incubator-beam/pull/446), the
InProcessPipelineRunner will replace the DirectPipelineRunner in the Core
SDK as the default runner (finishing BEAM-22). We will then rename the
InProcessPipelineRunner to DirectRunner (part of BEAM-22 and BEAM-234).
Once that is done, we can refactor the core runner classes out of the SDK
proper and into a runner-specific module.

This move has a few consequences for users, one for connector writers, and
one for core SDK developers, which I'll cover in order:

For users:
    * The biggest immediate change is that Pipelines reading from unbounded
sources will work out of the box!
    * Users who want to run using the DirectRunner will need to declare a
dependency on the Maven artifact org.apache.beam:direct-runner in addition
to org.apache.beam:java-sdk-all.
    * After the rename goes through, existing command line invocations that
specify '--runner=DirectPipelineRunner' will need to drop the word
'Pipeline', i.e. specify '--runner=DirectRunner'.
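
As a rough sketch of the dependency change described above, a user's pom.xml might declare both artifacts along these lines. The coordinates are the ones named in this email. The beam.version property is a placeholder assumed here for illustration, not something PR #446 defines, so substitute whatever version you build against.

```xml
<dependencies>
  <!-- The SDK itself, as before -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>java-sdk-all</artifactId>
    <!-- beam.version is an assumed placeholder property -->
    <version>${beam.version}</version>
  </dependency>
  <!-- New: the direct runner now lives in its own artifact -->
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>direct-runner</artifactId>
    <version>${beam.version}</version>
  </dependency>
</dependencies>
```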

For connector/extension writers:
    * Each connector should declare a test-scoped dependency on the direct
runner. There are some examples in PR #446 (e.g. the sdks/java/io/... POM
files).
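
For a connector module, the test-scoped declaration might look roughly like the fragment below. This is only a sketch using the artifact coordinates mentioned earlier in this email and an assumed beam.version placeholder property; the POM files touched in PR #446 are the authoritative examples.

```xml
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>direct-runner</artifactId>
  <!-- beam.version is an assumed placeholder property -->
  <version>${beam.version}</version>
  <!-- test scope: available when running tests, not exposed to users of the connector -->
  <scope>test</scope>
</dependency>
```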

For developers working in the core SDK (incubator-beam/sdks/java/core):
    * Some tests in the Core SDK (those that depend on the ability to call
Pipeline#run()) will no longer work in an IDE by default, because no runner
capable of running them exists in that module (and none can be introduced
without a circular dependency). Tests that need a runner should be marked
as members of the category 'NeedsRunner'; there are plenty of existing
examples. To execute these tests, configure the test execution to use the
classpath of the runners/direct-java module, or add the runners/direct-java
module as a classpath entry.

Thanks,

Thomas
