Robert Burke created BEAM-11088:
-----------------------------------
Summary: [Go SDK] Implement TestStream primitive
Key: BEAM-11088
URL: https://issues.apache.org/jira/browse/BEAM-11088
Project: Beam
Issue Type: New Feature
Components: sdk-go
Reporter: Robert Burke
TestStream is a Test-Only primitive to help in verifying streaming SDK and
Runner semantics.
It's a known URN in the Beam Pipeline proto which Runners (like the Python
Portable Runner) can implement, and which SDKs can configure to achieve the
desired behavior.
This task is to implement Test Stream so it can be added to SDK and user test
pipelines to simplify validating various SDK semantics.
This task is *not* to implement and support it in the Go Direct Runner at this
time, though a separate Jira can be filed for that. At a minimum, the direct
runner is expected to fail with a clear message saying it doesn't support the
TestStream primitive.
(Further improving the Direct Runner is in issue
https://issues.apache.org/jira/browse/BEAM-11076)
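As an illustration of the "fail clearly" expectation, here is a minimal sketch of an up-front check a runner could make. The {{transform}} type and {{checkSupported}} function are hypothetical and are not part of the Go Direct Runner; the URN strings are assumed to match the values in beam_runner_api.proto and should be verified against it.

```go
package main

import "fmt"

// urnTestStream is assumed to match the TEST_STREAM URN in
// beam_runner_api.proto; verify against the proto before relying on it.
const urnTestStream = "beam:transform:teststream:v1"

// transform is a hypothetical stand-in for a transform in the pipeline graph.
type transform struct {
	Name string
	URN  string
}

// unsupported maps URNs this hypothetical runner rejects to a readable name.
var unsupported = map[string]string{
	urnTestStream: "TestStream",
}

// checkSupported returns a descriptive error for the first transform
// the runner cannot execute, or nil if every transform is acceptable.
func checkSupported(ts []transform) error {
	for _, t := range ts {
		if name, ok := unsupported[t.URN]; ok {
			return fmt.Errorf("direct runner does not support the %s primitive (transform %q, urn %s)", name, t.Name, t.URN)
		}
	}
	return nil
}

func main() {
	pipeline := []transform{
		{Name: "Read", URN: "beam:transform:impulse:v1"}, // illustrative URN
		{Name: "Events", URN: urnTestStream},
	}
	if err := checkSupported(pipeline); err != nil {
		fmt.Println("pipeline rejected:", err)
	}
}
```

Rejecting the pipeline before execution starts, with the primitive named in the error, keeps the failure easy to tell apart from a bug in the user's test.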
Implementing this also allows large sections of tests already implemented in
[Python|https://github.com/apache/beam/search?l=Python&q=TestStream] and
[Java|https://github.com/apache/beam/search?l=Java&q=TestStream] to be
replicated in Go, further improving confidence in the SDK implementation. Care
needs to be taken here, though: many of those tests also validate the
"runner" implementations of TestStream itself, as Python and Java already have
runner implementations.
To implement the Go SDK side of Test Stream, look to the following:
[The original TestStream Blog
post|https://beam.apache.org/blog/2016/10/20/test-stream.html] describing its
overall purpose.
[TestStreamPayload|https://github.com/apache/beam/blob/43c97d811b9ec85116dbde49cde3f0718c2498ce/model/pipeline/src/main/proto/beam_runner_api.proto#L568]
message for configuring test streams.
[TEST_STREAM
Urn|https://github.com/apache/beam/blob/43c97d811b9ec85116dbde49cde3f0718c2498ce/model/pipeline/src/main/proto/beam_runner_api.proto#L266]
for adding the test stream primitive to the pipeline graph.
Look at how [Reshuffle was added to the Go
SDK.|https://github.com/apache/beam/pull/11197/files] While Reshuffle isn't a
well-known URN like TestStream is (and thus was more work), the same code ends
up needing to be modified to allow user-side specification.
* A [new TestStream
edge|https://github.com/apache/beam/pull/11197/files#diff-ad3762b94450801cd205383673b76f0cc7c7aebd5f55da4e1bd61aac6512fc2e]
needs to be added to the core/graph package.
* Translation of that edge into the TestStreamPayload needs to be handled
in the [runtime/graphx
package|https://github.com/apache/beam/pull/11197/files#diff-ba723a9194fc9c7dd64d4b22c76b83c55c92c006cfa2dd1c7e4072c5650f71b3].
* A user-facing [beam package entry
point|https://github.com/apache/beam/pull/11197/files#diff-ba19ec6c6322550c7ee60adec55cf212ca9d40fcc8909be98231427da32e4710]
needs to be added with documentation so users can add TestStream to their
pipelines.
* At that point, [integration
tests|https://github.com/apache/beam/tree/master/sdks/go/test/integration/primitives]
can begin to be added using the primitive, for supported runners.
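To make the steps above more concrete, here is a rough sketch of how a user-facing configuration could accumulate events and hand them to a translation step. Every type and function name here ({{Config}}, {{NewConfig}}, {{AddElements}}, {{AdvanceWatermark}}, {{AdvanceProcessingTime}}, {{Payload}}) is hypothetical. The structs are local stand-ins that only loosely mirror the shape of TestStreamPayload (a coder ID plus an ordered list of element, watermark, and processing-time events); they are not the generated proto types and not an existing Go SDK API.

```go
package main

import "fmt"

// eventKind distinguishes the three kinds of events a test stream carries.
type eventKind int

const (
	elementEvent eventKind = iota
	watermarkEvent
	processingTimeEvent
)

// event is a hypothetical stand-in for one TestStreamPayload event.
type event struct {
	Kind     eventKind
	Elements [][]byte // encoded elements; only set for elementEvent
	Millis   int64    // element timestamp, new watermark, or advance duration
}

// payload is a hypothetical stand-in for the translated TestStreamPayload.
type payload struct {
	CoderID string
	Events  []event
}

// Config is a hypothetical user-facing builder that records events in order.
type Config struct {
	coderID string
	events  []event
}

// NewConfig starts an empty configuration for elements encoded with coderID.
func NewConfig(coderID string) *Config { return &Config{coderID: coderID} }

// AddElements records already-encoded elements sharing one event timestamp.
func (c *Config) AddElements(timestampMillis int64, encoded ...[]byte) *Config {
	c.events = append(c.events, event{Kind: elementEvent, Elements: encoded, Millis: timestampMillis})
	return c
}

// AdvanceWatermark records a watermark advance to the given timestamp.
func (c *Config) AdvanceWatermark(millis int64) *Config {
	c.events = append(c.events, event{Kind: watermarkEvent, Millis: millis})
	return c
}

// AdvanceProcessingTime records a processing-time advance by the given duration.
func (c *Config) AdvanceProcessingTime(millis int64) *Config {
	c.events = append(c.events, event{Kind: processingTimeEvent, Millis: millis})
	return c
}

// Payload returns what a graphx-style translation step might emit for the
// TestStream edge. The event slice is copied so later edits don't leak in.
func (c *Config) Payload() payload {
	return payload{CoderID: c.coderID, Events: append([]event(nil), c.events...)}
}

func main() {
	p := NewConfig("c0").
		AddElements(100, []byte("a"), []byte("b")).
		AdvanceWatermark(200).
		AdvanceProcessingTime(1000).
		Payload()
	fmt.Println(len(p.Events), "events for coder", p.CoderID)
}
```

Keeping the events as an ordered list matters because a runner is expected to replay them in sequence; the real entry point would also need to encode elements with the coder registered for the output PCollection.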