[ 
https://issues.apache.org/jira/browse/BEAM-11088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke updated BEAM-11088:
--------------------------------
    Description: 
TestStream is a Test-Only primitive to help in verifying streaming SDK and 
Runner semantics.

It's a well-known URN in the Beam Pipeline proto that Runners (like the Python 
Portable Runner) can implement and SDKs can configure to achieve the desired 
behavior.

This task is to implement TestStream in the Go SDK so it can be added to SDK 
and user test pipelines to simplify validating various SDK semantics. 

This task is *not* to implement and support it in the Go Direct Runner at this 
time, though a separate Jira can be filed for that. At minimum, the direct 
runner is expected to fail with a clear message saying it doesn't support the 
TestStream primitive. (Further improvement of the Direct Runner is tracked in 
https://issues.apache.org/jira/browse/BEAM-11076)
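
A minimal sketch of what "fail clearly" could look like, assuming a runner can inspect transform URNs before executing anything. The transform type, the checkSupported helper, and the URN string are illustrative assumptions, not the Direct Runner's actual code; confirm the URN against the TEST_STREAM entry in beam_runner_api.proto.

```go
package main

import "fmt"

// Assumed URN for the TestStream primitive; verify against the proto.
const urnTestStream = "beam:transform:teststream:v1"

// transform is a hypothetical stand-in for one pipeline transform.
type transform struct {
	Name string
	URN  string
}

// checkSupported returns a descriptive error for the first transform whose
// URN the runner cannot execute, so the failure happens before any work
// starts and names the offending primitive.
func checkSupported(runner string, transforms []transform, unsupported map[string]bool) error {
	for _, t := range transforms {
		if unsupported[t.URN] {
			return fmt.Errorf("%s does not support primitive %q (transform %q)", runner, t.URN, t.Name)
		}
	}
	return nil
}

func main() {
	pipeline := []transform{
		{Name: "teststream/events", URN: urnTestStream},
		{Name: "count", URN: "beam:transform:pardo:v1"},
	}
	unsupported := map[string]bool{urnTestStream: true}
	if err := checkSupported("direct runner", pipeline, unsupported); err != nil {
		fmt.Println("pipeline rejected:", err)
	}
}
```

Rejecting the pipeline up front, rather than failing partway through execution, keeps the error message tied to the unsupported primitive instead of a downstream symptom.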

Implementing TestStream also allows large sections of tests already implemented in 
[Python|https://github.com/apache/beam/search?l=Python&q=TestStream] and 
[Java|https://github.com/apache/beam/search?l=Java&q=TestStream] to be 
replicated in Go, further improving confidence in the SDK implementation. Care 
needs to be taken here, though: many of those tests also validate the 
"runner" implementations of TestStream itself, as Python and Java already have 
runner implementations.

To implement the Go SDK side of TestStream, look to the following:
 [The original TestStream blog 
post|https://beam.apache.org/blog/2016/10/20/test-stream.html] describing its 
overall purpose.

[TestStreamPayload|https://github.com/apache/beam/blob/43c97d811b9ec85116dbde49cde3f0718c2498ce/model/pipeline/src/main/proto/beam_runner_api.proto#L568]
 message for configuring test streams. 

[TEST_STREAM 
URN|https://github.com/apache/beam/blob/43c97d811b9ec85116dbde49cde3f0718c2498ce/model/pipeline/src/main/proto/beam_runner_api.proto#L266]
 for adding the TestStream primitive to the pipeline graph.
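
For orientation, here is a rough plain-Go picture of what the payload has to carry: a coder for the elements plus an ordered script of events. The field names and the three event kinds (elements, watermark advance, processing-time advance) are written from memory of the proto and should be treated as assumptions; the linked TestStreamPayload message is the authority. None of these types are the generated proto types.

```go
package main

import "fmt"

// Assumed URN string; confirm against the TEST_STREAM entry linked above.
const urnTestStream = "beam:transform:teststream:v1"

// timestampedElement is one already-encoded element and its event time.
type timestampedElement struct {
	Encoded   []byte
	Timestamp int64 // milliseconds
}

// event is one step of the scripted stream. Exactly one field is expected
// to be set, mimicking a proto oneof.
type event struct {
	Elements              []timestampedElement
	NewWatermark          *int64
	AdvanceProcessingTime *int64
}

// payload is the whole script: the element coder plus the ordered events.
type payload struct {
	CoderID string
	Events  []event
}

// kind reports which of the three event variants is populated.
func (e event) kind() string {
	switch {
	case e.NewWatermark != nil:
		return "watermark"
	case e.AdvanceProcessingTime != nil:
		return "processing_time"
	default:
		return "elements"
	}
}

func main() {
	wm := int64(1000)
	p := payload{
		CoderID: "c0",
		Events: []event{
			{Elements: []timestampedElement{{Encoded: []byte("a"), Timestamp: 10}}},
			{NewWatermark: &wm},
		},
	}
	for _, e := range p.Events {
		fmt.Println(urnTestStream, p.CoderID, e.kind())
	}
}
```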

Look at how [Reshuffle was added to the Go 
SDK.|https://github.com/apache/beam/pull/11197/files] While Reshuffle isn't a 
well-known URN like TestStream is (and thus was more work), the same code ends up 
needing to be modified to allow user-side specification.
 * A [new TestStream 
edge|https://github.com/apache/beam/pull/11197/files#diff-ad3762b94450801cd205383673b76f0cc7c7aebd5f55da4e1bd61aac6512fc2e]
 needs to be added to the core/graph package. 
 * Handling translation of that edge into the TestStreamPayload needs to happen 
in the [runtime/graphx 
package|https://github.com/apache/beam/pull/11197/files#diff-ba723a9194fc9c7dd64d4b22c76b83c55c92c006cfa2dd1c7e4072c5650f71b3].
 * A user-facing [beam package entry 
point|https://github.com/apache/beam/pull/11197/files#diff-ba19ec6c6322550c7ee60adec55cf212ca9d40fcc8909be98231427da32e4710]
 needs to be added with documentation so users can add TestStream to their 
pipelines. 
 * At that point, [integration 
tests|https://github.com/apache/beam/tree/master/sdks/go/test/integration/primitives]
 can begin to be added using the primitive, for supported runners. 
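
The layering in the list above can be sketched in one file, assuming three stand-in layers: an edge that only records user intent (core/graph), a translation step that alone knows the wire format (runtime/graphx), and a small entry point (beam). All type and function names here are hypothetical, JSON stands in for marshalling the TestStreamPayload proto, and the URN string is an assumption to check against the proto.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const urnTestStream = "beam:transform:teststream:v1" // assumed URN string

// Stand-in for core/graph: the edge records what the user asked for,
// using SDK-native values only.
type opcode string

const opTestStream opcode = "TestStream"

type edge struct {
	Op     opcode
	Config map[string]interface{}
}

// Stand-in for runtime/graphx: the only layer that produces the URN and
// serialized payload.
type wireTransform struct {
	URN     string
	Payload []byte
}

func translate(e edge) (wireTransform, error) {
	switch e.Op {
	case opTestStream:
		// JSON is a placeholder for encoding a TestStreamPayload proto.
		b, err := json.Marshal(e.Config)
		if err != nil {
			return wireTransform{}, err
		}
		return wireTransform{URN: urnTestStream, Payload: b}, nil
	default:
		return wireTransform{}, fmt.Errorf("unexpected opcode %q", e.Op)
	}
}

// Stand-in for the beam package: the documented user entry point, which
// only builds the edge.
func testStream(config map[string]interface{}) edge {
	return edge{Op: opTestStream, Config: config}
}

func main() {
	wt, err := translate(testStream(map[string]interface{}{"coder": "c0"}))
	if err != nil {
		panic(err)
	}
	fmt.Println(wt.URN, string(wt.Payload))
}
```

Keeping the URN and serialization inside translate means the edge and the entry point never touch the wire format.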

 

This work also includes convenience helper functions or libraries that make 
TestStream simple for end users to use. This may include a new user-facing 
package with various options, depended on by the beam, graph, and graphx 
packages for configuration. 
Like the rest of the SDK implementation, it's strongly recommended that Beam 
Pipeline Protos be handled in the graphx package to avoid coupling too tightly 
to a specific implementation of beam, should that change in the future.
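
One possible shape for such a helper, sketched under assumptions: a small configuration builder that stores only plain Go values, leaving conversion to pipeline protos to graphx. The config type and its method names are hypothetical. The check that the watermark never moves backwards is an assumed validation, and math.MaxInt64 merely stands in for whatever maximum timestamp the SDK defines.

```go
package main

import (
	"fmt"
	"math"
)

// event is a plain-Go record of one scripted step; no protos involved.
type event struct {
	Kind      string        // "elements" or "watermark"
	Timestamp int64         // event time or new watermark, in milliseconds
	Elements  []interface{} // only set for "elements"
}

// config is a hypothetical user-facing builder for a TestStream script.
type config struct {
	events    []event
	watermark int64
}

func newConfig() *config { return &config{watermark: math.MinInt64} }

// addElements queues elements that all share one event timestamp.
func (c *config) addElements(timestamp int64, elements ...interface{}) {
	c.events = append(c.events, event{Kind: "elements", Timestamp: timestamp, Elements: elements})
}

// advanceWatermark queues a watermark advance and rejects any attempt to
// move the watermark backwards (an assumed rule for this sketch).
func (c *config) advanceWatermark(to int64) error {
	if to < c.watermark {
		return fmt.Errorf("watermark must not move backwards: %d < %d", to, c.watermark)
	}
	c.watermark = to
	c.events = append(c.events, event{Kind: "watermark", Timestamp: to})
	return nil
}

// advanceWatermarkToInfinity signals that no more data will arrive.
func (c *config) advanceWatermarkToInfinity() error {
	return c.advanceWatermark(math.MaxInt64) // placeholder for the SDK's max timestamp
}

func main() {
	c := newConfig()
	c.addElements(100, "a", "b")
	if err := c.advanceWatermark(200); err != nil {
		panic(err)
	}
	if err := c.advanceWatermarkToInfinity(); err != nil {
		panic(err)
	}
	for _, e := range c.events {
		fmt.Println(e.Kind, e.Timestamp, len(e.Elements))
	}
}
```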

 

 



> [Go SDK] Implement TestStream primitive
> ---------------------------------------
>
>                 Key: BEAM-11088
>                 URL: https://issues.apache.org/jira/browse/BEAM-11088
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-go
>            Reporter: Robert Burke
>            Priority: P3



