[ 
https://issues.apache.org/jira/browse/BEAM-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286900#comment-16286900
 ] 

Kenneth Knowles commented on BEAM-3323:
---------------------------------------

Doesn't TestStream + SDF yield plenty of events?

> Create a generator of finite-but-unbounded PCollection's for integration 
> testing
> --------------------------------------------------------------------------------
>
>                 Key: BEAM-3323
>                 URL: https://issues.apache.org/jira/browse/BEAM-3323
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Kenneth Knowles
>
> Several IOs have features that exhibit nontrivial behavior when writing 
> unbounded PCollections, e.g. WriteFiles with windowed writes, or BigQueryIO. 
> We need to be able to write integration tests for these features.
> Currently we have two ways to generate an unbounded PCollection without 
> reading from a real-world external streaming system such as Pub/Sub or Kafka:
> 1) TestStream, which only works in the direct runner. That is sufficient for 
> some tests but not all: definitely not for large-scale tests or for tests 
> that need to interact with a real instance of the external system (e.g. 
> BigQueryIO). It is also quite verbose to use.
> 2) GenerateSequence.from(0) without a .to(), which returns an infinite amount 
> of data.
> GenerateSequence.from(a).to(b) returns a finite amount of data, but returns 
> it as a bounded PCollection, and doesn't report the watermark.
> I think the right thing to do here, for now, is to give 
> GenerateSequence.from(a).to(b) an option (e.g. ".asUnbounded()") that makes 
> it return an unbounded PCollection, go through UnboundedSource (or 
> potentially via SDF in runners that support it), and track the watermark 
> properly (or via a configurable watermark fn).
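
To make the "finite but unbounded" idea in the quoted description concrete, here is a minimal, Beam-free sketch of the reader behaviour it asks for: emit the elements of [from, to), report a watermark while doing so, and move the watermark to its maximum once the range is exhausted. The class name, the method names and the Long.MAX_VALUE sentinel are assumptions made for illustration only. This is not Beam's UnboundedSource API and not the implementation proposed in the issue, and it assumes the timestamp function is non-decreasing.

```java
import java.util.NoSuchElementException;
import java.util.function.LongUnaryOperator;

/**
 * Hypothetical model of a finite-but-unbounded sequence reader.
 * It does not depend on Beam; all names here are illustrative.
 */
class FiniteUnboundedSequence {
    /** Sentinel meaning "no further elements will ever arrive" (assumed). */
    static final long WATERMARK_MAX = Long.MAX_VALUE;

    private final long to;                       // exclusive upper bound
    private final LongUnaryOperator timestampFn; // element -> event time (millis)
    private long next;                           // next element to emit
    private long current;                        // last emitted element
    private boolean hasCurrent = false;

    FiniteUnboundedSequence(long from, long to, LongUnaryOperator timestampFn) {
        if (from > to) {
            throw new IllegalArgumentException("from must be <= to");
        }
        this.next = from;
        this.to = to;
        this.timestampFn = timestampFn;
    }

    /** Moves to the next element; returns false once the range is exhausted. */
    boolean advance() {
        if (next >= to) {
            hasCurrent = false;
            return false;
        }
        current = next++;
        hasCurrent = true;
        return true;
    }

    /** Returns the element most recently produced by advance(). */
    long getCurrent() {
        if (!hasCurrent) {
            throw new NoSuchElementException("no current element");
        }
        return current;
    }

    /** Returns the event-time timestamp of the current element. */
    long getCurrentTimestamp() {
        return timestampFn.applyAsLong(getCurrent());
    }

    /**
     * Lower bound on the timestamps of elements still to come.
     * Jumps to WATERMARK_MAX when nothing is left, which is what lets
     * downstream windows close even though the collection is "unbounded".
     */
    long getWatermark() {
        return next < to ? timestampFn.applyAsLong(next) : WATERMARK_MAX;
    }

    public static void main(String[] args) {
        // Three elements, one second of event time apart.
        FiniteUnboundedSequence seq =
            new FiniteUnboundedSequence(0, 3, n -> n * 1000L);
        while (seq.advance()) {
            System.out.println("element=" + seq.getCurrent()
                + " ts=" + seq.getCurrentTimestamp()
                + " watermark=" + seq.getWatermark());
        }
        System.out.println("done, watermark at max: "
            + (seq.getWatermark() == WATERMARK_MAX));
    }
}
```

The point of the sketch is the last branch of getWatermark(): a bounded source never needs it, and an infinite source never reaches it, so a test generator that is both finite and unbounded has to handle it explicitly.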



