[ https://issues.apache.org/jira/browse/BEAM-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amit Sela updated BEAM-853:
---------------------------
    Comment: was deleted

(was: Streaming tests require forcing streaming mode on {{BoundedFromUnbounded}}, and {{PAssert}} requires proper {{GroupByKey}} implementation, so those two go together.)

> Force streaming execution on batch pipelines for testing.
> ---------------------------------------------------------
>
>                 Key: BEAM-853
>                 URL: https://issues.apache.org/jira/browse/BEAM-853
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> Beam's {{RunnableOnService}} tests for runners are written with bounded reads
> so the SparkRunner should force a streaming pipeline on an "all-batch"
> pipeline.
> Currently the runner will decide the translation (batch/streaming) according
> to the input PCollection boundedness (in case of input, according to output).
> One way to overcome this would be to override this behaviour in
> {{TestSparkRunner}}.
> Another challenge is the implementation itself - we could clearly read the
> RDD and create a single-RDD-DStream of it (following RDDs will be empty) but
> this will miss the point of testing streaming pipelines both for testing the
> execution of {{Read.Unbounded}} and for testing across-microbatches (tests
> state management).
> Finally, we have to consider how this is going to play nicely with
> {{SplittableDoFn}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
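
The two points in the issue description (overriding the batch/streaming decision, and the limits of a single-RDD-DStream) can be illustrated with a small plain-Java sketch. This is only a model under stated assumptions, not the SparkRunner's code: the class ForcedStreamingSketch, the Mode enum and all three methods are hypothetical names, and ordinary lists stand in for RDDs and micro-batches so that no Beam or Spark API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model only; none of these names exist in Beam or Spark. */
public class ForcedStreamingSketch {

    enum Mode { BATCH, STREAMING }

    /**
     * Default rule modelled here: streaming only when some input is unbounded.
     * The forceStreaming flag stands in for an override a test runner could apply.
     */
    static Mode chooseMode(boolean anyUnboundedInput, boolean forceStreaming) {
        return (anyUnboundedInput || forceStreaming) ? Mode.STREAMING : Mode.BATCH;
    }

    /**
     * Models the "single-RDD-DStream" idea: all elements arrive in the first
     * micro-batch and every following micro-batch is empty.
     */
    static <T> List<List<T>> singleBatchThenEmpty(List<T> bounded, int totalBatches) {
        if (totalBatches < 1) {
            throw new IllegalArgumentException("totalBatches must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        batches.add(new ArrayList<>(bounded));
        for (int i = 1; i < totalBatches; i++) {
            batches.add(new ArrayList<>());
        }
        return batches;
    }

    /**
     * Models an alternative: spread the bounded input over several non-empty
     * micro-batches so that state carried between micro-batches is exercised.
     */
    static <T> List<List<T>> spreadAcrossBatches(List<T> bounded, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < bounded.size(); start += batchSize) {
            int end = Math.min(start + batchSize, bounded.size());
            batches.add(new ArrayList<>(bounded.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(chooseMode(false, false));
        System.out.println(chooseMode(false, true));
        System.out.println(singleBatchThenEmpty(input, 3));
        System.out.println(spreadAcrossBatches(input, 2));
    }
}
```

In this model only the second splitting strategy produces more than one non-empty micro-batch, which is the condition the description ties to testing state management across micro-batches.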