[ 
https://issues.apache.org/jira/browse/BEAM-735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575453#comment-15575453
 ] 

ASF GitHub Bot commented on BEAM-735:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-beam/pull/1073


> PAssertStreaming should make sure the assertion happened.
> ---------------------------------------------------------
>
>                 Key: BEAM-735
>                 URL: https://issues.apache.org/jira/browse/BEAM-735
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> The Spark runner currently runs PAsserts via `PAssertStreaming` which groups 
> into a single key and asserts the values on the worker (part of the "Lambda" 
> in the Spark lingo).
> This could be a problem since Spark won't run if there's nothing to process - 
> so that if for some reason the input is missed, say reading from Kafka latest 
> or simply an empty topic, the assertion will be skipped and so we'll never 
> fail (we would like to fail if there was no input, while we expected one).
> This might change once Spark provide a better support for the Beam model in 
> streaming, but until then, it's best that our tests will consider this case 
> as well.
> I'll add an aggregator and increment for assertion, at the end I'll make sure 
> the aggregator is not 0, so that at least one assertion took place (if for 
> some reason Spark kept on for a couple of more intervals it might execute the 
> same assertion more then once, if the input is repeated).  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to