[
https://issues.apache.org/jira/browse/BEAM-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308622#comment-15308622
]
ASF GitHub Bot commented on BEAM-155:
-------------------------------------
GitHub user tgroh opened a pull request:
https://github.com/apache/incubator-beam/pull/406
[BEAM-155] Use custom Assertions in Spark Streaming Tests
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
- [ ] Make sure the PR title is formatted like:
`[BEAM-<Jira issue #>] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
number, if there is one.
- [ ] If this contribution is large, please file an Apache
[Individual Contributor License
Agreement](https://www.apache.org/licenses/icla.txt).
---
Spark Streaming Side Inputs behave differently than the Beam Model. As
the underlying implementation of PAssert is based on side inputs, this
means that Streaming Spark Tests that use SideInputs as the actuals are
non-portable.
More specifically, this enables pipeline-construction time enforcement
that a Preexisting Side Input must be in a window compatible with the
Global Window (otherwise the side input WindowFn should throw an
exception, e.g. in
[PartitioningWindowFn](https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/PartitioningWindowFn.java#L46)
Modify FlattenStreamingTest, KafkaStreamingTest, and
SimpleStreamingWordCountTest to group all of the contents of the
asserted PCollection into a single key, and assert the contents of that
concatenation, rather than doing so via PAssert and Side Input.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tgroh/incubator-beam
spark_custom_streaming_assertions
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/406.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #406
----
commit ccf3412b06862dddaabbd0869aa0aefca3c77156
Author: Thomas Groh <[email protected]>
Date: 2016-05-31T20:27:43Z
Use custom Assertions in Spark Streaming Tests
Spark Streaming Side Inputs behave differently than the Beam Model. As
the underlying implementation of PAssert is based on side inputs, this
means that Streaming Spark Tests that use SideInputs as the actuals are
non-portable.
Modify FlattenStreamingTest, KafkaStreamingTest, and
SimpleStreamingWordCountTest to group all of the contents of the
asserted PCollection into a single key, and assert the contents of that
concatenation, rather than doing so via PAssert and Side Input.
----
> Support asserting the contents of windows and panes in PAssert
> --------------------------------------------------------------
>
> Key: BEAM-155
> URL: https://issues.apache.org/jira/browse/BEAM-155
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Thomas Groh
> Assignee: Thomas Groh
>
> This consists of reifying the output windows and panes, and running asserts
> per-window about the contents of panes.
> This includes aggregated matching and final pane matching, e.g.
> PAssert.that(output).byOnTimePane().hasOutputElements(foo, bar);
> // For discarding mode - could have emitted (say) [spam, eggs], [spam], [],
> [sausage], []
> PAssert.that(output).byFinalPane().hasOutputElements(spam, eggs, sausage,
> spam);
> // For accumulating mode without late data
> PAssert.that(output).finalPane().containsInAnyOrder(spam, eggs, sausage,
> spam);
> // For accumulating mode with late data
> PAssert.that(output).finalPane().containsInAnyOrder(foo,
> bar).mayAlsoContain(baz, rab);
> See also:
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)