There are a few tests that still use side inputs to implement the ; the big ones are anything that asserts on a singleton; these are `PAssert.thatSingleton`, `PAssert.thatMap` and `PAssert.thatMultimap`. All of the other asserts are implemented via GroupByKey, and these are by far the most common form of assertions in the Core SDK; however, PAssertTest will continue to fail without side input support.
On Fri, Jun 24, 2016 at 5:58 PM, Thomas Weise <[email protected]> wrote: > Thanks! I will try that out. > > Regarding the View translation, it still fails with HEAD of master (just > did a pull): > > > ------------------------------------------------------------------------------- > Test set: org.apache.beam.sdk.testing.PAssertTest > > ------------------------------------------------------------------------------- > Tests run: 10, Failures: 1, Errors: 3, Skipped: 0, Time elapsed: 29.566 sec > <<< FAILURE! - in org.apache.beam.sdk.testing.PAssertTest > testIsEqualTo(org.apache.beam.sdk.testing.PAssertTest) Time elapsed: 1.518 > sec <<< ERROR! > java.lang.IllegalStateException: no translator registered for > View.CreatePCollectionView > at > > org.apache.beam.runners.apex.ApexPipelineTranslator.visitPrimitiveTransform(ApexPipelineTranslator.java:98) > > > On Fri, Jun 24, 2016 at 5:28 PM, Lukasz Cwik <[email protected]> > wrote: > > > Below I outline a different approach than the DirectRunner which didn't > > require an override for Create since it knows that there was no data > > remaining and can correctly shut the pipeline down by pushing the > watermark > > all the way through the pipeline. This is a superior approach but I > believe > > is more difficult to get right. > > > > PAssert emits an aggregator with a specific name which states that the > > PAssert succeeded or failed: > > > > > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java#L110 > > > > The test Dataflow runner counts how many PAsserts were applied and then > > polls itself every 10 seconds checking to see if the aggregator has any > > failures or all the successes for streaming pipelines. > > Polling logic: > > > > > https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java#L114 > > Check logic: > > > > > https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java#L177 > > > > As for overriding a transform, the runner is currently invoked during > > application of a transform and is able to inject/replace/modify the > > transform that was being applied. The test Dataflow runner uses this a > > little bit to do the PAssert counting while the normal Dataflow runner > does > > this a lot for its own specific needs. > > > > Finally, I believe Ken just made some changes which removed the > requirement > > to support View.YYY and replaced it with GroupByKey so the no translator > > registered for View... may go away. > > > > > > On Fri, Jun 24, 2016 at 4:52 PM, Thomas Weise <[email protected]> > > wrote: > > > > > Kenneth and Lukasz, thanks for the direction. > > > > > > Is there any information about other requirements to run the cross > runner > > > tests and hints to troubleshoot. On first attempt they mosty fail due > to > > > missing translator: > > > > > > PAssertTest.testIsEqualTo:219 ▒ IllegalState no translator registered > for > > > View... > > > > > > Also, for run() to be synchronous or wait, there needs to be an exit > > > condition. I know how to solve this for the Apex runner specific tests. > > But > > > for the cross runner tests, what is the recommended way to do this? > > Kenneth > > > mentioned that Create could signal end of stream. Should I look to > > override > > > the Create transformation to configure the behavior ((just for this > test > > > suite) and if so, is there an example how to do this cleanly? > > > > > > Thanks, > > > Thomas > > > > > > > > > > > > > > > On Tue, Jun 21, 2016 at 7:32 PM, Kenneth Knowles > <[email protected] > > > > > > wrote: > > > > > > > To expand on the RunnableOnService test suggestion, here [1] is the > > > commit > > > > from the Spark runner. You will get a lot more information if you can > > > port > > > > this for your runner than you would from an example end-to-end test. > > > > > > > > Note that this just pulls in the tests from the core SDK. For testing > > > with > > > > other I/O connectors, you'll add them to the dependenciesToScan. > > > > > > > > [1] > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/commit/4254749bf103c4bb6f68e316768c0aa46d9f7df0 > > > > > > > > On Tue, Jun 21, 2016 at 4:06 PM, Lukasz Cwik > <[email protected] > > > > > > > wrote: > > > > > > > > > There is a start to getting more e2e like integration tests going > > with > > > > the > > > > > first being WordCount. > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java > > > > > You could add WindowedWordCountIT.java which will be launched with > > the > > > > > proper configuration of the Apex runner pom.xml > > > > > > > > > > I would also suggest that you take a look at the @RunnableOnService > > > tests > > > > > which are a comprehensive validation suite of ~200ish tests that > test > > > > > everything from triggers to side inputs. It requires some pom > changes > > > and > > > > > creating a test runner which is able to setup an apex environment. > > > > > > > > > > Furthermore, we could really use an addition to the Beam wiki about > > > > testing > > > > > and how runners write tests/execute tests/... > > > > > > > > > > Some relevant links: > > > > > Older presentation about getting cross runner tests going: > > > > > > > > > > > > > > > > > > > > https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit#slide=id.g127d614316_19_39 > > > > > > > > > > Examples of test runners: > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/TestFlinkRunner.java > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/TestSparkRunner.java > > > > > > > > > > Section of pom dedicated to enabling runnable on service tests: > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/runners/spark/pom.xml#L54 > > > > > > > > > > On Tue, Jun 21, 2016 at 2:21 PM, Thomas Weise < > > [email protected]> > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > As part of the Apex runner, we have a few unit tests for the > > > supported > > > > > > transformations. Next, I would like to test the WindowedWordCount > > > > > example. > > > > > > > > > > > > Is there an example of configuring this pipeline for another > > runner? > > > Is > > > > > it > > > > > > recommended to supply such configuration as a JUnit test? What is > > the > > > > > > general (repeatable?) approach to exercise different runners with > > the > > > > set > > > > > > of example pipelines? > > > > > > > > > > > > Thanks, > > > > > > Thomas > > > > > > > > > > > > > > > > > > > > >
