Thanks! I will try that out. Regarding the View translation, it still fails with HEAD of master (just did a pull):
-------------------------------------------------------------------------------
Test set: org.apache.beam.sdk.testing.PAssertTest
-------------------------------------------------------------------------------
Tests run: 10, Failures: 1, Errors: 3, Skipped: 0, Time elapsed: 29.566 sec <<< FAILURE! - in org.apache.beam.sdk.testing.PAssertTest
testIsEqualTo(org.apache.beam.sdk.testing.PAssertTest)  Time elapsed: 1.518 sec  <<< ERROR!
java.lang.IllegalStateException: no translator registered for View.CreatePCollectionView
    at org.apache.beam.runners.apex.ApexPipelineTranslator.visitPrimitiveTransform(ApexPipelineTranslator.java:98)

On Fri, Jun 24, 2016 at 5:28 PM, Lukasz Cwik <[email protected]> wrote:

> Below I outline a different approach than the DirectRunner which didn't
> require an override for Create since it knows that there was no data
> remaining and can correctly shut the pipeline down by pushing the watermark
> all the way through the pipeline. This is a superior approach but I believe
> is more difficult to get right.
>
> PAssert emits an aggregator with a specific name which states that the
> PAssert succeeded or failed:
>
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java#L110
>
> The test Dataflow runner counts how many PAsserts were applied and then
> polls itself every 10 seconds checking to see if the aggregator has any
> failures or all the successes for streaming pipelines.
> Polling logic:
>
> https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java#L114
> Check logic:
>
> https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java#L177
>
> As for overriding a transform, the runner is currently invoked during
> application of a transform and is able to inject/replace/modify the
> transform that was being applied. The test Dataflow runner uses this a
> little bit to do the PAssert counting while the normal Dataflow runner does
> this a lot for its own specific needs.
>
> Finally, I believe Ken just made some changes which removed the requirement
> to support View.YYY and replaced it with GroupByKey so the no translator
> registered for View... may go away.
>
>
> On Fri, Jun 24, 2016 at 4:52 PM, Thomas Weise <[email protected]>
> wrote:
>
> > Kenneth and Lukasz, thanks for the direction.
> >
> > Is there any information about other requirements to run the cross runner
> > tests and hints to troubleshoot? On first attempt they mostly fail due to
> > a missing translator:
> >
> > PAssertTest.testIsEqualTo:219 » IllegalState no translator registered for
> > View...
> >
> > Also, for run() to be synchronous or wait, there needs to be an exit
> > condition. I know how to solve this for the Apex runner specific tests. But
> > for the cross runner tests, what is the recommended way to do this? Kenneth
> > mentioned that Create could signal end of stream. Should I look to override
> > the Create transformation to configure the behavior (just for this test
> > suite) and if so, is there an example how to do this cleanly?
> >
> > Thanks,
> > Thomas
> >
> >
> >
> >
> > On Tue, Jun 21, 2016 at 7:32 PM, Kenneth Knowles <[email protected]>
> > wrote:
> >
> > > To expand on the RunnableOnService test suggestion, here [1] is the commit
> > > from the Spark runner. You will get a lot more information if you can port
> > > this for your runner than you would from an example end-to-end test.
> > >
> > > Note that this just pulls in the tests from the core SDK. For testing with
> > > other I/O connectors, you'll add them to the dependenciesToScan.
> > >
> > > [1]
> > > https://github.com/apache/incubator-beam/commit/4254749bf103c4bb6f68e316768c0aa46d9f7df0
> > >
> > > On Tue, Jun 21, 2016 at 4:06 PM, Lukasz Cwik <[email protected]>
> > > wrote:
> > >
> > > > There is a start to getting more e2e like integration tests going with the
> > > > first being WordCount.
> > > >
> > > > https://github.com/apache/incubator-beam/blob/master/examples/java/src/test/java/org/apache/beam/examples/WordCountIT.java
> > > > You could add WindowedWordCountIT.java which will be launched with the
> > > > proper configuration of the Apex runner pom.xml
> > > >
> > > > I would also suggest that you take a look at the @RunnableOnService tests
> > > > which are a comprehensive validation suite of ~200ish tests that test
> > > > everything from triggers to side inputs. It requires some pom changes and
> > > > creating a test runner which is able to set up an Apex environment.
> > > >
> > > > Furthermore, we could really use an addition to the Beam wiki about testing
> > > > and how runners write tests/execute tests/...
> > > >
> > > > Some relevant links:
> > > > Older presentation about getting cross runner tests going:
> > > >
> > > > https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit#slide=id.g127d614316_19_39
> > > >
> > > > Examples of test runners:
> > > >
> > > > https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/testing/TestDataflowRunner.java
> > > >
> > > > https://github.com/apache/incubator-beam/blob/master/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/TestFlinkRunner.java
> > > >
> > > > https://github.com/apache/incubator-beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/TestSparkRunner.java
> > > >
> > > > Section of pom dedicated to enabling runnable on service tests:
> > > >
> > > > https://github.com/apache/incubator-beam/blob/master/runners/spark/pom.xml#L54
> > > >
> > > > On Tue, Jun 21, 2016 at 2:21 PM, Thomas Weise <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > As part of the Apex runner, we have a few unit tests for the supported
> > > > > transformations. Next, I would like to test the WindowedWordCount example.
> > > > >
> > > > > Is there an example of configuring this pipeline for another runner? Is it
> > > > > recommended to supply such configuration as a JUnit test? What is the
> > > > > general (repeatable?) approach to exercise different runners with the set
> > > > > of example pipelines?
> > > > >
> > > > > Thanks,
> > > > > Thomas
> > > > >
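For reference, the exit condition described in the quoted reply from Lukasz (count the PAsserts that were applied, then poll until a failure shows up or all successes are in) could be sketched roughly as follows. This is only an illustration under stated assumptions: AssertionPoller, check and pollUntilDone are hypothetical names, not Beam or Dataflow APIs, and the success/failure counts are passed in as plain suppliers instead of being read from a real aggregator.

```java
import java.util.Optional;
import java.util.function.IntSupplier;

/** Hypothetical helper, not part of Beam: decides when a test pipeline run can stop. */
public class AssertionPoller {

  /**
   * Returns Optional.of(false) if any assertion failed, Optional.of(true) once
   * the number of successes reaches the number of applied assertions, and
   * Optional.empty() while the outcome is still undecided.
   */
  static Optional<Boolean> check(int expectedAsserts, int successes, int failures) {
    if (failures > 0) {
      return Optional.of(false);
    }
    if (successes >= expectedAsserts) {
      return Optional.of(true);
    }
    return Optional.empty();
  }

  /**
   * Polls the two counters until check() reaches a decision or maxPolls is
   * exhausted. The suppliers stand in for whatever a runner uses to read the
   * PAssert success/failure aggregators (an assumption of this sketch).
   */
  static boolean pollUntilDone(
      int expectedAsserts,
      IntSupplier successes,
      IntSupplier failures,
      long pollIntervalMillis,
      int maxPolls) {
    for (int i = 0; i < maxPolls; i++) {
      Optional<Boolean> result =
          check(expectedAsserts, successes.getAsInt(), failures.getAsInt());
      if (result.isPresent()) {
        return result.get();
      }
      try {
        Thread.sleep(pollIntervalMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up rather than poll forever.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while polling", e);
      }
    }
    throw new IllegalStateException(
        "assertions did not complete within " + maxPolls + " polls");
  }
}
```

A test runner for Apex could call something like pollUntilDone from run() with a 10 second interval to get the synchronous behaviour asked about above. Failing fast on the first failure keeps a broken streaming test from waiting for successes that will never arrive.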
