I like the use of metrics since it doesn't depend on external resources. I believe a small amount of code could be shared between runners for the PAssert metric verification.
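As a rough illustration, here is a minimal sketch of what that shared verification helper might look like, assuming the runner exposes PAssert's success/failure counters through the metrics API. The exact counter names, namespace, and metrics query calls are illustrative and may differ from the actual SDK; treat this as a sketch rather than a definitive implementation:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.testing.PAssert;

/** Sketch of a runner-agnostic check that every PAssert in a pipeline actually executed. */
public class PAssertMetricVerification {

  public static void verify(Pipeline pipeline, PipelineResult result) {
    // Number of assertions that were constructed in the pipeline.
    int expectedAssertions = PAssert.countAsserts(pipeline);

    // Assumes PAssert increments these counters under the PAssert namespace.
    long successes = queryCounter(result, PAssert.SUCCESS_COUNTER);
    long failures = queryCounter(result, PAssert.FAILURE_COUNTER);

    if (failures > 0) {
      throw new AssertionError(failures + " PAssert assertion(s) failed");
    }
    if (successes < expectedAssertions) {
      throw new AssertionError(
          "Expected " + expectedAssertions + " successful assertions, but only "
              + successes + " were observed; some assertions may not have run");
    }
  }

  private static long queryCounter(PipelineResult result, String counterName) {
    MetricQueryResults metrics =
        result.metrics().queryMetrics(
            MetricsFilter.builder()
                .addNameFilter(MetricNameFilter.named(PAssert.class, counterName))
                .build());
    long total = 0;
    for (MetricResult<Long> counter : metrics.getCounters()) {
      total += counter.getAttempted();
    }
    return total;
  }
}

A runner's test wrapper could call PAssertMetricVerification.verify(pipeline, result) after waiting for the pipeline to finish, instead of each runner re-implementing the counter bookkeeping itself.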
I would say that PAssert by itself and PAssert with metrics are two levels of testing available. For runners that don't support metrics, PAssert alone still gives a signal (albeit a weaker one), and runners that do support metrics will have a stronger signal of execution correctness.

On Fri, Apr 7, 2017 at 11:59 AM, Aviem Zur <[email protected]> wrote:

> Currently, PAssert assertions may not happen and tests will pass while silently hiding issues.
>
> Up until now, several runners have implemented an assertion that the number of expected successful assertions has actually happened, and that no failed assertions have happened (the runners which currently check this are the Dataflow runner and the Spark runner).
>
> This has been valuable in the past in finding bugs which were hidden by passing tests.
>
> The work to repeat this in https://issues.apache.org/jira/browse/BEAM-1726 has surfaced bugs in the Flink runner that were also hidden by passing tests. However, with the removal of aggregators in https://issues.apache.org/jira/browse/BEAM-1148 this ticket will be harder to implement, since the Flink runner does not support metrics.
>
> I believe that validating that runners do in fact support the Beam model is a blocker for the first stable release. (BEAM-1726 was also marked as a blocker for the Flink runner.)
>
> I think we have one of two choices here:
> 1. Keep implementing this for each runner separately.
> 2. Implement this in a runner-agnostic way (for runners which support metrics, use metrics; for those that do not, use a fallback implementation, perhaps using files or some other method). This should be covered by the following ticket: https://issues.apache.org/jira/browse/BEAM-1763
>
> Thoughts?
