(I'll also note that the bit about URNs can be decoupled -- we have Pipeline surgery APIs right now and will someday have URN-with-payload-based surgery APIs, but we can certainly do the work to make PAssert more overridable now and be ready for the full Runner API work later.)
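
To make "overridable via surgery" concrete, here is a rough sketch of the kind of hook a runner already has; the node name being matched is hypothetical (whatever verifier transform PAssert actually applies), and with the Runner API the match would key off a stable URN plus payload instead of a Java name:

import org.apache.beam.sdk.runners.AppliedPTransform;
import org.apache.beam.sdk.runners.PTransformMatcher;

/**
 * Sketch only. A runner would wrap a matcher like this in a PTransformOverride
 * (together with a replacement factory) and hand it to Pipeline.replaceAll(...)
 * before translation. The node name "PAssert.VerifyAssertions" is made up here;
 * a real patch would match whatever transform PAssert actually applies.
 */
class MatchAssertionVerifier implements PTransformMatcher {
  @Override
  public boolean matches(AppliedPTransform<?, ?, ?> application) {
    // Today: match by node name (or Java class). With URN-with-payload surgery,
    // this becomes a lookup of the transform's URN instead.
    return application.getFullName().endsWith("PAssert.VerifyAssertions");
  }
}
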
On Mon, Apr 17, 2017 at 11:14 AM, Dan Halperin <[email protected]> wrote:

> I believe Pablo's existing proposal is here:
> https://lists.apache.org/thread.html/CADJfNJBEuWYhhH1mzMwwvUL9Wv2HyFc8_E=9zYBKwUgT8ca1HA@mail.gmail.com
>
> The idea is that we'll stick with the current design -- aggregator- (but
> now metric)-driven validation of PAssert. Runners that do not support
> these things can override the validation step to do something different.
>
> This seems to me to satisfy all parties and unblock removal of
> aggregators. If a runner supports aggregators but not metrics because the
> semantics are slightly different, that runner can override the behavior.
>
> I agree that all runners doing sensible things with PAssert should be a
> first stable release blocker. But I do not think it's important that all
> runners verify them the same way. There has been no proposal that provides
> a single validation mechanism that works well with all runners.
>
> On Wed, Apr 12, 2017 at 9:24 AM, Aljoscha Krettek <[email protected]> wrote:
>
>> That sounds very good! Now we only have to manage to get this in before
>> the first stable release, because I think this is a very important
>> signal for ensuring Runner correctness.
>>
>> @Pablo Do you already have plans regarding 3., i.e. stable URNs for the
>> assertions? And also for verifying them in a runner-agnostic way in
>> TestStream, i.e. https://issues.apache.org/jira/browse/BEAM-1763?
>>
>> Best,
>> Aljoscha
>>
>> On 10. Apr 2017, at 10:10, Kenneth Knowles <[email protected]> wrote:
>>
>>> On Sat, Apr 8, 2017 at 7:00 AM, Aljoscha Krettek <[email protected]> wrote:
>>>
>>>> @kenn What's the design you're mentioning? (I probably missed it
>>>> because I'm not completely up to date on the Jiras and ML because of
>>>> Flink Forward preparations)
>>>
>>> There are three parts (I hope I say this in a way that makes everyone
>>> happy):
>>>
>>> 1. Each assertion transform is followed by a verifier transform that
>>> fails if it sees a non-success (in addition to bumping metrics).
>>> 2. Use the same trick PAssert already uses: flatten in a dummy value to
>>> reduce the risk that the verifier transform never runs.
>>> 3. Stable URNs for the assertion and verifier transforms, so a runner
>>> has a good chance to wire in custom implementations if that helps.
>>>
>>> I think someone mentioned it earlier, but these also work better with
>>> metrics that overcount, since it is now about covering the verifier
>>> transforms rather than an absolute number of successes.
>>>
>>> Kenn
>>>
>>>> On 7. Apr 2017, at 12:42, Kenneth Knowles <[email protected]> wrote:
>>>>
>>>>> We also have a design that improves the signal even without metrics,
>>>>> so I'm pretty happy with this.
>>>>>
>>>>> On Fri, Apr 7, 2017 at 12:12 PM, Lukasz Cwik <[email protected]> wrote:
>>>>>
>>>>>> I like the usage of metrics since it doesn't depend on external
>>>>>> resources. I believe there could be some small amount of code shared
>>>>>> between runners for the PAssert metric verification.
>>>>>>
>>>>>> I would say that PAssert by itself and PAssert with metrics are two
>>>>>> levels of testing available. For runners that don't support metrics,
>>>>>> PAssert gives a signal (albeit a weaker one), and ones that do
>>>>>> support metrics will have a stronger signal for execution
>>>>>> correctness.
>>>>>>
>>>>>> On Fri, Apr 7, 2017 at 11:59 AM, Aviem Zur <[email protected]> wrote:
>>>>>>
>>>>>>> Currently, PAssert assertions may not happen and tests will pass
>>>>>>> while silently hiding issues.
>>>>>>>
>>>>>>> Up until now, several runners have implemented an assertion that
>>>>>>> the number of expected successful assertions has actually happened,
>>>>>>> and that no failed assertions have happened. (Runners which check
>>>>>>> this are the Dataflow runner and the Spark runner.)
>>>>>>>
>>>>>>> This has been valuable in the past to find bugs which were hidden
>>>>>>> by passing tests.
>>>>>>>
>>>>>>> The work to repeat this in
>>>>>>> https://issues.apache.org/jira/browse/BEAM-1726 has surfaced bugs
>>>>>>> in the Flink runner that were also hidden by passing tests.
>>>>>>> However, with the removal of aggregators in
>>>>>>> https://issues.apache.org/jira/browse/BEAM-1148 this ticket will be
>>>>>>> harder to implement, since the Flink runner does not support
>>>>>>> metrics.
>>>>>>>
>>>>>>> I believe that validating that runners do in fact support the Beam
>>>>>>> model is a blocker for the first stable release. (BEAM-1726 was
>>>>>>> also marked as a blocker for the Flink runner.)
>>>>>>>
>>>>>>> I think we have one of two choices here:
>>>>>>> 1. Keep implementing this for each runner separately.
>>>>>>> 2. Implement this in a runner-agnostic way (for runners which
>>>>>>> support metrics, use metrics; for those that do not, use a fallback
>>>>>>> implementation, perhaps using files or some other method). This
>>>>>>> should be covered by the following ticket:
>>>>>>> https://issues.apache.org/jira/browse/BEAM-1763
>>>>>>>
>>>>>>> Thoughts?
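
To make the design in the thread above concrete, here is a rough, minimal sketch (Java SDK) of the verifier-transform-plus-metrics combination: a verifier that throws on any failed assertion result and bumps a counter, the flatten-in-a-dummy-value trick so the verifier runs at least once, and a post-run check against the metrics API. The class and counter names are made up for the sketch; PAssert keeps its own success/failure counters, and a real implementation would live behind PAssert's composites rather than in user code.

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

public class AssertionVerifierSketch {

  // Hypothetical counter names, stand-ins for PAssert's actual counters.
  private static final String NAMESPACE = "AssertionVerifierSketch";
  private static final String VERIFIER_RUNS = "verifierRuns";

  /** Fails on any non-success result and records that the verifier ran at all. */
  static class VerifyDoFn extends DoFn<String, Void> {
    private final Counter verifierRuns = Metrics.counter(NAMESPACE, VERIFIER_RUNS);

    @ProcessElement
    public void process(ProcessContext c) {
      verifierRuns.inc();
      if (!"success".equals(c.element())) {
        throw new AssertionError("Assertion failed: " + c.element());
      }
    }
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Stand-in for the per-assertion result stream PAssert produces internally.
    PCollection<String> assertionResults =
        p.apply("AssertionResults", Create.of("success"));

    // The "flatten in a dummy value" trick: even if the real assertion produces
    // no elements, the verifier still sees at least one and therefore runs.
    PCollection<String> dummy = p.apply("DummySuccess", Create.of("success"));
    PCollectionList.of(assertionResults).and(dummy)
        .apply("FlattenWithDummy", Flatten.pCollections())
        .apply("VerifyAssertions", ParDo.of(new VerifyDoFn()));

    PipelineResult result = p.run();
    result.waitUntilFinish();

    // Post-run check: the verifier transform was actually covered.
    MetricQueryResults metrics = result.metrics().queryMetrics(
        MetricsFilter.builder()
            .addNameFilter(MetricNameFilter.named(NAMESPACE, VERIFIER_RUNS))
            .build());
    long runs = 0;
    for (MetricResult<Long> counter : metrics.counters()) {
      runs += counter.attempted(); // or committed() where the runner supports it
    }
    if (runs < 1) {
      throw new AssertionError("Verifier transform never ran");
    }
  }
}

Note that the post-run check only asserts "covered at least once", which is why metrics that overcount (e.g. under retries) are fine here, as Kenn points out above.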
