Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

Robert Bradshaw Mon, 27 Mar 2017 14:39:00 -0700

On Mon, Mar 27, 2017 at 1:16 PM, Eugene Kirpichov <
kirpic...@google.com.invalid> wrote:


> Thanks Kenn, this breakdown makes a lot of sense. We should probably
> clarify this in the documentation of both of these annotations. Though now
> that I look at the current docs, they seem clear enough, but perhaps they
> can be phrased stronger.
>
> To summarize:
> - ValidatesRunner tests are for testing a runner. They should be created
> mostly by core Beam SDK authors, when introducing a new Beam feature etc.
> It may even make sense to make this annotation private to the beam sdk core
> module.
>

+1, but see below.


> - NeedsRunner tests are for testing a transform. That's what all external
> users should be using, authors of IOs, etc. It's unspecified which runner
> will be used, but in practice usually it'll be direct runner.
>

My one issue with NeedsRunner is that as one moves further from core, the
public API of Beam libraries tend to be suites of PTransforms (including
IOs) and almost every strong test of the public API (which is where most
tests should live) has at least one test that becomes NeedsRunner. It seems
annotations should be used to mark the exception, not the norm.

Perhaps this also stems from my point of view that the Direct Runner is in
many ways part of the SDK as the reference implementation and test runner
(the SDK minus any runner isn't very useful) and so there's little or no
need to distinguish tests that require a runner from those that don't
(especially the further you get from core, where you should be able to
assume you have a runner if you have the ability to construct pipelines and
run PAssert).

- The current situation (use of both of these annotations) does not quite
> reflect that, and should be fixed.
> - There might be a use case for testing a transform against all runners,
> and we don't have an agreed-upon solution about how to do that:
> ValidatesRunner technically accomplishes this, but it's logically wrong to
> use it in this capacity.
>

I think such tests do validate runners insofar as the @ValidatesRunner
suites are not comprehensive.

In the ideal world, one could simply assume the primitive transforms are
perfectly tested with 100% coverage of all relevant permutations of
features, and tests of composite transforms can assume correct(*)
implementation of primitives and only test their assembly on a single
runner. In practice, our tests are not comprehensive, and these
higher-level tests form a much larger suite of additional tests biased
towards how the primitive are actually composed and used. Somewhat like a
second line of defense. As such, its useful to be able to run this larger
suite on a variety of runners, even if it's theoretically redundant.

(There's also the sticky issue that if a runner overrides a composite
transform there's suddenly value in running that transform's tests against
that runner. It's unclear where that information should live (seems to
belong to the runner, but it should just reference (and possibly augment)
the existing suite).)

(*) Even the notion of correctness is difficult here, as one might be
relying of hidden assumptions that don't hold on all runners because they
are not part of the spec. This is the case for the wild-and-crazy test
runner that goes out of its way to violate assumptions.

On Mon, Mar 27, 2017 at 12:50 PM Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
> > Without claiming that this is the final set of categories or that they
> are
> > used correctly right now, here is what I think they mean:
> >
> >  - ValidatesRunner tests should be tests of the runner itself, generally
> > that it implements a primitive correctly
> >  - NeedsRunner tests should be tests of the PTransform/pipeline, assuming
> > the runner is correct
> >
> > Notably, to the extent the assumption of runner correctness holds, this
> > implies that it is OK to run NeedsRunner tests with just the direct
> runner.
> >
> > Pragmatically, in the Java SDK & IOs, this is not the current breakdown.
> >
> >  - In the Java SDK the NeedsRunner category is probably used more to flag
> > "run this with just the direct runner" than to express the semantic
> intent.
> > That isn't so bad; it is very close to the right usage.
> >
> >  - There are IOs that had RunnableOnService tests which are now
> > ValidatesRunner tests. While the ability to run an IO does validate a
> > runner, this is really an integration test of the IO. If they are to be
> run
> > with just the direct runner they don't need any annotation, because the
> IO
> > can take a test-scoped dependency on the direct runner. So it mostly
> makes
> > sense to tag those tests for which it is profitable to run against all
> > runners.
> >
> > I think the question of IO ITs with the intent to run across runners is
> > currently under design discussion and I would defer to other people on
> the
> > best way to do that. It could be a new category, or it could be a
> different
> > design pattern entirely.
> >
> > Kenn
> >
> > On Mon, Mar 27, 2017 at 11:55 AM, Eugene Kirpichov <
> > kirpic...@google.com.invalid> wrote:
> >
> > > Kenn - can you also remind for everybody, what is the difference
> between
> > > @NeedsRunner and @ValidatesRunner, and when should one use one or the
> > > other? I always find myself confused about this especially in code
> > reviews.
> > >
> > > On Mon, Mar 27, 2017 at 11:32 AM Kenneth Knowles
> <k...@google.com.invalid
> > >
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I just merged the rename from RunnableOnService to ValidatesRunner in
> > the
> > > > Java codebase (Python was already there)
> > > > https://github.com/apache/beam/pull/2157.
> > > >
> > > > I'm sure there will be stragglers throughout our docs, etc, so please
> > do
> > > > help me catch them and fix them. And start learning to say
> > > > "ValidatesRunner" in conversation :-)
> > > >
> > > > Kenn
> > > >
> > > > On Thu, Nov 10, 2016 at 1:01 PM, Lukasz Cwik
> <lc...@google.com.invalid
> > >
> > > > wrote:
> > > >
> > > > > The default is a crashing runner which throws an exception if its
> > > > executed.
> > > > > This makes SDK core/examples/... not depend on any implemented
> > runners.
> > > > >
> > > > > On Thu, Nov 10, 2016 at 12:37 PM, Robert Bradshaw <
> > > > > rober...@google.com.invalid> wrote:
> > > > >
> > > > > > +1 to ValidatesRunner. I'd be nice if it were (optionally?)
> > > > > > parameterized by which feature it validates.
> > > > > >
> > > > > > @NeedsRunner is odd, as using a runner is the most natural way to
> > > > > > write many (most) tests, but an annotation should be used to mark
> > the
> > > > > > exception, not the norm. (I'd just assume a runner is available
> for
> > > > > > all tests, e.g. CoreTests depends on DirectRunner depends on
> Core).
> > > > > >
> > > > > > On Thu, Nov 10, 2016 at 10:14 AM, Mark Liu
> > > <mark...@google.com.invalid
> > > > >
> > > > > > wrote:
> > > > > > > +1 ValidatesRunner
> > > > > > >
> > > > > > > On Thu, Nov 10, 2016 at 8:40 AM, Kenneth Knowles
> > > > > <k...@google.com.invalid
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Nice. I like ValidatesRunner.
> > > > > > >>
> > > > > > >> On Nov 10, 2016 03:39, "Amit Sela" <amitsel...@gmail.com>
> > wrote:
> > > > > > >>
> > > > > > >> > How about @ValidatesRunner ?
> > > > > > >> > Seems to complement @NeedsRunner as well.
> > > > > > >> >
> > > > > > >> > On Thu, Nov 10, 2016 at 9:47 AM Aljoscha Krettek <
> > > > > aljos...@apache.org
> > > > > > >
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > +1
> > > > > > >> > >
> > > > > > >> > > What I would really like to see is automatic derivation of
> > the
> > > > > > >> capability
> > > > > > >> > > matrix from an extended Runner Test Suite. (As outlined in
> > > > Thomas'
> > > > > > >> doc).
> > > > > > >> > >
> > > > > > >> > > On Wed, 9 Nov 2016 at 21:42 Kenneth Knowles
> > > > > <k...@google.com.invalid
> > > > > > >
> > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > > > Huge +1 to this.
> > > > > > >> > > >
> > > > > > >> > > > The two categories I care most about are:
> > > > > > >> > > >
> > > > > > >> > > > 1. Tests that need a runner, but are testing the other
> > > "thing
> > > > > > under
> > > > > > >> > > test";
> > > > > > >> > > > today this is NeedsRunner.
> > > > > > >> > > > 2. Tests that are intended to test a runner; today this
> is
> > > > > > >> > > > RunnableOnService.
> > > > > > >> > > >
> > > > > > >> > > > Actually the lines are not necessary clear between them,
> > > but I
> > > > > > think
> > > > > > >> we
> > > > > > >> > > can
> > > > > > >> > > > make good choices, like we already do.
> > > > > > >> > > >
> > > > > > >> > > > The idea of two categories with a common superclass
> > actually
> > > > > has a
> > > > > > >> > > pitfall:
> > > > > > >> > > > what if a test is put in the superclass category, when
> it
> > > does
> > > > > not
> > > > > > >> > have a
> > > > > > >> > > > clear meaning? And also, I don't have any good ideas for
> > > > names.
> > > > > > >> > > >
> > > > > > >> > > > So I think just replacing RunnableOnService with
> > RunnerTest
> > > to
> > > > > > make
> > > > > > >> > clear
> > > > > > >> > > > that it is there just to test the runner is good. We
> might
> > > > also
> > > > > > want
> > > > > > >> > > > RunnerIntegrationTest extends NeedsRunner to use in the
> IO
> > > > > > modules.
> > > > > > >> > > >
> > > > > > >> > > > See also Thomas's doc on capability matrix testing*
> which
> > is
> > > > > > aimed at
> > > > > > >> > > case
> > > > > > >> > > > 2. Those tests should all have a category from the doc,
> > or a
> > > > new
> > > > > > one
> > > > > > >> > > added.
> > > > > > >> > > >
> > > > > > >> > > > *
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > https://docs.google.com/document/d/1fICxq32t9yWn9qXhmT07xpclHeHX2
> > > > > > >> > VlUyVtpi2WzzGM/edit
> > > > > > >> > > >
> > > > > > >> > > > Kenn
> > > > > > >> > > >
> > > > > > >> > > > On Wed, Nov 9, 2016 at 12:20 PM, Jean-Baptiste Onofré <
> > > > > > >> j...@nanthrax.net
> > > > > > >> > >
> > > > > > >> > > > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi Mark,
> > > > > > >> > > > >
> > > > > > >> > > > > Generally speaking, I agree.
> > > > > > >> > > > >
> > > > > > >> > > > > As RunnableOnService extends NeedsRunner,
> > @TestsWithRunner
> > > > or
> > > > > > >> > > > @RunOnRunner
> > > > > > >> > > > > sound clearer.
> > > > > > >> > > > >
> > > > > > >> > > > > Regards
> > > > > > >> > > > > JB
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > On 11/09/2016 09:00 PM, Mark Liu wrote:
> > > > > > >> > > > >
> > > > > > >> > > > >> Hi all,
> > > > > > >> > > > >>
> > > > > > >> > > > >> I'm working on building RunnableOnService in Python
> > SDK.
> > > > > After
> > > > > > >> > having
> > > > > > >> > > > >> discussions with folks, "RunnableOnService" looks
> like
> > > not
> > > > a
> > > > > > very
> > > > > > >> > > > >> intuitive
> > > > > > >> > > > >> name for those unit tests that require runners and
> > build
> > > > > > >> lightweight
> > > > > > >> > > > >> pipelines to test specific components. Especially,
> they
> > > > don't
> > > > > > have
> > > > > > >> > to
> > > > > > >> > > > run
> > > > > > >> > > > >> on a service.
> > > > > > >> > > > >>
> > > > > > >> > > > >> So I want to raise this idea to the community and see
> > if
> > > > > anyone
> > > > > > >> have
> > > > > > >> > > > >> similar thoughts. Maybe we can come up with a name
> this
> > > is
> > > > > > tight
> > > > > > >> to
> > > > > > >> > > > >> runner.
> > > > > > >> > > > >> Currently, I have two names in my head:
> > > > > > >> > > > >>
> > > > > > >> > > > >> - TestsWithRunners
> > > > > > >> > > > >> - RunnerExecutable
> > > > > > >> > > > >>
> > > > > > >> > > > >> Any thoughts?
> > > > > > >> > > > >>
> > > > > > >> > > > >> Thanks,
> > > > > > >> > > > >> Mark
> > > > > > >> > > > >>
> > > > > > >> > > > >>
> > > > > > >> > > > > --
> > > > > > >> > > > > Jean-Baptiste Onofré
> > > > > > >> > > > > jbono...@apache.org
> > > > > > >> > > > > http://blog.nanthrax.net
> > > > > > >> > > > > Talend - http://www.talend.com
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] Change "RunnableOnService" To A More Intuitive Name

Reply via email to