Re: Style: how much testing for transform builder classes?

Eugene Kirpichov Tue, 29 Aug 2017 14:17:27 -0700

The new guidance has been submitted
https://beam.apache.org/contribute/ptransform-style-guide/#testing-transform-construction-and-validation


On Fri, Aug 18, 2017 at 10:48 PM Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi Eugene
>
> It sounds good to me.
>
> Regards
> JB
>
> On Aug 18, 2017, 21:42, at 21:42, Eugene Kirpichov
> <[email protected]> wrote:
> >Hi all,
> >
> >This seems to have slipped through the cracks. I'd like to raise this
> >again
> >since I'm doing a cleanup in https://github.com/apache/beam/pull/3730 ,
> >and
> >I'd like to get consensus and add the guidance to PTransform Style
> >Guide or
> >the Testing Guide.
> >
> >Let me rephrase my suggestions from the previous email more clearly and
> >concisely:
> >- Do not test successful validation
> >- Do not test trivial validation errors (e.g. "fails when a property is
> >unset/null/empty/negative/...")
> >- Add 1 test per non-trivial class of validation error (e.g.
> >"properties
> >foo and bar are exclusive", validation of query syntax, etc)
> >- Define "nontrivial" as: where missing/incorrect/uninformative
> >validation
> >may lead to serious problems: data loss, counter-intuitive behavior,
> >value
> >of a property being silently ignored, or other hard-to-debug errors.
> >- Test that each withFoo() method generally has effect (is not
> >ignored),
> >using TestPipeline and PAssert
> >
> >Does this sound reasonable?
> >
> >On Mon, Mar 27, 2017 at 2:28 PM Eugene Kirpichov <[email protected]>
> >wrote:
> >
> >> From Ismael's and Dan's comments, I think I agree that there's a
> >class of
> >> easy-to-make errors that justifies having some builder tests.
> >>
> >> 1. Errors of the form "withFoo() instead sets bar, which is of the
> >same
> >> type as foo so it still compiles" (what Ismael quoted). This, I
> >think,
> >> should be caught by NeedsRunner tests, because it means that
> >withFoo() is
> >> effectively being completely ignored, and NeedsRunner tests need to
> >test
> >> that each property is taking effect, semantically.
> >> 2. Errors of the form "withFoo() overrides bar with a default value"
> >(e.g.
> >> in the commit Dan quoted). This means that foo and bar don't commute,
> >i.e.
> >> .withFoo().withBar() is not the same as .withBar().withFoo().
> >>
> >> Errors of the form #2 warrant builder tests. To catch errors of this
> >form,
> >> it's sufficient to have a single test - that a transform fully
> >configured
> >> (with non-default values) in one order is equivalent to a transform
> >fully
> >> configured in precisely the reverse order - this way any two
> >properties
> >> foo,bar will be tested for whether they commute. Indirect effects
> >(where
> >> the relative order of more than two properties matters) seem very
> >unlikely.
> >>
> >> OK, here's my revised proposal for what testing of transforms with
> >many
> >> properties should look like, for inclusion (if people agree)
> >> almost-verbatim into the style/testing guide:
> >> - For each property with non-trivial validation or a non-trivial
> >error
> >> message produced from validation, a unit test verifying that invalid
> >values
> >> are rejected with the proper error message.
> >> - Same for rejecting invalid configuration of the transform as a
> >whole
> >> (e.g. unset required properties, or incompatible but individually
> >valid
> >> values specified for property combinations), one test per
> >substantially
> >> different class of error.
> >> - One "blackbox" (@NeedsRunner) test per property (or per combination
> >of
> >> property values causing the transform to behave substantially
> >differently),
> >> verifying that every property has the desired effect when the
> >transform is
> >> executed.
> >> - A single "whitebox" test that constructs a transform with
> >non-default
> >> values of all properties in one order and in the exact reverse order,
> >and
> >> verifies that the resulting transforms are configured equivalently.
> >>
> >> WDYT?
> >>
> >> On Tue, Mar 21, 2017 at 5:22 PM Robert Bradshaw
> >> <[email protected]> wrote:
> >>
> >>> On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin
> ><[email protected]
> >>> >
> >>> wrote:
> >>>
> >>> >
> >https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f
> >>> > 99e6a947ef
> >>> > is one example where tests of feature Bar exist but did not
> >discover
> >>> bugs
> >>> > that could be introduced by builders.
> >>> >
> >>>
> >>> True, though one would need to test the full cross-product of
> >builder
> >>> orderings to discover this, a simple test of setValidate() wouldn't
> >>> suffice
> >>> either. (And if not the full cross product, what subset?) Maybe some
> >>> automated fuzz testing would be a fair thing to do here (cheaper and
> >more
> >>> comprehensive than manual tests...)?
> >>>
> >>> AutoValue like alleviates many, but not all, of these concerns - as
> >Ismael
> >>> > points out.
> >>> >
> >>>
> >>> If two features are not orthogonal, that perhaps merits more test
> >(and
> >>> documentation).
> >>>
> >>>
> >>> >
> >>> >
> >>> >
> >>> > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw <
> >>> > [email protected]> wrote:
> >>> >
> >>> > > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía
> ><[email protected]>
> >>> wrote:
> >>> > >
> >>> > > > +1 to Vikas point maybe the right place to enforce things
> >correct
> >>> > > > build tests is in the validate and like this reduce the test
> >>> > > > boilerplate and only test the validate, but I wonder if this
> >totally
> >>> > > > covers both cases (the buildsCorrectly and
> >>> > > > buildsCorrectlyInDifferentOrder ones).
> >>> > > >
> >>> > > > I answer Eugene’s question here even if you are aware now
> >since you
> >>> > > > commented in the PR, so everyone understands the case.
> >>> > > >
> >>> > > > The case is pretty simple, when you extend an IO and add a new
> >>> > > > configuration parameter, suppose we have withFoo(String foo)
> >and we
> >>> > > > want to add withBar(String bar). In some cases the
> >implementation or
> >>> > > > even worse the combination of those are not built correctly,
> >so the
> >>> > > > only way to guarantee that this works is to have code that
> >tests the
> >>> > > > complete parameter combination or tests that at least assert
> >that
> >>> the
> >>> > > > object is built correctly.
> >>> > > >
> >>> > > > This is something that can happen both with or without
> >AutoValue
> >>> > > > because the with method is hand-written and the natural
> >tendency
> >>> with
> >>> > > > boilerplate methods like this is to copy/paste, so we can end
> >up
> >>> doing
> >>> > > > silly things like:
> >>> > > >
> >>> > > >     private Read(String foo, String bar) { … }
> >>> > > >
> >>> > > >     public Read withBar(String bar) {
> >>> > > >       return new Read(foo, null);
> >>> > > >     }
> >>> > > >
> >>> > > > in this case the reference to bar is not stored or assigned
> >(this is
> >>> > > > similar to the case of the DatastoreIO PR), and AutoValue may
> >seem
> >>> to
> >>> > > > solve this issue but you can also end up with this situation
> >if you
> >>> > > > copy paste the withFoo method and just change the method name:
> >>> > > >
> >>> > > >     public Read withBar(String foo) {
> >>> > > >       return builder().setFoo(foo).build();
> >>> > > >     }
> >>> > > >
> >>> > > > Of course both seem silly but both can happen and the tests at
> >least
> >>> > > > help to discover those,
> >>> > > >
> >>> > >
> >>> > > Such mistakes should be entirely discovered by tests of feature
> >Bar.
> >>> If
> >>> > Bar
> >>> > > is not actually being tested, that's a bigger problem with
> >coverage
> >>> that
> >>> > a
> >>> > > construction-only test actually obscures (giving it negative
> >value).
> >>> > >
> >>> > >
> >>> > > >
> >>> > > > On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <[email protected]>
> >>> wrote:
> >>> > > > > Yes, what I meant is: Necessary tests are ones that blocks
> >users
> >>> if
> >>> > not
> >>> > > > > present. Trivial or non-trivial shouldn't be the issue in
> >such
> >>> cases.
> >>> > > > >
> >>> > > > > Some of the boilerplate code and tests is because IO
> >PTransforms
> >>> are
> >>> > > > > returned to the user before they are fully constructed and
> >actual
> >>> > > > > validation happens in the validate method rather than at
> >>> > construction.
> >>> > > I
> >>> > > > > understand that the reasoning here is that we want to
> >support
> >>> > > > constructing
> >>> > > > > them with options in any order and using Builder pattern can
> >be
> >>> > > > confusing.
> >>> > > > >
> >>> > > > > If validate method is where all the validation happens, then
> >we
> >>> > should
> >>> > > > able
> >>> > > > > to eliminate some redundant checks and tests during
> >construction
> >>> time
> >>> > > > like
> >>> > > > > in *withOption* methods here
> >>> > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
> >>> > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
> >>> > > > io/gcp/bigtable/BigtableIO.java#L199>
> >>> > > > >  and here
> >>> > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
> >>> > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
> >>> > > > io/gcp/datastore/DatastoreV1.java#L387>
> >>> > > > > as
> >>> > > > > these are also checked in the validate method.
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > -Vikas
> >>> > > > >
> >>> > > > >
> >>> > > > >
> >>> > > > > On 14 March 2017 at 15:40, Eugene Kirpichov
> >>> > > <[email protected]
> >>> > > > >
> >>> > > > > wrote:
> >>> > > > >
> >>> > > > >> Thanks all. Looks like people are on board with the general
> >>> > direction
> >>> > > > >> though it remains to refine it to concrete guidelines to go
> >into
> >>> > style
> >>> > > > >> guide.
> >>> > > > >>
> >>> > > > >> Ismaël, can you give more details about the situation you
> >>> described
> >>> > in
> >>> > > > the
> >>> > > > >> first paragraph? Is it perhaps that really a
> >RunnableOnService
> >>> test
> >>> > > was
> >>> > > > >> missing (and perhaps still is), rather than a builder test?
> >>> > > > >>
> >>> > > > >> Vikas, regarding trivial tests and user waiting for a
> >>> work-around:
> >>> > in
> >>> > > > the
> >>> > > > >> situation I described, they don't really need a workaround
> >- they
> >>> > > > specified
> >>> > > > >> an invalid value and have been minorly inconvenienced
> >because the
> >>> > > error
> >>> > > > >> they got about it was not very readable, so fixing their
> >value
> >>> took
> >>> > > > them a
> >>> > > > >> little longer than it could have, but they fixed it and
> >their
> >>> work
> >>> > is
> >>> > > > not
> >>> > > > >> blocked. I think Robert's arguments about the cost of
> >trivial
> >>> tests
> >>> > > > apply.
> >>> > > > >>
> >>> > > > >> I agree that the author should be at liberty to choose
> >which
> >>> > > validation
> >>> > > > to
> >>> > > > >> unit-test and which to skip as trivial, so documentation on
> >this
> >>> > topic
> >>> > > > >> should be in the form of guidelines, high-quality example
> >code
> >>> (i.e.
> >>> > > > clean
> >>> > > > >> up the unit tests of IOs bundled with Beam SDK), and
> >informal
> >>> > > knowledge
> >>> > > > in
> >>> > > > >> the heads of readers of this thread, rather than hard
> >rules.
> >>> > > > >>
> >>> > > > >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía
> ><[email protected]>
> >>> > > wrote:
> >>> > > > >>
> >>> > > > >> > +0.5
> >>> > > > >> >
> >>> > > > >> > I used to think that some of those tests were not worth,
> >for
> >>> > example
> >>> > > > >> > testBuildRead and
> >>> > > > >> > testBuildReadAlt. However the reality is that these tests
> >>> allowed
> >>> > me
> >>> > > > to
> >>> > > > >> > find bugs both during the development of HBaseIO and just
> >>> > yesterday
> >>> > > > when
> >>> > > > >> I
> >>> > > > >> > tried to test the write support for the emulator with
> >>> DataStoreIO
> >>> > > > (that
> >>> > > > >> > lacked a parameter in testBuildWrite and didn’t have a
> >>> > > > testBuildWriteAlt
> >>> > > > >> > and broke in that case too), so I now believe they are
> >not
> >>> > > necessarily
> >>> > > > >> > useless.
> >>> > > > >> >
> >>> > > > >> > I agree with the idea of trying to test the most
> >important
> >>> things
> >>> > > > first
> >>> > > > >> and
> >>> > > > >> > as Kenneth said trying to reduce the tests of successful
> >>> > validation,
> >>> > > > >> > however I am not sure how this can be formalized in the
> >style
> >>> > guide
> >>> > > > >> > considering that the value of the tests seems to be based
> >on
> >>> the
> >>> > > > judgment
> >>> > > > >> > of both the developer and the reviewer.
> >>> > > > >> >
> >>> > > > >> > One final aspect if we achieve consensus is that we
> >should not
> >>> > > forget
> >>> > > > >> that
> >>> > > > >> > if we reduce some of this tests we have to pay attention
> >to not
> >>> > > reduce
> >>> > > > >> the
> >>> > > > >> > current level of coverage.
> >>> > > > >> >
> >>> > > > >> > Regards,
> >>> > > > >> > Ismaël
> >>> > > > >> >
> >>> > > > >> >
> >>> > > > >> >
> >>> > > > >> >
> >>> > > > >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk
> ><[email protected]>
> >>> > > wrote:
> >>> > > > >> >
> >>> > > > >> > > +0.5
> >>> > > > >> > >
> >>> > > > >> > > My two cents,
> >>> > > > >> > >
> >>> > > > >> > >   * However trivial the test is it should be added
> >unless
> >>> user
> >>> > > has a
> >>> > > > >> easy
> >>> > > > >> > > workaround to not having to wait for a few days until
> >the
> >>> > trivial
> >>> > > > fixes
> >>> > > > >> > are
> >>> > > > >> > > merged to beam and then propagated to the runner.
> >>> > > > >> > >
> >>> > > > >> > >   * While I agree with trivial tests like "ensuring
> >>> meaningful
> >>> > > error
> >>> > > > >> > > message thrown for null options" or "ordering of
> >parameters"
> >>> not
> >>> > > > being
> >>> > > > >> > > strictly necessary and definitely not something that
> >should
> >>> be
> >>> > > used
> >>> > > > as
> >>> > > > >> > > reference for other IOs, (staying consistent with other
> >IOs
> >>> is
> >>> > > > probably
> >>> > > > >> > the
> >>> > > > >> > > reason why DatastoreIO has some of them), I still think
> >some
> >>> > > degree
> >>> > > > of
> >>> > > > >> > > leeway for the author is desirable depending on what
> >level of
> >>> > user
> >>> > > > >> > > experience they want to provide, as long as it is not a
> >>> burden
> >>> > for
> >>> > > > >> adding
> >>> > > > >> > > new tests.
> >>> > > > >> > >
> >>> > > > >> > >
> >>> > > > >> > >
> >>> > > > >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <
> >>> > [email protected]>
> >>> > > > >> wrote:
> >>> > > > >> > >
> >>> > > > >> > > > +1
> >>> > > > >> > > >
> >>> > > > >> > > > Testing is always hard, especially to have concrete
> >tests.
> >>> > > > Reducing
> >>> > > > >> the
> >>> > > > >> > > > "noise" is a good idea.
> >>> > > > >> > > >
> >>> > > > >> > > > Regards
> >>> > > > >> > > > JB
> >>> > > > >> > > >
> >>> > > > >> > > >
> >>> > > > >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
> >>> > > > >> > > >
> >>> > > > >> > > >> Hello,
> >>> > > > >> > > >>
> >>> > > > >> > > >> I've seen a pattern in a couple of different
> >transforms
> >>> (IOs)
> >>> > > > where
> >>> > > > >> > we,
> >>> > > > >> > > I
> >>> > > > >> > > >> think, spend an excessive amount of code
> >unit-testing the
> >>> > > trivial
> >>> > > > >> > > builder
> >>> > > > >> > > >> methods.
> >>> > > > >> > > >>
> >>> > > > >> > > >> E.g. a significant part of
> >>> > > > >> > > >>
> >>> https://github.com/apache/beam/blob/master/sdks/java/io/goog
> >>> > > > >> > > >>
> >le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
> >>> > > > >> > > >> gcp/datastore/DatastoreV1Test.java
> >>> > > > >> > > >> is
> >>> > > > >> > > >> devoted to that, and a significant part of the
> >in-progress
> >>> > > Hadoop
> >>> > > > >> > > >> InputFormat IO.
> >>> > > > >> > > >>
> >>> > > > >> > > >> In particular, I mean unit-testing trivial builder
> >>> > > functionality
> >>> > > > >> such
> >>> > > > >> > > as:
> >>> > > > >> > > >> - That setting a parameter actually sets it
> >>> > > > >> > > >> - That setting parameters in one order gives the
> >same
> >>> effect
> >>> > > as a
> >>> > > > >> > > >> different
> >>> > > > >> > > >> order
> >>> > > > >> > > >> - That a null value of a parameter is rejected with
> >an
> >>> > > > appropriate
> >>> > > > >> > error
> >>> > > > >> > > >> message
> >>> > > > >> > > >> - That a non-null value of a parameter is accepted
> >>> > > > >> > > >>
> >>> > > > >> > > >> I'd like to come to a consensus as to how much
> >>> unit-testing
> >>> > of
> >>> > > > such
> >>> > > > >> > > stuff
> >>> > > > >> > > >> is appropriate when developing a new transform. And
> >then
> >>> put
> >>> > > this
> >>> > > > >> > > >> consensus
> >>> > > > >> > > >> into the PTransform Style Guide
> >>> > > > >> > > >> <
> >>> https://beam.apache.org/contribute/ptransform-style-guide/
> >>> > >.
> >>> > > > >> > > >>
> >>> > > > >> > > >> I think such tests are not worth their weight in
> >lines of
> >>> > code,
> >>> > > > for
> >>> > > > >> a
> >>> > > > >> > > few
> >>> > > > >> > > >> reasons:
> >>> > > > >> > > >> - Whether or not a parameter actually takes effect,
> >>> should be
> >>> > > > tested
> >>> > > > >> > > using
> >>> > > > >> > > >> a semantic test (RunnableOnService). Testing whether
> >a
> >>> getter
> >>> > > > >> returns
> >>> > > > >> > > the
> >>> > > > >> > > >> same thing a setter set is neither necessary
> >(already
> >>> covered
> >>> > > by
> >>> > > > a
> >>> > > > >> > > >> semantic
> >>> > > > >> > > >> test) nor sufficient (it's possible that the
> >expansion of
> >>> the
> >>> > > > >> > transform
> >>> > > > >> > > >> forgets to even call the getter).
> >>> > > > >> > > >> - Testing "a non-null value (or an otherwise valid
> >value)
> >>> is
> >>> > > > >> accepted"
> >>> > > > >> > > is
> >>> > > > >> > > >> redundant with every semantic test that supplies the
> >>> value.
> >>> > > > >> > > >> - For testing of supplying a different order of
> >>> parameters:
> >>> > my
> >>> > > > main
> >>> > > > >> > > >> objections are 1) I'm not sure what kind of
> >>> reasonably-likely
> >>> > > > coding
> >>> > > > >> > > error
> >>> > > > >> > > >> this guards against, especially given most
> >multi-parameter
> >>> > > > >> transforms
> >>> > > > >> > > use
> >>> > > > >> > > >> AutoValue anyway, and given that "does a setter take
> >>> effect"
> >>> > is
> >>> > > > >> > already
> >>> > > > >> > > >> tested via bullets above 2) given such a coding
> >error
> >>> exists,
> >>> > > > >> there's
> >>> > > > >> > no
> >>> > > > >> > > >> way to tell which orders  or how many you should
> >test;
> >>> > > > unit-testing
> >>> > > > >> > two
> >>> > > > >> > > or
> >>> > > > >> > > >> three orders probably gives next to no protection,
> >and
> >>> > > > unit-testing
> >>> > > > >> > more
> >>> > > > >> > > >> than that is impractical.
> >>> > > > >> > > >> - Testing "a null value is rejected" effectively
> >>> unit-tests a
> >>> > > > single
> >>> > > > >> > > line
> >>> > > > >> > > >> that says checkNotNull(...). I don't think it's
> >worth 10
> >>> or
> >>> > so
> >>> > > > lines
> >>> > > > >> > of
> >>> > > > >> > > >> code to test something this trivial: even if an
> >author
> >>> forgot
> >>> > > to
> >>> > > > add
> >>> > > > >> > > this
> >>> > > > >> > > >> line, and then a used supplied null - the user will
> >just
> >>> get
> >>> > a
> >>> > > > less
> >>> > > > >> > > >> informative error message, possibly report it as a
> >bug,
> >>> and
> >>> > the
> >>> > > > >> error
> >>> > > > >> > > will
> >>> > > > >> > > >> easily get fixed. I.e. it's a very low-risk thing to
> >get
> >>> > wrong.
> >>> > > > >> > > >>
> >>> > > > >> > > >> I think the following similar types of tests are
> >still
> >>> worth
> >>> > > > doing
> >>> > > > >> > > though:
> >>> > > > >> > > >> - Non-trivial parameter validation logic: where the
> >>> > validation
> >>> > > > rule
> >>> > > > >> is
> >>> > > > >> > > >> complex, or where the error message is interesting,
> >etc.
> >>> > > > >> > > >> - Cross-parameter invariants which, if violated,
> >lead to
> >>> data
> >>> > > > loss:
> >>> > > > >> > e.g.
> >>> > > > >> > > >> that if parameters foo and bar are mutually
> >exclusive,
> >>> verify
> >>> > > > that
> >>> > > > >> if
> >>> > > > >> > > the
> >>> > > > >> > > >> user specifies both, they get an error, instead of
> >one of
> >>> > them
> >>> > > > being
> >>> > > > >> > > >> silently ignored.
> >>> > > > >> > > >>
> >>> > > > >> > > >> Thoughts?
> >>> > > > >> > > >>
> >>> > > > >> > > >>
> >>> > > > >> > > > --
> >>> > > > >> > > > Jean-Baptiste Onofré
> >>> > > > >> > > > [email protected]
> >>> > > > >> > > > http://blog.nanthrax.net
> >>> > > > >> > > > Talend - http://www.talend.com
> >>> > > > >> > > >
> >>> > > > >> > >
> >>> > > > >> >
> >>> > > > >>
> >>> > > >
> >>> > >
> >>> >
> >>>
> >>
>

Re: Style: how much testing for transform builder classes?

Reply via email to