Re: Style: how much testing for transform builder classes?

Eugene Kirpichov Fri, 18 Aug 2017 13:43:06 -0700

Hi all,

This seems to have slipped through the cracks. I'd like to raise this again
since I'm doing a cleanup in https://github.com/apache/beam/pull/3730 , and
I'd like to get consensus and add the guidance to PTransform Style Guide or
the Testing Guide.


Let me rephrase my suggestions from the previous email more clearly and
concisely:
- Do not test successful validation
- Do not test trivial validation errors (e.g. "fails when a property is
unset/null/empty/negative/...")
- Add 1 test per non-trivial class of validation error (e.g. "properties
foo and bar are exclusive", validation of query syntax, etc)
- Define "nontrivial" as: where missing/incorrect/uninformative validation
may lead to serious problems: data loss, counter-intuitive behavior, value
of a property being silently ignored, or other hard-to-debug errors.
- Test that each withFoo() method generally has effect (is not ignored),
using TestPipeline and PAssert

Does this sound reasonable?

On Mon, Mar 27, 2017 at 2:28 PM Eugene Kirpichov <kirpic...@google.com>
wrote:

> From Ismael's and Dan's comments, I think I agree that there's a class of
> easy-to-make errors that justifies having some builder tests.
>
> 1. Errors of the form "withFoo() instead sets bar, which is of the same
> type as foo so it still compiles" (what Ismael quoted). This, I think,
> should be caught by NeedsRunner tests, because it means that withFoo() is
> effectively being completely ignored, and NeedsRunner tests need to test
> that each property is taking effect, semantically.
> 2. Errors of the form "withFoo() overrides bar with a default value" (e.g.
> in the commit Dan quoted). This means that foo and bar don't commute, i.e.
> .withFoo().withBar() is not the same as .withBar().withFoo().
>
> Errors of the form #2 warrant builder tests. To catch errors of this form,
> it's sufficient to have a single test - that a transform fully configured
> (with non-default values) in one order is equivalent to a transform fully
> configured in precisely the reverse order - this way any two properties
> foo,bar will be tested for whether they commute. Indirect effects (where
> the relative order of more than two properties matters) seem very unlikely.
>
> OK, here's my revised proposal for what testing of transforms with many
> properties should look like, for inclusion (if people agree)
> almost-verbatim into the style/testing guide:
> - For each property with non-trivial validation or a non-trivial error
> message produced from validation, a unit test verifying that invalid values
> are rejected with the proper error message.
> - Same for rejecting invalid configuration of the transform as a whole
> (e.g. unset required properties, or incompatible but individually valid
> values specified for property combinations), one test per substantially
> different class of error.
> - One "blackbox" (@NeedsRunner) test per property (or per combination of
> property values causing the transform to behave substantially differently),
> verifying that every property has the desired effect when the transform is
> executed.
> - A single "whitebox" test that constructs a transform with non-default
> values of all properties in one order and in the exact reverse order, and
> verifies that the resulting transforms are configured equivalently.
>
> WDYT?
>
> On Tue, Mar 21, 2017 at 5:22 PM Robert Bradshaw
> <rober...@google.com.invalid> wrote:
>
>> On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin <dhalp...@google.com.invalid
>> >
>> wrote:
>>
>> > https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f
>> > 99e6a947ef
>> > is one example where tests of feature Bar exist but did not discover
>> bugs
>> > that could be introduced by builders.
>> >
>>
>> True, though one would need to test the full cross-product of builder
>> orderings to discover this, a simple test of setValidate() wouldn't
>> suffice
>> either. (And if not the full cross product, what subset?) Maybe some
>> automated fuzz testing would be a fair thing to do here (cheaper and more
>> comprehensive than manual tests...)?
>>
>> AutoValue like alleviates many, but not all, of these concerns - as Ismael
>> > points out.
>> >
>>
>> If two features are not orthogonal, that perhaps merits more test (and
>> documentation).
>>
>>
>> >
>> >
>> >
>> > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw <
>> > rober...@google.com.invalid> wrote:
>> >
>> > > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía <ieme...@gmail.com>
>> wrote:
>> > >
>> > > > +1 to Vikas point maybe the right place to enforce things correct
>> > > > build tests is in the validate and like this reduce the test
>> > > > boilerplate and only test the validate, but I wonder if this totally
>> > > > covers both cases (the buildsCorrectly and
>> > > > buildsCorrectlyInDifferentOrder ones).
>> > > >
>> > > > I answer Eugene’s question here even if you are aware now since you
>> > > > commented in the PR, so everyone understands the case.
>> > > >
>> > > > The case is pretty simple, when you extend an IO and add a new
>> > > > configuration parameter, suppose we have withFoo(String foo) and we
>> > > > want to add withBar(String bar). In some cases the implementation or
>> > > > even worse the combination of those are not built correctly, so the
>> > > > only way to guarantee that this works is to have code that tests the
>> > > > complete parameter combination or tests that at least assert that
>> the
>> > > > object is built correctly.
>> > > >
>> > > > This is something that can happen both with or without AutoValue
>> > > > because the with method is hand-written and the natural tendency
>> with
>> > > > boilerplate methods like this is to copy/paste, so we can end up
>> doing
>> > > > silly things like:
>> > > >
>> > > >     private Read(String foo, String bar) { … }
>> > > >
>> > > >     public Read withBar(String bar) {
>> > > >       return new Read(foo, null);
>> > > >     }
>> > > >
>> > > > in this case the reference to bar is not stored or assigned (this is
>> > > > similar to the case of the DatastoreIO PR), and AutoValue may seem
>> to
>> > > > solve this issue but you can also end up with this situation if you
>> > > > copy paste the withFoo method and just change the method name:
>> > > >
>> > > >     public Read withBar(String foo) {
>> > > >       return builder().setFoo(foo).build();
>> > > >     }
>> > > >
>> > > > Of course both seem silly but both can happen and the tests at least
>> > > > help to discover those,
>> > > >
>> > >
>> > > Such mistakes should be entirely discovered by tests of feature Bar.
>> If
>> > Bar
>> > > is not actually being tested, that's a bigger problem with coverage
>> that
>> > a
>> > > construction-only test actually obscures (giving it negative value).
>> > >
>> > >
>> > > >
>> > > > On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <vikky...@gmail.com>
>> wrote:
>> > > > > Yes, what I meant is: Necessary tests are ones that blocks users
>> if
>> > not
>> > > > > present. Trivial or non-trivial shouldn't be the issue in such
>> cases.
>> > > > >
>> > > > > Some of the boilerplate code and tests is because IO PTransforms
>> are
>> > > > > returned to the user before they are fully constructed and actual
>> > > > > validation happens in the validate method rather than at
>> > construction.
>> > > I
>> > > > > understand that the reasoning here is that we want to support
>> > > > constructing
>> > > > > them with options in any order and using Builder pattern can be
>> > > > confusing.
>> > > > >
>> > > > > If validate method is where all the validation happens, then we
>> > should
>> > > > able
>> > > > > to eliminate some redundant checks and tests during construction
>> time
>> > > > like
>> > > > > in *withOption* methods here
>> > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
>> > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
>> > > > io/gcp/bigtable/BigtableIO.java#L199>
>> > > > >  and here
>> > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
>> > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
>> > > > io/gcp/datastore/DatastoreV1.java#L387>
>> > > > > as
>> > > > > these are also checked in the validate method.
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > -Vikas
>> > > > >
>> > > > >
>> > > > >
>> > > > > On 14 March 2017 at 15:40, Eugene Kirpichov
>> > > <kirpic...@google.com.invalid
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > >> Thanks all. Looks like people are on board with the general
>> > direction
>> > > > >> though it remains to refine it to concrete guidelines to go into
>> > style
>> > > > >> guide.
>> > > > >>
>> > > > >> Ismaël, can you give more details about the situation you
>> described
>> > in
>> > > > the
>> > > > >> first paragraph? Is it perhaps that really a RunnableOnService
>> test
>> > > was
>> > > > >> missing (and perhaps still is), rather than a builder test?
>> > > > >>
>> > > > >> Vikas, regarding trivial tests and user waiting for a
>> work-around:
>> > in
>> > > > the
>> > > > >> situation I described, they don't really need a workaround - they
>> > > > specified
>> > > > >> an invalid value and have been minorly inconvenienced because the
>> > > error
>> > > > >> they got about it was not very readable, so fixing their value
>> took
>> > > > them a
>> > > > >> little longer than it could have, but they fixed it and their
>> work
>> > is
>> > > > not
>> > > > >> blocked. I think Robert's arguments about the cost of trivial
>> tests
>> > > > apply.
>> > > > >>
>> > > > >> I agree that the author should be at liberty to choose which
>> > > validation
>> > > > to
>> > > > >> unit-test and which to skip as trivial, so documentation on this
>> > topic
>> > > > >> should be in the form of guidelines, high-quality example code
>> (i.e.
>> > > > clean
>> > > > >> up the unit tests of IOs bundled with Beam SDK), and informal
>> > > knowledge
>> > > > in
>> > > > >> the heads of readers of this thread, rather than hard rules.
>> > > > >>
>> > > > >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <ieme...@gmail.com>
>> > > wrote:
>> > > > >>
>> > > > >> > +0.5
>> > > > >> >
>> > > > >> > I used to think that some of those tests were not worth, for
>> > example
>> > > > >> > testBuildRead and
>> > > > >> > testBuildReadAlt. However the reality is that these tests
>> allowed
>> > me
>> > > > to
>> > > > >> > find bugs both during the development of HBaseIO and just
>> > yesterday
>> > > > when
>> > > > >> I
>> > > > >> > tried to test the write support for the emulator with
>> DataStoreIO
>> > > > (that
>> > > > >> > lacked a parameter in testBuildWrite and didn’t have a
>> > > > testBuildWriteAlt
>> > > > >> > and broke in that case too), so I now believe they are not
>> > > necessarily
>> > > > >> > useless.
>> > > > >> >
>> > > > >> > I agree with the idea of trying to test the most important
>> things
>> > > > first
>> > > > >> and
>> > > > >> > as Kenneth said trying to reduce the tests of successful
>> > validation,
>> > > > >> > however I am not sure how this can be formalized in the style
>> > guide
>> > > > >> > considering that the value of the tests seems to be based on
>> the
>> > > > judgment
>> > > > >> > of both the developer and the reviewer.
>> > > > >> >
>> > > > >> > One final aspect if we achieve consensus is that we should not
>> > > forget
>> > > > >> that
>> > > > >> > if we reduce some of this tests we have to pay attention to not
>> > > reduce
>> > > > >> the
>> > > > >> > current level of coverage.
>> > > > >> >
>> > > > >> > Regards,
>> > > > >> > Ismaël
>> > > > >> >
>> > > > >> >
>> > > > >> >
>> > > > >> >
>> > > > >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <vikky...@gmail.com>
>> > > wrote:
>> > > > >> >
>> > > > >> > > +0.5
>> > > > >> > >
>> > > > >> > > My two cents,
>> > > > >> > >
>> > > > >> > >   * However trivial the test is it should be added unless
>> user
>> > > has a
>> > > > >> easy
>> > > > >> > > workaround to not having to wait for a few days until the
>> > trivial
>> > > > fixes
>> > > > >> > are
>> > > > >> > > merged to beam and then propagated to the runner.
>> > > > >> > >
>> > > > >> > >   * While I agree with trivial tests like "ensuring
>> meaningful
>> > > error
>> > > > >> > > message thrown for null options" or "ordering of parameters"
>> not
>> > > > being
>> > > > >> > > strictly necessary and definitely not something that should
>> be
>> > > used
>> > > > as
>> > > > >> > > reference for other IOs, (staying consistent with other IOs
>> is
>> > > > probably
>> > > > >> > the
>> > > > >> > > reason why DatastoreIO has some of them), I still think some
>> > > degree
>> > > > of
>> > > > >> > > leeway for the author is desirable depending on what level of
>> > user
>> > > > >> > > experience they want to provide, as long as it is not a
>> burden
>> > for
>> > > > >> adding
>> > > > >> > > new tests.
>> > > > >> > >
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <
>> > j...@nanthrax.net>
>> > > > >> wrote:
>> > > > >> > >
>> > > > >> > > > +1
>> > > > >> > > >
>> > > > >> > > > Testing is always hard, especially to have concrete tests.
>> > > > Reducing
>> > > > >> the
>> > > > >> > > > "noise" is a good idea.
>> > > > >> > > >
>> > > > >> > > > Regards
>> > > > >> > > > JB
>> > > > >> > > >
>> > > > >> > > >
>> > > > >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
>> > > > >> > > >
>> > > > >> > > >> Hello,
>> > > > >> > > >>
>> > > > >> > > >> I've seen a pattern in a couple of different transforms
>> (IOs)
>> > > > where
>> > > > >> > we,
>> > > > >> > > I
>> > > > >> > > >> think, spend an excessive amount of code unit-testing the
>> > > trivial
>> > > > >> > > builder
>> > > > >> > > >> methods.
>> > > > >> > > >>
>> > > > >> > > >> E.g. a significant part of
>> > > > >> > > >>
>> https://github.com/apache/beam/blob/master/sdks/java/io/goog
>> > > > >> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
>> > > > >> > > >> gcp/datastore/DatastoreV1Test.java
>> > > > >> > > >> is
>> > > > >> > > >> devoted to that, and a significant part of the in-progress
>> > > Hadoop
>> > > > >> > > >> InputFormat IO.
>> > > > >> > > >>
>> > > > >> > > >> In particular, I mean unit-testing trivial builder
>> > > functionality
>> > > > >> such
>> > > > >> > > as:
>> > > > >> > > >> - That setting a parameter actually sets it
>> > > > >> > > >> - That setting parameters in one order gives the same
>> effect
>> > > as a
>> > > > >> > > >> different
>> > > > >> > > >> order
>> > > > >> > > >> - That a null value of a parameter is rejected with an
>> > > > appropriate
>> > > > >> > error
>> > > > >> > > >> message
>> > > > >> > > >> - That a non-null value of a parameter is accepted
>> > > > >> > > >>
>> > > > >> > > >> I'd like to come to a consensus as to how much
>> unit-testing
>> > of
>> > > > such
>> > > > >> > > stuff
>> > > > >> > > >> is appropriate when developing a new transform. And then
>> put
>> > > this
>> > > > >> > > >> consensus
>> > > > >> > > >> into the PTransform Style Guide
>> > > > >> > > >> <
>> https://beam.apache.org/contribute/ptransform-style-guide/
>> > >.
>> > > > >> > > >>
>> > > > >> > > >> I think such tests are not worth their weight in lines of
>> > code,
>> > > > for
>> > > > >> a
>> > > > >> > > few
>> > > > >> > > >> reasons:
>> > > > >> > > >> - Whether or not a parameter actually takes effect,
>> should be
>> > > > tested
>> > > > >> > > using
>> > > > >> > > >> a semantic test (RunnableOnService). Testing whether a
>> getter
>> > > > >> returns
>> > > > >> > > the
>> > > > >> > > >> same thing a setter set is neither necessary (already
>> covered
>> > > by
>> > > > a
>> > > > >> > > >> semantic
>> > > > >> > > >> test) nor sufficient (it's possible that the expansion of
>> the
>> > > > >> > transform
>> > > > >> > > >> forgets to even call the getter).
>> > > > >> > > >> - Testing "a non-null value (or an otherwise valid value)
>> is
>> > > > >> accepted"
>> > > > >> > > is
>> > > > >> > > >> redundant with every semantic test that supplies the
>> value.
>> > > > >> > > >> - For testing of supplying a different order of
>> parameters:
>> > my
>> > > > main
>> > > > >> > > >> objections are 1) I'm not sure what kind of
>> reasonably-likely
>> > > > coding
>> > > > >> > > error
>> > > > >> > > >> this guards against, especially given most multi-parameter
>> > > > >> transforms
>> > > > >> > > use
>> > > > >> > > >> AutoValue anyway, and given that "does a setter take
>> effect"
>> > is
>> > > > >> > already
>> > > > >> > > >> tested via bullets above 2) given such a coding error
>> exists,
>> > > > >> there's
>> > > > >> > no
>> > > > >> > > >> way to tell which orders  or how many you should test;
>> > > > unit-testing
>> > > > >> > two
>> > > > >> > > or
>> > > > >> > > >> three orders probably gives next to no protection, and
>> > > > unit-testing
>> > > > >> > more
>> > > > >> > > >> than that is impractical.
>> > > > >> > > >> - Testing "a null value is rejected" effectively
>> unit-tests a
>> > > > single
>> > > > >> > > line
>> > > > >> > > >> that says checkNotNull(...). I don't think it's worth 10
>> or
>> > so
>> > > > lines
>> > > > >> > of
>> > > > >> > > >> code to test something this trivial: even if an author
>> forgot
>> > > to
>> > > > add
>> > > > >> > > this
>> > > > >> > > >> line, and then a used supplied null - the user will just
>> get
>> > a
>> > > > less
>> > > > >> > > >> informative error message, possibly report it as a bug,
>> and
>> > the
>> > > > >> error
>> > > > >> > > will
>> > > > >> > > >> easily get fixed. I.e. it's a very low-risk thing to get
>> > wrong.
>> > > > >> > > >>
>> > > > >> > > >> I think the following similar types of tests are still
>> worth
>> > > > doing
>> > > > >> > > though:
>> > > > >> > > >> - Non-trivial parameter validation logic: where the
>> > validation
>> > > > rule
>> > > > >> is
>> > > > >> > > >> complex, or where the error message is interesting, etc.
>> > > > >> > > >> - Cross-parameter invariants which, if violated, lead to
>> data
>> > > > loss:
>> > > > >> > e.g.
>> > > > >> > > >> that if parameters foo and bar are mutually exclusive,
>> verify
>> > > > that
>> > > > >> if
>> > > > >> > > the
>> > > > >> > > >> user specifies both, they get an error, instead of one of
>> > them
>> > > > being
>> > > > >> > > >> silently ignored.
>> > > > >> > > >>
>> > > > >> > > >> Thoughts?
>> > > > >> > > >>
>> > > > >> > > >>
>> > > > >> > > > --
>> > > > >> > > > Jean-Baptiste Onofré
>> > > > >> > > > jbono...@apache.org
>> > > > >> > > > http://blog.nanthrax.net
>> > > > >> > > > Talend - http://www.talend.com
>> > > > >> > > >
>> > > > >> > >
>> > > > >> >
>> > > > >>
>> > > >
>> > >
>> >
>>
>

Re: Style: how much testing for transform builder classes?

Reply via email to