Re: Style: how much testing for transform builder classes?

Robert Bradshaw Tue, 21 Mar 2017 17:22:21 -0700

On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin <[email protected]>
wrote:


> https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f
> 99e6a947ef
> is one example where tests of feature Bar exist but did not discover bugs
> that could be introduced by builders.
>

True, though one would need to test the full cross-product of builder
orderings to discover this, a simple test of setValidate() wouldn't suffice
either. (And if not the full cross product, what subset?) Maybe some
automated fuzz testing would be a fair thing to do here (cheaper and more
comprehensive than manual tests...)?

AutoValue like alleviates many, but not all, of these concerns - as Ismael
> points out.
>

If two features are not orthogonal, that perhaps merits more test (and
documentation).


>
>
>
> On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw <
> [email protected]> wrote:
>
> > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía <[email protected]> wrote:
> >
> > > +1 to Vikas point maybe the right place to enforce things correct
> > > build tests is in the validate and like this reduce the test
> > > boilerplate and only test the validate, but I wonder if this totally
> > > covers both cases (the buildsCorrectly and
> > > buildsCorrectlyInDifferentOrder ones).
> > >
> > > I answer Eugene’s question here even if you are aware now since you
> > > commented in the PR, so everyone understands the case.
> > >
> > > The case is pretty simple, when you extend an IO and add a new
> > > configuration parameter, suppose we have withFoo(String foo) and we
> > > want to add withBar(String bar). In some cases the implementation or
> > > even worse the combination of those are not built correctly, so the
> > > only way to guarantee that this works is to have code that tests the
> > > complete parameter combination or tests that at least assert that the
> > > object is built correctly.
> > >
> > > This is something that can happen both with or without AutoValue
> > > because the with method is hand-written and the natural tendency with
> > > boilerplate methods like this is to copy/paste, so we can end up doing
> > > silly things like:
> > >
> > >     private Read(String foo, String bar) { … }
> > >
> > >     public Read withBar(String bar) {
> > >       return new Read(foo, null);
> > >     }
> > >
> > > in this case the reference to bar is not stored or assigned (this is
> > > similar to the case of the DatastoreIO PR), and AutoValue may seem to
> > > solve this issue but you can also end up with this situation if you
> > > copy paste the withFoo method and just change the method name:
> > >
> > >     public Read withBar(String foo) {
> > >       return builder().setFoo(foo).build();
> > >     }
> > >
> > > Of course both seem silly but both can happen and the tests at least
> > > help to discover those,
> > >
> >
> > Such mistakes should be entirely discovered by tests of feature Bar. If
> Bar
> > is not actually being tested, that's a bigger problem with coverage that
> a
> > construction-only test actually obscures (giving it negative value).
> >
> >
> > >
> > > On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <[email protected]> wrote:
> > > > Yes, what I meant is: Necessary tests are ones that blocks users if
> not
> > > > present. Trivial or non-trivial shouldn't be the issue in such cases.
> > > >
> > > > Some of the boilerplate code and tests is because IO PTransforms are
> > > > returned to the user before they are fully constructed and actual
> > > > validation happens in the validate method rather than at
> construction.
> > I
> > > > understand that the reasoning here is that we want to support
> > > constructing
> > > > them with options in any order and using Builder pattern can be
> > > confusing.
> > > >
> > > > If validate method is where all the validation happens, then we
> should
> > > able
> > > > to eliminate some redundant checks and tests during construction time
> > > like
> > > > in *withOption* methods here
> > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
> > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
> > > io/gcp/bigtable/BigtableIO.java#L199>
> > > >  and here
> > > > <https://github.com/apache/beam/blob/master/sdks/java/io/
> > > google-cloud-platform/src/main/java/org/apache/beam/sdk/
> > > io/gcp/datastore/DatastoreV1.java#L387>
> > > > as
> > > > these are also checked in the validate method.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > -Vikas
> > > >
> > > >
> > > >
> > > > On 14 March 2017 at 15:40, Eugene Kirpichov
> > <[email protected]
> > > >
> > > > wrote:
> > > >
> > > >> Thanks all. Looks like people are on board with the general
> direction
> > > >> though it remains to refine it to concrete guidelines to go into
> style
> > > >> guide.
> > > >>
> > > >> Ismaël, can you give more details about the situation you described
> in
> > > the
> > > >> first paragraph? Is it perhaps that really a RunnableOnService test
> > was
> > > >> missing (and perhaps still is), rather than a builder test?
> > > >>
> > > >> Vikas, regarding trivial tests and user waiting for a work-around:
> in
> > > the
> > > >> situation I described, they don't really need a workaround - they
> > > specified
> > > >> an invalid value and have been minorly inconvenienced because the
> > error
> > > >> they got about it was not very readable, so fixing their value took
> > > them a
> > > >> little longer than it could have, but they fixed it and their work
> is
> > > not
> > > >> blocked. I think Robert's arguments about the cost of trivial tests
> > > apply.
> > > >>
> > > >> I agree that the author should be at liberty to choose which
> > validation
> > > to
> > > >> unit-test and which to skip as trivial, so documentation on this
> topic
> > > >> should be in the form of guidelines, high-quality example code (i.e.
> > > clean
> > > >> up the unit tests of IOs bundled with Beam SDK), and informal
> > knowledge
> > > in
> > > >> the heads of readers of this thread, rather than hard rules.
> > > >>
> > > >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <[email protected]>
> > wrote:
> > > >>
> > > >> > +0.5
> > > >> >
> > > >> > I used to think that some of those tests were not worth, for
> example
> > > >> > testBuildRead and
> > > >> > testBuildReadAlt. However the reality is that these tests allowed
> me
> > > to
> > > >> > find bugs both during the development of HBaseIO and just
> yesterday
> > > when
> > > >> I
> > > >> > tried to test the write support for the emulator with DataStoreIO
> > > (that
> > > >> > lacked a parameter in testBuildWrite and didn’t have a
> > > testBuildWriteAlt
> > > >> > and broke in that case too), so I now believe they are not
> > necessarily
> > > >> > useless.
> > > >> >
> > > >> > I agree with the idea of trying to test the most important things
> > > first
> > > >> and
> > > >> > as Kenneth said trying to reduce the tests of successful
> validation,
> > > >> > however I am not sure how this can be formalized in the style
> guide
> > > >> > considering that the value of the tests seems to be based on the
> > > judgment
> > > >> > of both the developer and the reviewer.
> > > >> >
> > > >> > One final aspect if we achieve consensus is that we should not
> > forget
> > > >> that
> > > >> > if we reduce some of this tests we have to pay attention to not
> > reduce
> > > >> the
> > > >> > current level of coverage.
> > > >> >
> > > >> > Regards,
> > > >> > Ismaël
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <[email protected]>
> > wrote:
> > > >> >
> > > >> > > +0.5
> > > >> > >
> > > >> > > My two cents,
> > > >> > >
> > > >> > >   * However trivial the test is it should be added unless user
> > has a
> > > >> easy
> > > >> > > workaround to not having to wait for a few days until the
> trivial
> > > fixes
> > > >> > are
> > > >> > > merged to beam and then propagated to the runner.
> > > >> > >
> > > >> > >   * While I agree with trivial tests like "ensuring meaningful
> > error
> > > >> > > message thrown for null options" or "ordering of parameters" not
> > > being
> > > >> > > strictly necessary and definitely not something that should be
> > used
> > > as
> > > >> > > reference for other IOs, (staying consistent with other IOs is
> > > probably
> > > >> > the
> > > >> > > reason why DatastoreIO has some of them), I still think some
> > degree
> > > of
> > > >> > > leeway for the author is desirable depending on what level of
> user
> > > >> > > experience they want to provide, as long as it is not a burden
> for
> > > >> adding
> > > >> > > new tests.
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <
> [email protected]>
> > > >> wrote:
> > > >> > >
> > > >> > > > +1
> > > >> > > >
> > > >> > > > Testing is always hard, especially to have concrete tests.
> > > Reducing
> > > >> the
> > > >> > > > "noise" is a good idea.
> > > >> > > >
> > > >> > > > Regards
> > > >> > > > JB
> > > >> > > >
> > > >> > > >
> > > >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
> > > >> > > >
> > > >> > > >> Hello,
> > > >> > > >>
> > > >> > > >> I've seen a pattern in a couple of different transforms (IOs)
> > > where
> > > >> > we,
> > > >> > > I
> > > >> > > >> think, spend an excessive amount of code unit-testing the
> > trivial
> > > >> > > builder
> > > >> > > >> methods.
> > > >> > > >>
> > > >> > > >> E.g. a significant part of
> > > >> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog
> > > >> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
> > > >> > > >> gcp/datastore/DatastoreV1Test.java
> > > >> > > >> is
> > > >> > > >> devoted to that, and a significant part of the in-progress
> > Hadoop
> > > >> > > >> InputFormat IO.
> > > >> > > >>
> > > >> > > >> In particular, I mean unit-testing trivial builder
> > functionality
> > > >> such
> > > >> > > as:
> > > >> > > >> - That setting a parameter actually sets it
> > > >> > > >> - That setting parameters in one order gives the same effect
> > as a
> > > >> > > >> different
> > > >> > > >> order
> > > >> > > >> - That a null value of a parameter is rejected with an
> > > appropriate
> > > >> > error
> > > >> > > >> message
> > > >> > > >> - That a non-null value of a parameter is accepted
> > > >> > > >>
> > > >> > > >> I'd like to come to a consensus as to how much unit-testing
> of
> > > such
> > > >> > > stuff
> > > >> > > >> is appropriate when developing a new transform. And then put
> > this
> > > >> > > >> consensus
> > > >> > > >> into the PTransform Style Guide
> > > >> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/
> >.
> > > >> > > >>
> > > >> > > >> I think such tests are not worth their weight in lines of
> code,
> > > for
> > > >> a
> > > >> > > few
> > > >> > > >> reasons:
> > > >> > > >> - Whether or not a parameter actually takes effect, should be
> > > tested
> > > >> > > using
> > > >> > > >> a semantic test (RunnableOnService). Testing whether a getter
> > > >> returns
> > > >> > > the
> > > >> > > >> same thing a setter set is neither necessary (already covered
> > by
> > > a
> > > >> > > >> semantic
> > > >> > > >> test) nor sufficient (it's possible that the expansion of the
> > > >> > transform
> > > >> > > >> forgets to even call the getter).
> > > >> > > >> - Testing "a non-null value (or an otherwise valid value) is
> > > >> accepted"
> > > >> > > is
> > > >> > > >> redundant with every semantic test that supplies the value.
> > > >> > > >> - For testing of supplying a different order of parameters:
> my
> > > main
> > > >> > > >> objections are 1) I'm not sure what kind of reasonably-likely
> > > coding
> > > >> > > error
> > > >> > > >> this guards against, especially given most multi-parameter
> > > >> transforms
> > > >> > > use
> > > >> > > >> AutoValue anyway, and given that "does a setter take effect"
> is
> > > >> > already
> > > >> > > >> tested via bullets above 2) given such a coding error exists,
> > > >> there's
> > > >> > no
> > > >> > > >> way to tell which orders  or how many you should test;
> > > unit-testing
> > > >> > two
> > > >> > > or
> > > >> > > >> three orders probably gives next to no protection, and
> > > unit-testing
> > > >> > more
> > > >> > > >> than that is impractical.
> > > >> > > >> - Testing "a null value is rejected" effectively unit-tests a
> > > single
> > > >> > > line
> > > >> > > >> that says checkNotNull(...). I don't think it's worth 10 or
> so
> > > lines
> > > >> > of
> > > >> > > >> code to test something this trivial: even if an author forgot
> > to
> > > add
> > > >> > > this
> > > >> > > >> line, and then a used supplied null - the user will just get
> a
> > > less
> > > >> > > >> informative error message, possibly report it as a bug, and
> the
> > > >> error
> > > >> > > will
> > > >> > > >> easily get fixed. I.e. it's a very low-risk thing to get
> wrong.
> > > >> > > >>
> > > >> > > >> I think the following similar types of tests are still worth
> > > doing
> > > >> > > though:
> > > >> > > >> - Non-trivial parameter validation logic: where the
> validation
> > > rule
> > > >> is
> > > >> > > >> complex, or where the error message is interesting, etc.
> > > >> > > >> - Cross-parameter invariants which, if violated, lead to data
> > > loss:
> > > >> > e.g.
> > > >> > > >> that if parameters foo and bar are mutually exclusive, verify
> > > that
> > > >> if
> > > >> > > the
> > > >> > > >> user specifies both, they get an error, instead of one of
> them
> > > being
> > > >> > > >> silently ignored.
> > > >> > > >>
> > > >> > > >> Thoughts?
> > > >> > > >>
> > > >> > > >>
> > > >> > > > --
> > > >> > > > Jean-Baptiste Onofré
> > > >> > > > [email protected]
> > > >> > > > http://blog.nanthrax.net
> > > >> > > > Talend - http://www.talend.com
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > >
> >
>

Re: Style: how much testing for transform builder classes?

Reply via email to