Re: Style: how much testing for transform builder classes?

Robert Bradshaw Tue, 21 Mar 2017 13:19:59 -0700

On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía <ieme...@gmail.com> wrote:


> +1 to Vikas point maybe the right place to enforce things correct
> build tests is in the validate and like this reduce the test
> boilerplate and only test the validate, but I wonder if this totally
> covers both cases (the buildsCorrectly and
> buildsCorrectlyInDifferentOrder ones).
>
> I answer Eugene’s question here even if you are aware now since you
> commented in the PR, so everyone understands the case.
>
> The case is pretty simple, when you extend an IO and add a new
> configuration parameter, suppose we have withFoo(String foo) and we
> want to add withBar(String bar). In some cases the implementation or
> even worse the combination of those are not built correctly, so the
> only way to guarantee that this works is to have code that tests the
> complete parameter combination or tests that at least assert that the
> object is built correctly.
>
> This is something that can happen both with or without AutoValue
> because the with method is hand-written and the natural tendency with
> boilerplate methods like this is to copy/paste, so we can end up doing
> silly things like:
>
>     private Read(String foo, String bar) { … }
>
>     public Read withBar(String bar) {
>       return new Read(foo, null);
>     }
>
> in this case the reference to bar is not stored or assigned (this is
> similar to the case of the DatastoreIO PR), and AutoValue may seem to
> solve this issue but you can also end up with this situation if you
> copy paste the withFoo method and just change the method name:
>
>     public Read withBar(String foo) {
>       return builder().setFoo(foo).build();
>     }
>
> Of course both seem silly but both can happen and the tests at least
> help to discover those,
>

Such mistakes should be entirely discovered by tests of feature Bar. If Bar
is not actually being tested, that's a bigger problem with coverage that a
construction-only test actually obscures (giving it negative value).


>
> On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <vikky...@gmail.com> wrote:
> > Yes, what I meant is: Necessary tests are ones that blocks users if not
> > present. Trivial or non-trivial shouldn't be the issue in such cases.
> >
> > Some of the boilerplate code and tests is because IO PTransforms are
> > returned to the user before they are fully constructed and actual
> > validation happens in the validate method rather than at construction. I
> > understand that the reasoning here is that we want to support
> constructing
> > them with options in any order and using Builder pattern can be
> confusing.
> >
> > If validate method is where all the validation happens, then we should
> able
> > to eliminate some redundant checks and tests during construction time
> like
> > in *withOption* methods here
> > <https://github.com/apache/beam/blob/master/sdks/java/io/
> google-cloud-platform/src/main/java/org/apache/beam/sdk/
> io/gcp/bigtable/BigtableIO.java#L199>
> >  and here
> > <https://github.com/apache/beam/blob/master/sdks/java/io/
> google-cloud-platform/src/main/java/org/apache/beam/sdk/
> io/gcp/datastore/DatastoreV1.java#L387>
> > as
> > these are also checked in the validate method.
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > -Vikas
> >
> >
> >
> > On 14 March 2017 at 15:40, Eugene Kirpichov <kirpic...@google.com.invalid
> >
> > wrote:
> >
> >> Thanks all. Looks like people are on board with the general direction
> >> though it remains to refine it to concrete guidelines to go into style
> >> guide.
> >>
> >> Ismaël, can you give more details about the situation you described in
> the
> >> first paragraph? Is it perhaps that really a RunnableOnService test was
> >> missing (and perhaps still is), rather than a builder test?
> >>
> >> Vikas, regarding trivial tests and user waiting for a work-around: in
> the
> >> situation I described, they don't really need a workaround - they
> specified
> >> an invalid value and have been minorly inconvenienced because the error
> >> they got about it was not very readable, so fixing their value took
> them a
> >> little longer than it could have, but they fixed it and their work is
> not
> >> blocked. I think Robert's arguments about the cost of trivial tests
> apply.
> >>
> >> I agree that the author should be at liberty to choose which validation
> to
> >> unit-test and which to skip as trivial, so documentation on this topic
> >> should be in the form of guidelines, high-quality example code (i.e.
> clean
> >> up the unit tests of IOs bundled with Beam SDK), and informal knowledge
> in
> >> the heads of readers of this thread, rather than hard rules.
> >>
> >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <ieme...@gmail.com> wrote:
> >>
> >> > +0.5
> >> >
> >> > I used to think that some of those tests were not worth, for example
> >> > testBuildRead and
> >> > testBuildReadAlt. However the reality is that these tests allowed me
> to
> >> > find bugs both during the development of HBaseIO and just yesterday
> when
> >> I
> >> > tried to test the write support for the emulator with DataStoreIO
> (that
> >> > lacked a parameter in testBuildWrite and didn’t have a
> testBuildWriteAlt
> >> > and broke in that case too), so I now believe they are not necessarily
> >> > useless.
> >> >
> >> > I agree with the idea of trying to test the most important things
> first
> >> and
> >> > as Kenneth said trying to reduce the tests of successful validation,
> >> > however I am not sure how this can be formalized in the style guide
> >> > considering that the value of the tests seems to be based on the
> judgment
> >> > of both the developer and the reviewer.
> >> >
> >> > One final aspect if we achieve consensus is that we should not forget
> >> that
> >> > if we reduce some of this tests we have to pay attention to not reduce
> >> the
> >> > current level of coverage.
> >> >
> >> > Regards,
> >> > Ismaël
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <vikky...@gmail.com> wrote:
> >> >
> >> > > +0.5
> >> > >
> >> > > My two cents,
> >> > >
> >> > >   * However trivial the test is it should be added unless user has a
> >> easy
> >> > > workaround to not having to wait for a few days until the trivial
> fixes
> >> > are
> >> > > merged to beam and then propagated to the runner.
> >> > >
> >> > >   * While I agree with trivial tests like "ensuring meaningful error
> >> > > message thrown for null options" or "ordering of parameters" not
> being
> >> > > strictly necessary and definitely not something that should be used
> as
> >> > > reference for other IOs, (staying consistent with other IOs is
> probably
> >> > the
> >> > > reason why DatastoreIO has some of them), I still think some degree
> of
> >> > > leeway for the author is desirable depending on what level of user
> >> > > experience they want to provide, as long as it is not a burden for
> >> adding
> >> > > new tests.
> >> > >
> >> > >
> >> > >
> >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <j...@nanthrax.net>
> >> wrote:
> >> > >
> >> > > > +1
> >> > > >
> >> > > > Testing is always hard, especially to have concrete tests.
> Reducing
> >> the
> >> > > > "noise" is a good idea.
> >> > > >
> >> > > > Regards
> >> > > > JB
> >> > > >
> >> > > >
> >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
> >> > > >
> >> > > >> Hello,
> >> > > >>
> >> > > >> I've seen a pattern in a couple of different transforms (IOs)
> where
> >> > we,
> >> > > I
> >> > > >> think, spend an excessive amount of code unit-testing the trivial
> >> > > builder
> >> > > >> methods.
> >> > > >>
> >> > > >> E.g. a significant part of
> >> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog
> >> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
> >> > > >> gcp/datastore/DatastoreV1Test.java
> >> > > >> is
> >> > > >> devoted to that, and a significant part of the in-progress Hadoop
> >> > > >> InputFormat IO.
> >> > > >>
> >> > > >> In particular, I mean unit-testing trivial builder functionality
> >> such
> >> > > as:
> >> > > >> - That setting a parameter actually sets it
> >> > > >> - That setting parameters in one order gives the same effect as a
> >> > > >> different
> >> > > >> order
> >> > > >> - That a null value of a parameter is rejected with an
> appropriate
> >> > error
> >> > > >> message
> >> > > >> - That a non-null value of a parameter is accepted
> >> > > >>
> >> > > >> I'd like to come to a consensus as to how much unit-testing of
> such
> >> > > stuff
> >> > > >> is appropriate when developing a new transform. And then put this
> >> > > >> consensus
> >> > > >> into the PTransform Style Guide
> >> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/>.
> >> > > >>
> >> > > >> I think such tests are not worth their weight in lines of code,
> for
> >> a
> >> > > few
> >> > > >> reasons:
> >> > > >> - Whether or not a parameter actually takes effect, should be
> tested
> >> > > using
> >> > > >> a semantic test (RunnableOnService). Testing whether a getter
> >> returns
> >> > > the
> >> > > >> same thing a setter set is neither necessary (already covered by
> a
> >> > > >> semantic
> >> > > >> test) nor sufficient (it's possible that the expansion of the
> >> > transform
> >> > > >> forgets to even call the getter).
> >> > > >> - Testing "a non-null value (or an otherwise valid value) is
> >> accepted"
> >> > > is
> >> > > >> redundant with every semantic test that supplies the value.
> >> > > >> - For testing of supplying a different order of parameters: my
> main
> >> > > >> objections are 1) I'm not sure what kind of reasonably-likely
> coding
> >> > > error
> >> > > >> this guards against, especially given most multi-parameter
> >> transforms
> >> > > use
> >> > > >> AutoValue anyway, and given that "does a setter take effect" is
> >> > already
> >> > > >> tested via bullets above 2) given such a coding error exists,
> >> there's
> >> > no
> >> > > >> way to tell which orders  or how many you should test;
> unit-testing
> >> > two
> >> > > or
> >> > > >> three orders probably gives next to no protection, and
> unit-testing
> >> > more
> >> > > >> than that is impractical.
> >> > > >> - Testing "a null value is rejected" effectively unit-tests a
> single
> >> > > line
> >> > > >> that says checkNotNull(...). I don't think it's worth 10 or so
> lines
> >> > of
> >> > > >> code to test something this trivial: even if an author forgot to
> add
> >> > > this
> >> > > >> line, and then a used supplied null - the user will just get a
> less
> >> > > >> informative error message, possibly report it as a bug, and the
> >> error
> >> > > will
> >> > > >> easily get fixed. I.e. it's a very low-risk thing to get wrong.
> >> > > >>
> >> > > >> I think the following similar types of tests are still worth
> doing
> >> > > though:
> >> > > >> - Non-trivial parameter validation logic: where the validation
> rule
> >> is
> >> > > >> complex, or where the error message is interesting, etc.
> >> > > >> - Cross-parameter invariants which, if violated, lead to data
> loss:
> >> > e.g.
> >> > > >> that if parameters foo and bar are mutually exclusive, verify
> that
> >> if
> >> > > the
> >> > > >> user specifies both, they get an error, instead of one of them
> being
> >> > > >> silently ignored.
> >> > > >>
> >> > > >> Thoughts?
> >> > > >>
> >> > > >>
> >> > > > --
> >> > > > Jean-Baptiste Onofré
> >> > > > jbono...@apache.org
> >> > > > http://blog.nanthrax.net
> >> > > > Talend - http://www.talend.com
> >> > > >
> >> > >
> >> >
> >>
>

Re: Style: how much testing for transform builder classes?

Reply via email to