Re: Style: how much testing for transform builder classes?

Ismaël Mejía Wed, 15 Mar 2017 02:11:51 -0700

+1 to Vikas point maybe the right place to enforce things correct
build tests is in the validate and like this reduce the test
boilerplate and only test the validate, but I wonder if this totally
covers both cases (the buildsCorrectly and
buildsCorrectlyInDifferentOrder ones).


I answer Eugene’s question here even if you are aware now since you
commented in the PR, so everyone understands the case.

The case is pretty simple, when you extend an IO and add a new
configuration parameter, suppose we have withFoo(String foo) and we
want to add withBar(String bar). In some cases the implementation or
even worse the combination of those are not built correctly, so the
only way to guarantee that this works is to have code that tests the
complete parameter combination or tests that at least assert that the
object is built correctly.

This is something that can happen both with or without AutoValue
because the with method is hand-written and the natural tendency with
boilerplate methods like this is to copy/paste, so we can end up doing
silly things like:

    private Read(String foo, String bar) { … }

    public Read withBar(String bar) {
      return new Read(foo, null);
    }

in this case the reference to bar is not stored or assigned (this is
similar to the case of the DatastoreIO PR), and AutoValue may seem to
solve this issue but you can also end up with this situation if you
copy paste the withFoo method and just change the method name:

    public Read withBar(String foo) {
      return builder().setFoo(foo).build();
    }

Of course both seem silly but both can happen and the tests at least
help to discover those, if Vikas proposition covers the
testsBuildCorrectly and testsBuildCorrectlyInDifferentOrder kind of
tests I think it is OK to get rid of those.

On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <[email protected]> wrote:
> Yes, what I meant is: Necessary tests are ones that blocks users if not
> present. Trivial or non-trivial shouldn't be the issue in such cases.
>
> Some of the boilerplate code and tests is because IO PTransforms are
> returned to the user before they are fully constructed and actual
> validation happens in the validate method rather than at construction. I
> understand that the reasoning here is that we want to support constructing
> them with options in any order and using Builder pattern can be confusing.
>
> If validate method is where all the validation happens, then we should able
> to eliminate some redundant checks and tests during construction time like
> in *withOption* methods here
> <https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L199>
>  and here
> <https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java#L387>
> as
> these are also checked in the validate method.
>
>
>
>
>
>
>
>
>
>
>
> -Vikas
>
>
>
> On 14 March 2017 at 15:40, Eugene Kirpichov <[email protected]>
> wrote:
>
>> Thanks all. Looks like people are on board with the general direction
>> though it remains to refine it to concrete guidelines to go into style
>> guide.
>>
>> Ismaël, can you give more details about the situation you described in the
>> first paragraph? Is it perhaps that really a RunnableOnService test was
>> missing (and perhaps still is), rather than a builder test?
>>
>> Vikas, regarding trivial tests and user waiting for a work-around: in the
>> situation I described, they don't really need a workaround - they specified
>> an invalid value and have been minorly inconvenienced because the error
>> they got about it was not very readable, so fixing their value took them a
>> little longer than it could have, but they fixed it and their work is not
>> blocked. I think Robert's arguments about the cost of trivial tests apply.
>>
>> I agree that the author should be at liberty to choose which validation to
>> unit-test and which to skip as trivial, so documentation on this topic
>> should be in the form of guidelines, high-quality example code (i.e. clean
>> up the unit tests of IOs bundled with Beam SDK), and informal knowledge in
>> the heads of readers of this thread, rather than hard rules.
>>
>> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <[email protected]> wrote:
>>
>> > +0.5
>> >
>> > I used to think that some of those tests were not worth, for example
>> > testBuildRead and
>> > testBuildReadAlt. However the reality is that these tests allowed me to
>> > find bugs both during the development of HBaseIO and just yesterday when
>> I
>> > tried to test the write support for the emulator with DataStoreIO (that
>> > lacked a parameter in testBuildWrite and didn’t have a testBuildWriteAlt
>> > and broke in that case too), so I now believe they are not necessarily
>> > useless.
>> >
>> > I agree with the idea of trying to test the most important things first
>> and
>> > as Kenneth said trying to reduce the tests of successful validation,
>> > however I am not sure how this can be formalized in the style guide
>> > considering that the value of the tests seems to be based on the judgment
>> > of both the developer and the reviewer.
>> >
>> > One final aspect if we achieve consensus is that we should not forget
>> that
>> > if we reduce some of this tests we have to pay attention to not reduce
>> the
>> > current level of coverage.
>> >
>> > Regards,
>> > Ismaël
>> >
>> >
>> >
>> >
>> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <[email protected]> wrote:
>> >
>> > > +0.5
>> > >
>> > > My two cents,
>> > >
>> > >   * However trivial the test is it should be added unless user has a
>> easy
>> > > workaround to not having to wait for a few days until the trivial fixes
>> > are
>> > > merged to beam and then propagated to the runner.
>> > >
>> > >   * While I agree with trivial tests like "ensuring meaningful error
>> > > message thrown for null options" or "ordering of parameters" not being
>> > > strictly necessary and definitely not something that should be used as
>> > > reference for other IOs, (staying consistent with other IOs is probably
>> > the
>> > > reason why DatastoreIO has some of them), I still think some degree of
>> > > leeway for the author is desirable depending on what level of user
>> > > experience they want to provide, as long as it is not a burden for
>> adding
>> > > new tests.
>> > >
>> > >
>> > >
>> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <[email protected]>
>> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > Testing is always hard, especially to have concrete tests. Reducing
>> the
>> > > > "noise" is a good idea.
>> > > >
>> > > > Regards
>> > > > JB
>> > > >
>> > > >
>> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
>> > > >
>> > > >> Hello,
>> > > >>
>> > > >> I've seen a pattern in a couple of different transforms (IOs) where
>> > we,
>> > > I
>> > > >> think, spend an excessive amount of code unit-testing the trivial
>> > > builder
>> > > >> methods.
>> > > >>
>> > > >> E.g. a significant part of
>> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog
>> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
>> > > >> gcp/datastore/DatastoreV1Test.java
>> > > >> is
>> > > >> devoted to that, and a significant part of the in-progress Hadoop
>> > > >> InputFormat IO.
>> > > >>
>> > > >> In particular, I mean unit-testing trivial builder functionality
>> such
>> > > as:
>> > > >> - That setting a parameter actually sets it
>> > > >> - That setting parameters in one order gives the same effect as a
>> > > >> different
>> > > >> order
>> > > >> - That a null value of a parameter is rejected with an appropriate
>> > error
>> > > >> message
>> > > >> - That a non-null value of a parameter is accepted
>> > > >>
>> > > >> I'd like to come to a consensus as to how much unit-testing of such
>> > > stuff
>> > > >> is appropriate when developing a new transform. And then put this
>> > > >> consensus
>> > > >> into the PTransform Style Guide
>> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/>.
>> > > >>
>> > > >> I think such tests are not worth their weight in lines of code, for
>> a
>> > > few
>> > > >> reasons:
>> > > >> - Whether or not a parameter actually takes effect, should be tested
>> > > using
>> > > >> a semantic test (RunnableOnService). Testing whether a getter
>> returns
>> > > the
>> > > >> same thing a setter set is neither necessary (already covered by a
>> > > >> semantic
>> > > >> test) nor sufficient (it's possible that the expansion of the
>> > transform
>> > > >> forgets to even call the getter).
>> > > >> - Testing "a non-null value (or an otherwise valid value) is
>> accepted"
>> > > is
>> > > >> redundant with every semantic test that supplies the value.
>> > > >> - For testing of supplying a different order of parameters: my main
>> > > >> objections are 1) I'm not sure what kind of reasonably-likely coding
>> > > error
>> > > >> this guards against, especially given most multi-parameter
>> transforms
>> > > use
>> > > >> AutoValue anyway, and given that "does a setter take effect" is
>> > already
>> > > >> tested via bullets above 2) given such a coding error exists,
>> there's
>> > no
>> > > >> way to tell which orders  or how many you should test; unit-testing
>> > two
>> > > or
>> > > >> three orders probably gives next to no protection, and unit-testing
>> > more
>> > > >> than that is impractical.
>> > > >> - Testing "a null value is rejected" effectively unit-tests a single
>> > > line
>> > > >> that says checkNotNull(...). I don't think it's worth 10 or so lines
>> > of
>> > > >> code to test something this trivial: even if an author forgot to add
>> > > this
>> > > >> line, and then a used supplied null - the user will just get a less
>> > > >> informative error message, possibly report it as a bug, and the
>> error
>> > > will
>> > > >> easily get fixed. I.e. it's a very low-risk thing to get wrong.
>> > > >>
>> > > >> I think the following similar types of tests are still worth doing
>> > > though:
>> > > >> - Non-trivial parameter validation logic: where the validation rule
>> is
>> > > >> complex, or where the error message is interesting, etc.
>> > > >> - Cross-parameter invariants which, if violated, lead to data loss:
>> > e.g.
>> > > >> that if parameters foo and bar are mutually exclusive, verify that
>> if
>> > > the
>> > > >> user specifies both, they get an error, instead of one of them being
>> > > >> silently ignored.
>> > > >>
>> > > >> Thoughts?
>> > > >>
>> > > >>
>> > > > --
>> > > > Jean-Baptiste Onofré
>> > > > [email protected]
>> > > > http://blog.nanthrax.net
>> > > > Talend - http://www.talend.com
>> > > >
>> > >
>> >
>>

Re: Style: how much testing for transform builder classes?

Reply via email to