Yes, what I meant is: Necessary tests are ones that blocks users if not
present. Trivial or non-trivial shouldn't be the issue in such cases.

Some of the boilerplate code and tests is because IO PTransforms are
returned to the user before they are fully constructed and actual
validation happens in the validate method rather than at construction. I
understand that the reasoning here is that we want to support constructing
them with options in any order and using Builder pattern can be confusing.

If validate method is where all the validation happens, then we should able
to eliminate some redundant checks and tests during construction time like
in *withOption* methods here
<https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java#L199>
 and here
<https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore/DatastoreV1.java#L387>
as
these are also checked in the validate method.











-Vikas



On 14 March 2017 at 15:40, Eugene Kirpichov <[email protected]>
wrote:

> Thanks all. Looks like people are on board with the general direction
> though it remains to refine it to concrete guidelines to go into style
> guide.
>
> Ismaël, can you give more details about the situation you described in the
> first paragraph? Is it perhaps that really a RunnableOnService test was
> missing (and perhaps still is), rather than a builder test?
>
> Vikas, regarding trivial tests and user waiting for a work-around: in the
> situation I described, they don't really need a workaround - they specified
> an invalid value and have been minorly inconvenienced because the error
> they got about it was not very readable, so fixing their value took them a
> little longer than it could have, but they fixed it and their work is not
> blocked. I think Robert's arguments about the cost of trivial tests apply.
>
> I agree that the author should be at liberty to choose which validation to
> unit-test and which to skip as trivial, so documentation on this topic
> should be in the form of guidelines, high-quality example code (i.e. clean
> up the unit tests of IOs bundled with Beam SDK), and informal knowledge in
> the heads of readers of this thread, rather than hard rules.
>
> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <[email protected]> wrote:
>
> > ​+0.5
> >
> > I used to think that some of those tests were not worth, for example
> > testBuildRead and
> > testBuildReadAlt. However the reality is that these tests allowed me to
> > find bugs both during the development of HBaseIO and just yesterday when
> I
> > tried to test the write support for the emulator with DataStoreIO (that
> > lacked a parameter in testBuildWrite and didn’t have a testBuildWriteAlt
> > and broke in that case too), so I now believe they are not necessarily
> > useless.
> >
> > I agree with the idea of trying to test the most important things first
> and
> > as Kenneth said trying to reduce the tests of successful validation,
> > however I am not sure how this can be formalized in the style guide
> > considering that the value of the tests seems to be based on the judgment
> > of both the developer and the reviewer.
> >
> > One final aspect if we achieve consensus is that we should not forget
> that
> > if we reduce some of this tests we have to pay attention to not reduce
> the
> > current level of coverage.
> >
> > Regards,
> > Ismaël
> >
> > ​
> >
> >
> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <[email protected]> wrote:
> >
> > > +0.5
> > >
> > > My two cents,
> > >
> > >   * However trivial the test is it should be added unless user has a
> easy
> > > workaround to not having to wait for a few days until the trivial fixes
> > are
> > > merged to beam and then propagated to the runner.
> > >
> > >   * While I agree with trivial tests like "ensuring meaningful error
> > > message thrown for null options" or "ordering of parameters" not being
> > > strictly necessary and definitely not something that should be used as
> > > reference for other IOs, (staying consistent with other IOs is probably
> > the
> > > reason why DatastoreIO has some of them), I still think some degree of
> > > leeway for the author is desirable depending on what level of user
> > > experience they want to provide, as long as it is not a burden for
> adding
> > > new tests.
> > >
> > >
> > >
> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <[email protected]>
> wrote:
> > >
> > > > +1
> > > >
> > > > Testing is always hard, especially to have concrete tests. Reducing
> the
> > > > "noise" is a good idea.
> > > >
> > > > Regards
> > > > JB
> > > >
> > > >
> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >> I've seen a pattern in a couple of different transforms (IOs) where
> > we,
> > > I
> > > >> think, spend an excessive amount of code unit-testing the trivial
> > > builder
> > > >> methods.
> > > >>
> > > >> E.g. a significant part of
> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog
> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/
> > > >> gcp/datastore/DatastoreV1Test.java
> > > >> is
> > > >> devoted to that, and a significant part of the in-progress Hadoop
> > > >> InputFormat IO.
> > > >>
> > > >> In particular, I mean unit-testing trivial builder functionality
> such
> > > as:
> > > >> - That setting a parameter actually sets it
> > > >> - That setting parameters in one order gives the same effect as a
> > > >> different
> > > >> order
> > > >> - That a null value of a parameter is rejected with an appropriate
> > error
> > > >> message
> > > >> - That a non-null value of a parameter is accepted
> > > >>
> > > >> I'd like to come to a consensus as to how much unit-testing of such
> > > stuff
> > > >> is appropriate when developing a new transform. And then put this
> > > >> consensus
> > > >> into the PTransform Style Guide
> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/>.
> > > >>
> > > >> I think such tests are not worth their weight in lines of code, for
> a
> > > few
> > > >> reasons:
> > > >> - Whether or not a parameter actually takes effect, should be tested
> > > using
> > > >> a semantic test (RunnableOnService). Testing whether a getter
> returns
> > > the
> > > >> same thing a setter set is neither necessary (already covered by a
> > > >> semantic
> > > >> test) nor sufficient (it's possible that the expansion of the
> > transform
> > > >> forgets to even call the getter).
> > > >> - Testing "a non-null value (or an otherwise valid value) is
> accepted"
> > > is
> > > >> redundant with every semantic test that supplies the value.
> > > >> - For testing of supplying a different order of parameters: my main
> > > >> objections are 1) I'm not sure what kind of reasonably-likely coding
> > > error
> > > >> this guards against, especially given most multi-parameter
> transforms
> > > use
> > > >> AutoValue anyway, and given that "does a setter take effect" is
> > already
> > > >> tested via bullets above 2) given such a coding error exists,
> there's
> > no
> > > >> way to tell which orders  or how many you should test; unit-testing
> > two
> > > or
> > > >> three orders probably gives next to no protection, and unit-testing
> > more
> > > >> than that is impractical.
> > > >> - Testing "a null value is rejected" effectively unit-tests a single
> > > line
> > > >> that says checkNotNull(...). I don't think it's worth 10 or so lines
> > of
> > > >> code to test something this trivial: even if an author forgot to add
> > > this
> > > >> line, and then a used supplied null - the user will just get a less
> > > >> informative error message, possibly report it as a bug, and the
> error
> > > will
> > > >> easily get fixed. I.e. it's a very low-risk thing to get wrong.
> > > >>
> > > >> I think the following similar types of tests are still worth doing
> > > though:
> > > >> - Non-trivial parameter validation logic: where the validation rule
> is
> > > >> complex, or where the error message is interesting, etc.
> > > >> - Cross-parameter invariants which, if violated, lead to data loss:
> > e.g.
> > > >> that if parameters foo and bar are mutually exclusive, verify that
> if
> > > the
> > > >> user specifies both, they get an error, instead of one of them being
> > > >> silently ignored.
> > > >>
> > > >> Thoughts?
> > > >>
> > > >>
> > > > --
> > > > Jean-Baptiste Onofré
> > > > [email protected]
> > > > http://blog.nanthrax.net
> > > > Talend - http://www.talend.com
> > > >
> > >
> >
>

Reply via email to