+0.5

Tests of trivial validation failures, if they check the error message, are
actually tests of effective communication in the error message (example:
testReadValidationFailsQueryLimitZero). These are on easily on par with
functionality tests in terms of importance for users.

But tests of successful validation I agree don't add value - if validation
succeeds, then what matters is whether execution is as expected. This
doesn't imply a reduction in the number of tests, but an extension of any
of this sort to actually execute and check results - if we care that
validation succeeds in some case, then we must also want to actually run
such a transform.

All that said, the weight of a test is IMO measured in burden when making
changes, not LOC, so I would reserve judgment until you try to make a
change and a non-useful test encumbers your work... and then I'd go ahead
and wipe it out. You'll need to get your code reviewer to agree, and that
is the standard for any change. Probably a good idea to check git blame and
check with the person who wrote it in case there was some hidden reason,
and if it isn't obvious add that to a comment.

Kenn

On Fri, Mar 10, 2017 at 7:09 AM, Eugene Kirpichov <
[email protected]> wrote:

> Hello,
>
> I've seen a pattern in a couple of different transforms (IOs) where we, I
> think, spend an excessive amount of code unit-testing the trivial builder
> methods.
>
> E.g. a significant part of
> https://github.com/apache/beam/blob/master/sdks/java/io/
> google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/datastore/
> DatastoreV1Test.java
> is
> devoted to that, and a significant part of the in-progress Hadoop
> InputFormat IO.
>
> In particular, I mean unit-testing trivial builder functionality such as:
> - That setting a parameter actually sets it
> - That setting parameters in one order gives the same effect as a different
> order
> - That a null value of a parameter is rejected with an appropriate error
> message
> - That a non-null value of a parameter is accepted
>
> I'd like to come to a consensus as to how much unit-testing of such stuff
> is appropriate when developing a new transform. And then put this consensus
> into the PTransform Style Guide
> <https://beam.apache.org/contribute/ptransform-style-guide/>.
>
> I think such tests are not worth their weight in lines of code, for a few
> reasons:
> - Whether or not a parameter actually takes effect, should be tested using
> a semantic test (RunnableOnService). Testing whether a getter returns the
> same thing a setter set is neither necessary (already covered by a semantic
> test) nor sufficient (it's possible that the expansion of the transform
> forgets to even call the getter).
> - Testing "a non-null value (or an otherwise valid value) is accepted" is
> redundant with every semantic test that supplies the value.
> - For testing of supplying a different order of parameters: my main
> objections are 1) I'm not sure what kind of reasonably-likely coding error
> this guards against, especially given most multi-parameter transforms use
> AutoValue anyway, and given that "does a setter take effect" is already
> tested via bullets above 2) given such a coding error exists, there's no
> way to tell which orders  or how many you should test; unit-testing two or
> three orders probably gives next to no protection, and unit-testing more
> than that is impractical.
> - Testing "a null value is rejected" effectively unit-tests a single line
> that says checkNotNull(...). I don't think it's worth 10 or so lines of
> code to test something this trivial: even if an author forgot to add this
> line, and then a used supplied null - the user will just get a less
> informative error message, possibly report it as a bug, and the error will
> easily get fixed. I.e. it's a very low-risk thing to get wrong.
>
> I think the following similar types of tests are still worth doing though:
> - Non-trivial parameter validation logic: where the validation rule is
> complex, or where the error message is interesting, etc.
> - Cross-parameter invariants which, if violated, lead to data loss: e.g.
> that if parameters foo and bar are mutually exclusive, verify that if the
> user specifies both, they get an error, instead of one of them being
> silently ignored.
>
> Thoughts?
>

Reply via email to