On Tue, Mar 21, 2017 at 5:14 PM, Dan Halperin <[email protected]> wrote:
> https://github.com/apache/beam/commit/b202548323b4d59b11bbdf06c99d0f > 99e6a947ef > is one example where tests of feature Bar exist but did not discover bugs > that could be introduced by builders. > True, though one would need to test the full cross-product of builder orderings to discover this, a simple test of setValidate() wouldn't suffice either. (And if not the full cross product, what subset?) Maybe some automated fuzz testing would be a fair thing to do here (cheaper and more comprehensive than manual tests...)? AutoValue like alleviates many, but not all, of these concerns - as Ismael > points out. > If two features are not orthogonal, that perhaps merits more test (and documentation). > > > > On Tue, Mar 21, 2017 at 1:18 PM, Robert Bradshaw < > [email protected]> wrote: > > > On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía <[email protected]> wrote: > > > > > +1 to Vikas point maybe the right place to enforce things correct > > > build tests is in the validate and like this reduce the test > > > boilerplate and only test the validate, but I wonder if this totally > > > covers both cases (the buildsCorrectly and > > > buildsCorrectlyInDifferentOrder ones). > > > > > > I answer Eugene’s question here even if you are aware now since you > > > commented in the PR, so everyone understands the case. > > > > > > The case is pretty simple, when you extend an IO and add a new > > > configuration parameter, suppose we have withFoo(String foo) and we > > > want to add withBar(String bar). In some cases the implementation or > > > even worse the combination of those are not built correctly, so the > > > only way to guarantee that this works is to have code that tests the > > > complete parameter combination or tests that at least assert that the > > > object is built correctly. > > > > > > This is something that can happen both with or without AutoValue > > > because the with method is hand-written and the natural tendency with > > > boilerplate methods like this is to copy/paste, so we can end up doing > > > silly things like: > > > > > > private Read(String foo, String bar) { … } > > > > > > public Read withBar(String bar) { > > > return new Read(foo, null); > > > } > > > > > > in this case the reference to bar is not stored or assigned (this is > > > similar to the case of the DatastoreIO PR), and AutoValue may seem to > > > solve this issue but you can also end up with this situation if you > > > copy paste the withFoo method and just change the method name: > > > > > > public Read withBar(String foo) { > > > return builder().setFoo(foo).build(); > > > } > > > > > > Of course both seem silly but both can happen and the tests at least > > > help to discover those, > > > > > > > Such mistakes should be entirely discovered by tests of feature Bar. If > Bar > > is not actually being tested, that's a bigger problem with coverage that > a > > construction-only test actually obscures (giving it negative value). > > > > > > > > > > On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <[email protected]> wrote: > > > > Yes, what I meant is: Necessary tests are ones that blocks users if > not > > > > present. Trivial or non-trivial shouldn't be the issue in such cases. > > > > > > > > Some of the boilerplate code and tests is because IO PTransforms are > > > > returned to the user before they are fully constructed and actual > > > > validation happens in the validate method rather than at > construction. > > I > > > > understand that the reasoning here is that we want to support > > > constructing > > > > them with options in any order and using Builder pattern can be > > > confusing. > > > > > > > > If validate method is where all the validation happens, then we > should > > > able > > > > to eliminate some redundant checks and tests during construction time > > > like > > > > in *withOption* methods here > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/ > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/ > > > io/gcp/bigtable/BigtableIO.java#L199> > > > > and here > > > > <https://github.com/apache/beam/blob/master/sdks/java/io/ > > > google-cloud-platform/src/main/java/org/apache/beam/sdk/ > > > io/gcp/datastore/DatastoreV1.java#L387> > > > > as > > > > these are also checked in the validate method. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -Vikas > > > > > > > > > > > > > > > > On 14 March 2017 at 15:40, Eugene Kirpichov > > <[email protected] > > > > > > > > wrote: > > > > > > > >> Thanks all. Looks like people are on board with the general > direction > > > >> though it remains to refine it to concrete guidelines to go into > style > > > >> guide. > > > >> > > > >> Ismaël, can you give more details about the situation you described > in > > > the > > > >> first paragraph? Is it perhaps that really a RunnableOnService test > > was > > > >> missing (and perhaps still is), rather than a builder test? > > > >> > > > >> Vikas, regarding trivial tests and user waiting for a work-around: > in > > > the > > > >> situation I described, they don't really need a workaround - they > > > specified > > > >> an invalid value and have been minorly inconvenienced because the > > error > > > >> they got about it was not very readable, so fixing their value took > > > them a > > > >> little longer than it could have, but they fixed it and their work > is > > > not > > > >> blocked. I think Robert's arguments about the cost of trivial tests > > > apply. > > > >> > > > >> I agree that the author should be at liberty to choose which > > validation > > > to > > > >> unit-test and which to skip as trivial, so documentation on this > topic > > > >> should be in the form of guidelines, high-quality example code (i.e. > > > clean > > > >> up the unit tests of IOs bundled with Beam SDK), and informal > > knowledge > > > in > > > >> the heads of readers of this thread, rather than hard rules. > > > >> > > > >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <[email protected]> > > wrote: > > > >> > > > >> > +0.5 > > > >> > > > > >> > I used to think that some of those tests were not worth, for > example > > > >> > testBuildRead and > > > >> > testBuildReadAlt. However the reality is that these tests allowed > me > > > to > > > >> > find bugs both during the development of HBaseIO and just > yesterday > > > when > > > >> I > > > >> > tried to test the write support for the emulator with DataStoreIO > > > (that > > > >> > lacked a parameter in testBuildWrite and didn’t have a > > > testBuildWriteAlt > > > >> > and broke in that case too), so I now believe they are not > > necessarily > > > >> > useless. > > > >> > > > > >> > I agree with the idea of trying to test the most important things > > > first > > > >> and > > > >> > as Kenneth said trying to reduce the tests of successful > validation, > > > >> > however I am not sure how this can be formalized in the style > guide > > > >> > considering that the value of the tests seems to be based on the > > > judgment > > > >> > of both the developer and the reviewer. > > > >> > > > > >> > One final aspect if we achieve consensus is that we should not > > forget > > > >> that > > > >> > if we reduce some of this tests we have to pay attention to not > > reduce > > > >> the > > > >> > current level of coverage. > > > >> > > > > >> > Regards, > > > >> > Ismaël > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <[email protected]> > > wrote: > > > >> > > > > >> > > +0.5 > > > >> > > > > > >> > > My two cents, > > > >> > > > > > >> > > * However trivial the test is it should be added unless user > > has a > > > >> easy > > > >> > > workaround to not having to wait for a few days until the > trivial > > > fixes > > > >> > are > > > >> > > merged to beam and then propagated to the runner. > > > >> > > > > > >> > > * While I agree with trivial tests like "ensuring meaningful > > error > > > >> > > message thrown for null options" or "ordering of parameters" not > > > being > > > >> > > strictly necessary and definitely not something that should be > > used > > > as > > > >> > > reference for other IOs, (staying consistent with other IOs is > > > probably > > > >> > the > > > >> > > reason why DatastoreIO has some of them), I still think some > > degree > > > of > > > >> > > leeway for the author is desirable depending on what level of > user > > > >> > > experience they want to provide, as long as it is not a burden > for > > > >> adding > > > >> > > new tests. > > > >> > > > > > >> > > > > > >> > > > > > >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré < > [email protected]> > > > >> wrote: > > > >> > > > > > >> > > > +1 > > > >> > > > > > > >> > > > Testing is always hard, especially to have concrete tests. > > > Reducing > > > >> the > > > >> > > > "noise" is a good idea. > > > >> > > > > > > >> > > > Regards > > > >> > > > JB > > > >> > > > > > > >> > > > > > > >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote: > > > >> > > > > > > >> > > >> Hello, > > > >> > > >> > > > >> > > >> I've seen a pattern in a couple of different transforms (IOs) > > > where > > > >> > we, > > > >> > > I > > > >> > > >> think, spend an excessive amount of code unit-testing the > > trivial > > > >> > > builder > > > >> > > >> methods. > > > >> > > >> > > > >> > > >> E.g. a significant part of > > > >> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog > > > >> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/ > > > >> > > >> gcp/datastore/DatastoreV1Test.java > > > >> > > >> is > > > >> > > >> devoted to that, and a significant part of the in-progress > > Hadoop > > > >> > > >> InputFormat IO. > > > >> > > >> > > > >> > > >> In particular, I mean unit-testing trivial builder > > functionality > > > >> such > > > >> > > as: > > > >> > > >> - That setting a parameter actually sets it > > > >> > > >> - That setting parameters in one order gives the same effect > > as a > > > >> > > >> different > > > >> > > >> order > > > >> > > >> - That a null value of a parameter is rejected with an > > > appropriate > > > >> > error > > > >> > > >> message > > > >> > > >> - That a non-null value of a parameter is accepted > > > >> > > >> > > > >> > > >> I'd like to come to a consensus as to how much unit-testing > of > > > such > > > >> > > stuff > > > >> > > >> is appropriate when developing a new transform. And then put > > this > > > >> > > >> consensus > > > >> > > >> into the PTransform Style Guide > > > >> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/ > >. > > > >> > > >> > > > >> > > >> I think such tests are not worth their weight in lines of > code, > > > for > > > >> a > > > >> > > few > > > >> > > >> reasons: > > > >> > > >> - Whether or not a parameter actually takes effect, should be > > > tested > > > >> > > using > > > >> > > >> a semantic test (RunnableOnService). Testing whether a getter > > > >> returns > > > >> > > the > > > >> > > >> same thing a setter set is neither necessary (already covered > > by > > > a > > > >> > > >> semantic > > > >> > > >> test) nor sufficient (it's possible that the expansion of the > > > >> > transform > > > >> > > >> forgets to even call the getter). > > > >> > > >> - Testing "a non-null value (or an otherwise valid value) is > > > >> accepted" > > > >> > > is > > > >> > > >> redundant with every semantic test that supplies the value. > > > >> > > >> - For testing of supplying a different order of parameters: > my > > > main > > > >> > > >> objections are 1) I'm not sure what kind of reasonably-likely > > > coding > > > >> > > error > > > >> > > >> this guards against, especially given most multi-parameter > > > >> transforms > > > >> > > use > > > >> > > >> AutoValue anyway, and given that "does a setter take effect" > is > > > >> > already > > > >> > > >> tested via bullets above 2) given such a coding error exists, > > > >> there's > > > >> > no > > > >> > > >> way to tell which orders or how many you should test; > > > unit-testing > > > >> > two > > > >> > > or > > > >> > > >> three orders probably gives next to no protection, and > > > unit-testing > > > >> > more > > > >> > > >> than that is impractical. > > > >> > > >> - Testing "a null value is rejected" effectively unit-tests a > > > single > > > >> > > line > > > >> > > >> that says checkNotNull(...). I don't think it's worth 10 or > so > > > lines > > > >> > of > > > >> > > >> code to test something this trivial: even if an author forgot > > to > > > add > > > >> > > this > > > >> > > >> line, and then a used supplied null - the user will just get > a > > > less > > > >> > > >> informative error message, possibly report it as a bug, and > the > > > >> error > > > >> > > will > > > >> > > >> easily get fixed. I.e. it's a very low-risk thing to get > wrong. > > > >> > > >> > > > >> > > >> I think the following similar types of tests are still worth > > > doing > > > >> > > though: > > > >> > > >> - Non-trivial parameter validation logic: where the > validation > > > rule > > > >> is > > > >> > > >> complex, or where the error message is interesting, etc. > > > >> > > >> - Cross-parameter invariants which, if violated, lead to data > > > loss: > > > >> > e.g. > > > >> > > >> that if parameters foo and bar are mutually exclusive, verify > > > that > > > >> if > > > >> > > the > > > >> > > >> user specifies both, they get an error, instead of one of > them > > > being > > > >> > > >> silently ignored. > > > >> > > >> > > > >> > > >> Thoughts? > > > >> > > >> > > > >> > > >> > > > >> > > > -- > > > >> > > > Jean-Baptiste Onofré > > > >> > > > [email protected] > > > >> > > > http://blog.nanthrax.net > > > >> > > > Talend - http://www.talend.com > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > >
