On Wed, Mar 15, 2017 at 2:11 AM, Ismaël Mejía <ieme...@gmail.com> wrote:
> +1 to Vikas point maybe the right place to enforce things correct > build tests is in the validate and like this reduce the test > boilerplate and only test the validate, but I wonder if this totally > covers both cases (the buildsCorrectly and > buildsCorrectlyInDifferentOrder ones). > > I answer Eugene’s question here even if you are aware now since you > commented in the PR, so everyone understands the case. > > The case is pretty simple, when you extend an IO and add a new > configuration parameter, suppose we have withFoo(String foo) and we > want to add withBar(String bar). In some cases the implementation or > even worse the combination of those are not built correctly, so the > only way to guarantee that this works is to have code that tests the > complete parameter combination or tests that at least assert that the > object is built correctly. > > This is something that can happen both with or without AutoValue > because the with method is hand-written and the natural tendency with > boilerplate methods like this is to copy/paste, so we can end up doing > silly things like: > > private Read(String foo, String bar) { … } > > public Read withBar(String bar) { > return new Read(foo, null); > } > > in this case the reference to bar is not stored or assigned (this is > similar to the case of the DatastoreIO PR), and AutoValue may seem to > solve this issue but you can also end up with this situation if you > copy paste the withFoo method and just change the method name: > > public Read withBar(String foo) { > return builder().setFoo(foo).build(); > } > > Of course both seem silly but both can happen and the tests at least > help to discover those, > Such mistakes should be entirely discovered by tests of feature Bar. If Bar is not actually being tested, that's a bigger problem with coverage that a construction-only test actually obscures (giving it negative value). > > On Wed, Mar 15, 2017 at 1:05 AM, vikas rk <vikky...@gmail.com> wrote: > > Yes, what I meant is: Necessary tests are ones that blocks users if not > > present. Trivial or non-trivial shouldn't be the issue in such cases. > > > > Some of the boilerplate code and tests is because IO PTransforms are > > returned to the user before they are fully constructed and actual > > validation happens in the validate method rather than at construction. I > > understand that the reasoning here is that we want to support > constructing > > them with options in any order and using Builder pattern can be > confusing. > > > > If validate method is where all the validation happens, then we should > able > > to eliminate some redundant checks and tests during construction time > like > > in *withOption* methods here > > <https://github.com/apache/beam/blob/master/sdks/java/io/ > google-cloud-platform/src/main/java/org/apache/beam/sdk/ > io/gcp/bigtable/BigtableIO.java#L199> > > and here > > <https://github.com/apache/beam/blob/master/sdks/java/io/ > google-cloud-platform/src/main/java/org/apache/beam/sdk/ > io/gcp/datastore/DatastoreV1.java#L387> > > as > > these are also checked in the validate method. > > > > > > > > > > > > > > > > > > > > > > > > -Vikas > > > > > > > > On 14 March 2017 at 15:40, Eugene Kirpichov <kirpic...@google.com.invalid > > > > wrote: > > > >> Thanks all. Looks like people are on board with the general direction > >> though it remains to refine it to concrete guidelines to go into style > >> guide. > >> > >> Ismaël, can you give more details about the situation you described in > the > >> first paragraph? Is it perhaps that really a RunnableOnService test was > >> missing (and perhaps still is), rather than a builder test? > >> > >> Vikas, regarding trivial tests and user waiting for a work-around: in > the > >> situation I described, they don't really need a workaround - they > specified > >> an invalid value and have been minorly inconvenienced because the error > >> they got about it was not very readable, so fixing their value took > them a > >> little longer than it could have, but they fixed it and their work is > not > >> blocked. I think Robert's arguments about the cost of trivial tests > apply. > >> > >> I agree that the author should be at liberty to choose which validation > to > >> unit-test and which to skip as trivial, so documentation on this topic > >> should be in the form of guidelines, high-quality example code (i.e. > clean > >> up the unit tests of IOs bundled with Beam SDK), and informal knowledge > in > >> the heads of readers of this thread, rather than hard rules. > >> > >> On Tue, Mar 14, 2017 at 8:07 AM Ismaël Mejía <ieme...@gmail.com> wrote: > >> > >> > +0.5 > >> > > >> > I used to think that some of those tests were not worth, for example > >> > testBuildRead and > >> > testBuildReadAlt. However the reality is that these tests allowed me > to > >> > find bugs both during the development of HBaseIO and just yesterday > when > >> I > >> > tried to test the write support for the emulator with DataStoreIO > (that > >> > lacked a parameter in testBuildWrite and didn’t have a > testBuildWriteAlt > >> > and broke in that case too), so I now believe they are not necessarily > >> > useless. > >> > > >> > I agree with the idea of trying to test the most important things > first > >> and > >> > as Kenneth said trying to reduce the tests of successful validation, > >> > however I am not sure how this can be formalized in the style guide > >> > considering that the value of the tests seems to be based on the > judgment > >> > of both the developer and the reviewer. > >> > > >> > One final aspect if we achieve consensus is that we should not forget > >> that > >> > if we reduce some of this tests we have to pay attention to not reduce > >> the > >> > current level of coverage. > >> > > >> > Regards, > >> > Ismaël > >> > > >> > > >> > > >> > > >> > On Mon, Mar 13, 2017 at 8:37 PM, vikas rk <vikky...@gmail.com> wrote: > >> > > >> > > +0.5 > >> > > > >> > > My two cents, > >> > > > >> > > * However trivial the test is it should be added unless user has a > >> easy > >> > > workaround to not having to wait for a few days until the trivial > fixes > >> > are > >> > > merged to beam and then propagated to the runner. > >> > > > >> > > * While I agree with trivial tests like "ensuring meaningful error > >> > > message thrown for null options" or "ordering of parameters" not > being > >> > > strictly necessary and definitely not something that should be used > as > >> > > reference for other IOs, (staying consistent with other IOs is > probably > >> > the > >> > > reason why DatastoreIO has some of them), I still think some degree > of > >> > > leeway for the author is desirable depending on what level of user > >> > > experience they want to provide, as long as it is not a burden for > >> adding > >> > > new tests. > >> > > > >> > > > >> > > > >> > > On 11 March 2017 at 14:16, Jean-Baptiste Onofré <j...@nanthrax.net> > >> wrote: > >> > > > >> > > > +1 > >> > > > > >> > > > Testing is always hard, especially to have concrete tests. > Reducing > >> the > >> > > > "noise" is a good idea. > >> > > > > >> > > > Regards > >> > > > JB > >> > > > > >> > > > > >> > > > On 03/10/2017 04:09 PM, Eugene Kirpichov wrote: > >> > > > > >> > > >> Hello, > >> > > >> > >> > > >> I've seen a pattern in a couple of different transforms (IOs) > where > >> > we, > >> > > I > >> > > >> think, spend an excessive amount of code unit-testing the trivial > >> > > builder > >> > > >> methods. > >> > > >> > >> > > >> E.g. a significant part of > >> > > >> https://github.com/apache/beam/blob/master/sdks/java/io/goog > >> > > >> le-cloud-platform/src/test/java/org/apache/beam/sdk/io/ > >> > > >> gcp/datastore/DatastoreV1Test.java > >> > > >> is > >> > > >> devoted to that, and a significant part of the in-progress Hadoop > >> > > >> InputFormat IO. > >> > > >> > >> > > >> In particular, I mean unit-testing trivial builder functionality > >> such > >> > > as: > >> > > >> - That setting a parameter actually sets it > >> > > >> - That setting parameters in one order gives the same effect as a > >> > > >> different > >> > > >> order > >> > > >> - That a null value of a parameter is rejected with an > appropriate > >> > error > >> > > >> message > >> > > >> - That a non-null value of a parameter is accepted > >> > > >> > >> > > >> I'd like to come to a consensus as to how much unit-testing of > such > >> > > stuff > >> > > >> is appropriate when developing a new transform. And then put this > >> > > >> consensus > >> > > >> into the PTransform Style Guide > >> > > >> <https://beam.apache.org/contribute/ptransform-style-guide/>. > >> > > >> > >> > > >> I think such tests are not worth their weight in lines of code, > for > >> a > >> > > few > >> > > >> reasons: > >> > > >> - Whether or not a parameter actually takes effect, should be > tested > >> > > using > >> > > >> a semantic test (RunnableOnService). Testing whether a getter > >> returns > >> > > the > >> > > >> same thing a setter set is neither necessary (already covered by > a > >> > > >> semantic > >> > > >> test) nor sufficient (it's possible that the expansion of the > >> > transform > >> > > >> forgets to even call the getter). > >> > > >> - Testing "a non-null value (or an otherwise valid value) is > >> accepted" > >> > > is > >> > > >> redundant with every semantic test that supplies the value. > >> > > >> - For testing of supplying a different order of parameters: my > main > >> > > >> objections are 1) I'm not sure what kind of reasonably-likely > coding > >> > > error > >> > > >> this guards against, especially given most multi-parameter > >> transforms > >> > > use > >> > > >> AutoValue anyway, and given that "does a setter take effect" is > >> > already > >> > > >> tested via bullets above 2) given such a coding error exists, > >> there's > >> > no > >> > > >> way to tell which orders or how many you should test; > unit-testing > >> > two > >> > > or > >> > > >> three orders probably gives next to no protection, and > unit-testing > >> > more > >> > > >> than that is impractical. > >> > > >> - Testing "a null value is rejected" effectively unit-tests a > single > >> > > line > >> > > >> that says checkNotNull(...). I don't think it's worth 10 or so > lines > >> > of > >> > > >> code to test something this trivial: even if an author forgot to > add > >> > > this > >> > > >> line, and then a used supplied null - the user will just get a > less > >> > > >> informative error message, possibly report it as a bug, and the > >> error > >> > > will > >> > > >> easily get fixed. I.e. it's a very low-risk thing to get wrong. > >> > > >> > >> > > >> I think the following similar types of tests are still worth > doing > >> > > though: > >> > > >> - Non-trivial parameter validation logic: where the validation > rule > >> is > >> > > >> complex, or where the error message is interesting, etc. > >> > > >> - Cross-parameter invariants which, if violated, lead to data > loss: > >> > e.g. > >> > > >> that if parameters foo and bar are mutually exclusive, verify > that > >> if > >> > > the > >> > > >> user specifies both, they get an error, instead of one of them > being > >> > > >> silently ignored. > >> > > >> > >> > > >> Thoughts? > >> > > >> > >> > > >> > >> > > > -- > >> > > > Jean-Baptiste Onofré > >> > > > jbono...@apache.org > >> > > > http://blog.nanthrax.net > >> > > > Talend - http://www.talend.com > >> > > > > >> > > > >> > > >> >