I haven't been involved with ValidatesRunner/NeedsRunner much, so I'll avoid commenting on whether a particular interpretation is correct or not.
re: "- There might be a use case for testing a transform against all runners, and we don't have an agreed-upon solution about how to do that: ValidatesRunner technically accomplishes this, but it's logically wrong to use it in this capacity." If I take the words "Needs Runner" at face value, what is is saying is that in order to run the particular test, it must have a pipeline runner in order to execute. Is there anything stopping us from pulling the set of NeedsRunner tests and running them against all runners? I assume the problem with that is that it multiplies the size of our test matrix, and from a logical perspective we shouldn't *need* to do that. So then I think the problem you're getting at is that we want to indicate "this is a test that needs to be run on multiple runners, but is not formally part of validating them." That is: when the test breaks, it's likely a problem with the transform, not the runner. Do we think there's a large category of these? If there was, that would seem to be a distinct problem from the other two annotations and worthy of a new one. Things like the BatchingDoFn might be an example. @NeedsAllRunners ?? As a side comment - I've been discouraging folks from adding either of these annotations on *IT.java files used for IO ITs, since I think that an IT should by definition involve using a runner, and we have specific plans for how we want to run IO ITs. (The IO *Test.java files should still have @NeedsRunner) I thought I'd mention it since it is related to this topic. S On Mon, Mar 27, 2017 at 1:16 PM Eugene Kirpichov <[email protected]> wrote: > Thanks Kenn, this breakdown makes a lot of sense. We should probably > clarify this in the documentation of both of these annotations. Though now > that I look at the current docs, they seem clear enough, but perhaps they > can be phrased stronger. > > To summarize: > - ValidatesRunner tests are for testing a runner. They should be created > mostly by core Beam SDK authors, when introducing a new Beam feature etc. > It may even make sense to make this annotation private to the beam sdk core > module. > - NeedsRunner tests are for testing a transform. That's what all external > users should be using, authors of IOs, etc. It's unspecified which runner > will be used, but in practice usually it'll be direct runner. > - The current situation (use of both of these annotations) does not quite > reflect that, and should be fixed. > - There might be a use case for testing a transform against all runners, > and we don't have an agreed-upon solution about how to do that: > ValidatesRunner technically accomplishes this, but it's logically wrong to > use it in this capacity. > > Right? > > On Mon, Mar 27, 2017 at 12:50 PM Kenneth Knowles <[email protected]> > wrote: > > > Without claiming that this is the final set of categories or that they > are > > used correctly right now, here is what I think they mean: > > > > - ValidatesRunner tests should be tests of the runner itself, generally > > that it implements a primitive correctly > > - NeedsRunner tests should be tests of the PTransform/pipeline, assuming > > the runner is correct > > > > Notably, to the extent the assumption of runner correctness holds, this > > implies that it is OK to run NeedsRunner tests with just the direct > runner. > > > > Pragmatically, in the Java SDK & IOs, this is not the current breakdown. 
> >
> > - In the Java SDK, the NeedsRunner category is probably used more to flag "run this with just the direct runner" than to express the semantic intent. That isn't so bad; it is very close to the right usage.
> >
> > - There are IOs that had RunnableOnService tests which are now ValidatesRunner tests. While the ability to run an IO does validate a runner, this is really an integration test of the IO. If they are to be run with just the direct runner they don't need any annotation, because the IO can take a test-scoped dependency on the direct runner. So it mostly makes sense to tag those tests for which it is profitable to run against all runners.
> >
> > I think the question of IO ITs with the intent to run across runners is currently under design discussion and I would defer to other people on the best way to do that. It could be a new category, or it could be a different design pattern entirely.
> >
> > Kenn
> >
> > On Mon, Mar 27, 2017 at 11:55 AM, Eugene Kirpichov <[email protected]> wrote:
> >
> > > Kenn - can you also remind everybody what the difference is between @NeedsRunner and @ValidatesRunner, and when one should use one or the other? I always find myself confused about this, especially in code reviews.
> > >
> > > On Mon, Mar 27, 2017 at 11:32 AM Kenneth Knowles <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I just merged the rename from RunnableOnService to ValidatesRunner in the Java codebase (Python was already there): https://github.com/apache/beam/pull/2157
> > > >
> > > > I'm sure there will be stragglers throughout our docs, etc., so please do help me catch them and fix them. And start learning to say "ValidatesRunner" in conversation :-)
> > > >
> > > > Kenn
> > > >
> > > > On Thu, Nov 10, 2016 at 1:01 PM, Lukasz Cwik <[email protected]> wrote:
> > > >
> > > > > The default is a crashing runner which throws an exception if it is executed. This makes SDK core/examples/... not depend on any implemented runners.
> > > > >
> > > > > On Thu, Nov 10, 2016 at 12:37 PM, Robert Bradshaw <[email protected]> wrote:
> > > > >
> > > > > > +1 to ValidatesRunner. It'd be nice if it were (optionally?) parameterized by which feature it validates.
> > > > > >
> > > > > > @NeedsRunner is odd, as using a runner is the most natural way to write many (most) tests, but an annotation should be used to mark the exception, not the norm. (I'd just assume a runner is available for all tests, e.g. CoreTests depends on DirectRunner depends on Core.)
> > > > > >
> > > > > > On Thu, Nov 10, 2016 at 10:14 AM, Mark Liu <[email protected]> wrote:
> > > > > > > +1 ValidatesRunner
> > > > > > >
> > > > > > > On Thu, Nov 10, 2016 at 8:40 AM, Kenneth Knowles <[email protected]> wrote:
> > > > > > >
> > > > > > >> Nice. I like ValidatesRunner.
> > > > > > >>
> > > > > > >> On Nov 10, 2016 03:39, "Amit Sela" <[email protected]> wrote:
> > > > > > >>
> > > > > > >> > How about @ValidatesRunner?
> > > > > > >> > Seems to complement @NeedsRunner as well.
> > > > > > >> >
> > > > > > >> > On Thu, Nov 10, 2016 at 9:47 AM Aljoscha Krettek <[email protected]> wrote:
> > > > > > >> >
> > > > > > >> > > +1
> > > > > > >> > >
> > > > > > >> > > What I would really like to see is automatic derivation of the capability matrix from an extended Runner Test Suite. (As outlined in Thomas' doc.)
> > > > > > >> > >
> > > > > > >> > > On Wed, 9 Nov 2016 at 21:42 Kenneth Knowles <[email protected]> wrote:
> > > > > > >> > >
> > > > > > >> > > > Huge +1 to this.
> > > > > > >> > > >
> > > > > > >> > > > The two categories I care most about are:
> > > > > > >> > > >
> > > > > > >> > > > 1. Tests that need a runner, but are testing the other "thing under test"; today this is NeedsRunner.
> > > > > > >> > > > 2. Tests that are intended to test a runner; today this is RunnableOnService.
> > > > > > >> > > >
> > > > > > >> > > > Actually the lines are not necessarily clear between them, but I think we can make good choices, like we already do.
> > > > > > >> > > >
> > > > > > >> > > > The idea of two categories with a common superclass actually has a pitfall: what if a test is put in the superclass category, when it does not have a clear meaning? And also, I don't have any good ideas for names.
> > > > > > >> > > >
> > > > > > >> > > > So I think just replacing RunnableOnService with RunnerTest, to make clear that it is there just to test the runner, is good. We might also want RunnerIntegrationTest extends NeedsRunner to use in the IO modules.
> > > > > > >> > > >
> > > > > > >> > > > See also Thomas's doc on capability matrix testing*, which is aimed at case 2. Those tests should all have a category from the doc, or a new one added.
> > > > > > >> > > >
> > > > > > >> > > > * https://docs.google.com/document/d/1fICxq32t9yWn9qXhmT07xpclHeHX2VlUyVtpi2WzzGM/edit
> > > > > > >> > > >
> > > > > > >> > > > Kenn
> > > > > > >> > > >
> > > > > > >> > > > On Wed, Nov 9, 2016 at 12:20 PM, Jean-Baptiste Onofré <[email protected]> wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi Mark,
> > > > > > >> > > > >
> > > > > > >> > > > > Generally speaking, I agree.
> > > > > > >> > > > >
> > > > > > >> > > > > As RunnableOnService extends NeedsRunner, @TestsWithRunner or @RunOnRunner sound clearer.
> > > > > > >> > > > >
> > > > > > >> > > > > Regards
> > > > > > >> > > > > JB
> > > > > > >> > > > >
> > > > > > >> > > > > On 11/09/2016 09:00 PM, Mark Liu wrote:
> > > > > > >> > > > >
> > > > > > >> > > > >> Hi all,
> > > > > > >> > > > >>
> > > > > > >> > > > >> I'm working on building RunnableOnService in the Python SDK. After having discussions with folks, "RunnableOnService" doesn't look like a very intuitive name for those unit tests that require runners and build lightweight pipelines to test specific components. In particular, they don't have to run on a service.
> > > > > > >> > > > >>
> > > > > > >> > > > >> So I want to raise this idea to the community and see if anyone has similar thoughts. Maybe we can come up with a name that is tied to the runner. Currently, I have two names in my head:
> > > > > > >> > > > >>
> > > > > > >> > > > >> - TestsWithRunners
> > > > > > >> > > > >> - RunnerExecutable
> > > > > > >> > > > >>
> > > > > > >> > > > >> Any thoughts?
> > > > > > >> > > > >>
> > > > > > >> > > > >> Thanks,
> > > > > > >> > > > >> Mark
> > > > > > >> > > > >
> > > > > > >> > > > > --
> > > > > > >> > > > > Jean-Baptiste Onofré
> > > > > > >> > > > > [email protected]
> > > > > > >> > > > > http://blog.nanthrax.net
> > > > > > >> > > > > Talend - http://www.talend.com
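
For concreteness, here is a minimal sketch of how the two existing categories discussed in this thread are attached to a Beam Java test, assuming the JUnit 4 category interfaces in org.apache.beam.sdk.testing and the usual TestPipeline rule. The class name, test method names, and the transforms exercised are illustrative only, and the @NeedsAllRunners marker floated above is hypothetical, not an existing annotation.

import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.NeedsRunner;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.testing.ValidatesRunner;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.junit.Rule;
import org.junit.Test;
import org.junit.experimental.categories.Category;

public class CategoryUsageExampleTest {

  @Rule public final transient TestPipeline p = TestPipeline.create();

  // Tests the transform (Count), not the runner: it merely needs *a* runner,
  // which in practice is the direct runner.
  @Test
  @Category(NeedsRunner.class)
  public void testCountPerElement() {
    PCollection<KV<String, Long>> counts =
        p.apply(Create.of("a", "b", "a")).apply(Count.perElement());
    PAssert.that(counts).containsInAnyOrder(KV.of("a", 2L), KV.of("b", 1L));
    p.run();
  }

  // Tests that the runner implements a primitive (Flatten) correctly, so each
  // runner's ValidatesRunner suite is expected to pick it up.
  @Test
  @Category(ValidatesRunner.class)
  public void testFlattenWithEmptyInput() {
    PCollection<String> nonEmpty = p.apply("NonEmpty", Create.of("x"));
    PCollection<String> empty = p.apply("Empty", Create.empty(StringUtf8Coder.of()));
    PCollection<String> flattened =
        PCollectionList.of(nonEmpty).and(empty).apply(Flatten.pCollections());
    PAssert.that(flattened).containsInAnyOrder("x");
    p.run();
  }
}

Which module actually executes each category is a build-configuration question rather than anything in the test itself: as discussed above, runner modules select the ValidatesRunner-tagged tests, while NeedsRunner-tagged tests can run wherever a test-scoped direct runner dependency is available.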
