Thanks everyone for the responses. I put up a WIP PR [1] that removes
OnCreateMatcher and OnSuccessMatcher.

[1] https://github.com/apache/beam/pull/10056

On Fri, Nov 8, 2019 at 9:48 AM Luke Cwik <lc...@google.com> wrote:

> It can be marked as deprecated and we can remove its usage everywhere but
> leave this interface and mark it for removal at some future time.
>
> On Thu, Nov 7, 2019 at 2:23 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>
>> Thanks for bringing this to the ML Brian
>>
>> +1 For full TestPipelineOptions deprecation. Even worth to remove it,
>> bad part is that this class resides in 'sdks/core/main/java' and not
>> in testing as I imagined so this could count as a 'breaking' change.
>>
>> On Thu, Nov 7, 2019 at 8:27 PM Luke Cwik <lc...@google.com> wrote:
>> >
>> > There was issue with asynchrony of p.run(), some runners blocked till
>> the pipeline was complete with p.run() which was never meant to be the
>> intent.
>> >
>> > The test timeout one makes sense to be able to configure it per runner
>> (since Dataflow takes a lot longer than other runners) but we may be able
>> to configure a Junit test timeout attribute instead.
>> >
>> > I would be for getting rid of them.
>> >
>> >
>> > On Wed, Nov 6, 2019 at 3:36 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>> >>
>> >> +1 to all of these are probably obsolete at this point and would be
>> >> nice to remove.
>> >>
>> >>
>> >> On Wed, Nov 6, 2019 at 3:00 PM Kenneth Knowles <k...@apache.org>
>> wrote:
>> >> >
>> >> > Good find. I think TestPipelineOptions is from very early days. It
>> makes sense to me that these are all obsolete. Some guesses, though I
>> haven't dug through commit history to confirm:
>> >> >
>> >> >  - TempRoot: a while ago TempLocation was optional, so I think this
>> would provide a default for things like gcpTempLocation and stagingLocation
>> >> >  - OnSuccessMatcher: for runners where pipeline used to not
>> terminate in streaming mode. Now I think every runner can successfully
>> waitUntilFinish. Also the current API for waitUntilFinish went through some
>> evolutions around asynchrony so it wasn't always a good choice.
>> >> >  - OnCreateMatcher: just for symmetry? I don't know
>> >> >  - TestTimeoutSeconds: probably also for the
>> asychrony/waitUntilfinish issue
>> >> >
>> >> > Kenn
>> >> >
>> >> > On Wed, Nov 6, 2019 at 12:19 PM Brian Hulette <bhule...@google.com>
>> wrote:
>> >> >>
>> >> >> I recently came across TestPipelineOptions, and now I'm wondering
>> if maybe it should be deprecated. It only seems to actually be supported
>> for Spark and Dataflow (via TestSparkRunner and TestDataflowRunner), and I
>> think it may make more sense to move the functionality it provides into the
>> tests that need it.
>> >> >>
>> >> >> TestPipelineOptions currently has four attributes:
>> >> >>
>> >> >> # TempRoot
>> >> >> It's purpose isn't documented, but many tests read TempRoot and use
>> it to set a TempLocation (example). I think this attribute makes sense
>> (e.g. we can set TempRoot once and each test has its own subdirectory), but
>> I'm not sure. Can anyone confirm the motivation for it? I'd like to at
>> least add a docstring for it.
>> >> >>
>> >> >> # OnCreateMatcher
>> >> >> A way to register a matcher that will be checked right after a
>> pipeline has started. It's never set except for in TestDataflowRunnerTest,
>> so I think this is absolutely safe to remove.
>> >> >>
>> >> >> # OnSuccessMatcher
>> >> >> A way to register a matcher that will be checked right after a
>> pipeline has successfully completed. This is used in several tests
>> (RequiresStableInputIT, WordCountIT, ... 8 total occurrences), but I don't
>> see why they couldn't all be replaced with a `p.run().waitUntilFinish()`,
>> followed by an assert.
>> >> >>
>> >> >> I think the current approach is actually dangerous, because running
>> these tests with runners other than TestDataflowRunner or TestSparkRunner
>> means the matchers are never actually checked. This is actually how I came
>> across TestPipelineOptions - I tried running a test with the DirectRunner
>> and couldn't make it fail.
>> >> >>
>> >> >> # TestTimeoutSeconds
>> >> >> Seems to just be a wrapper for `waitUntilFinish(duration)`, and
>> only used in one place. I think it would be cleaner for the test to be
>> responsible for calling waitUntilFinish (which we do elsewhere), the only
>> drawback is it requires a small refactor so the test has access to the
>> PipelineResult object.
>> >> >>
>> >> >>
>> >> >> So I have a couple of questions for the community
>> >> >> 1) Are there thoughts on TempRoot? Can we get rid of it?
>> >> >> 2) Are there any objections to removing the other three attributes?
>> Am I missing something? Unless there are any objections I think I'll write
>> a patch to remove them.
>> >> >>
>> >> >> Thanks,
>> >> >> Brian
>>
>

Reply via email to